Energy Demand Forecasting: Combining Cointegration Analysis and Artificial Intelligence Algorithm

Energy is vital for the sustainable development of China. Accurate forecasts of annual energy demand are essential for scheduling energy supply and for informing the development of related industries. In the existing literature on energy use prediction, artificial intelligence-based (AI-based) models have received considerable attention. However, little econometric or statistical evidence exists to establish the reliability of current AI-based models, a gap that still needs to be addressed. In this study, a new energy demand forecasting framework is first presented. Then, as an example, the coefficients of the linear and quadratic forms of the AI-based model are optimized by combining an adaptive genetic algorithm with cointegration analysis, on the basis of historical annual data on electricity usage over the period 1985–2015. Prediction results of the proposed model indicate that the annual growth rate of electricity demand in China will slow down. However, China will still demand about 13 trillion kilowatt hours in 2030 because of population growth, economic growth, and urbanization. In addition, the model has greater accuracy and reliability than other single optimization methods.


Introduction
Energy, a vital input for the economic and social development of any economy, has gained special attention. Driven by globalization and industrialization, global energy demand has been increasing continually for decades and is expected to rise approximately 30% from 2015 to 2035 in line with worldwide economic growth [1]. Energy demand projections are therefore needed, because accurate forecasts help policy makers improve the scheduling of energy supply and provide valuable suggestions for planning energy supply system operations.
Given the importance of accurate energy forecasts, studies using different estimation methods have been undertaken since the 1970s. In general, these studies can be classified into two major categories: econometric [2][3][4][5][6][7][8][9] and machine learning (ML) methods [10][11][12][13][14][15][16][17][18][19][20][21][22][23]. The artificial intelligence (AI) energy forecasting model, which is a class of ML method, has gained popularity in recent years because of its superiority in time series processing and its capability to deal with noisy data. Several tools, such as artificial neural networks (ANN), genetic algorithm (GA), ant colony optimization (ACO), and particle swarm optimization (PSO), are commonly employed in the model [10][11][12][13][14][15][16][17]. Compared with the conventional econometric energy forecasting method, the AI-based model frequently demonstrates higher prediction accuracy in terms of mean absolute error (MAE), mean square error (MSE), mean absolute percentage error (MAPE), and root mean square error (RMSE) [16,17]. According to economic theory, predicting future energy demand from the historical relationship is feasible when the periodical characteristics linking energy demand and its explanatory variables do not change in the long term. However, the current AI-based method is referred to as a "black box" because it predicts energy demand without knowing the internal relationship between energy demand and its affecting factors [23]. In addition, little econometric or statistical evidence is available to establish the relationship between energy demand and its factors, so the relationship estimated by the current AI-based model may change in the long run.
This study aims to present a more rigorous AI-based energy demand forecasting framework that ensures the reliability of the predicted results. The electricity demand of China is forecasted as an example to show the process of implementing this framework. In addition, the predicted results can help policy makers take appropriate measures to bridge the electricity gap and arrange electricity supply.
The rest of the paper is organized as follows. Section 2 conducts a detailed literature review on the recent developments of energy demand forecasting. Section 3 presents the new framework. Section 4 predicts the electricity demand in China for 2016-2030 under three scenarios. The final section summarizes the main conclusions and presents the policy implications.

Literature Review
Energy estimation modeling has attracted widespread interest among practitioners and academicians. The commonly used econometric techniques include cointegration analysis, the autoregressive integrated moving average (ARIMA) model, partial least square regression (PLSR), and the vector error correction model. The ML method mainly refers to the AI model, the support vector regression (SVR) method, and the Grey forecasting method. Their details are described in the following sections.

Econometric Method.
Cointegration analysis can establish a long-run relationship among variables, and the reliability of the forecasting results is demonstrated through tests ranging from unit root to cointegration analysis [2,3]. Early studies such as Chan and Lee [24] and Lin [2] forecasted the total energy and electricity demands in China, respectively. They conducted a series of tests ranging from the unit root test to the cointegration test to guarantee that a cointegration relationship exists between energy demand and its factors (i.e., the nexus will not change in the medium and long term). The ARIMA model is presented as an appropriate method for long-term projections [4][5][6][7][8][25]. This model depends on three parameters: the order of the moving average, the order of differencing, and the order of the autoregressive scheme. However, ARIMA cannot be applied directly to missing or nonstationary data; nonstationary data must first be transformed by differencing. Recently, Cabral et al. [7] incorporated spatiotemporal dynamics into the conventional ARIMA model. Their results confirmed that the new spatiotemporal model improves electricity demand forecasts in Brazil and is paramount to achieving the goal of the Brazilian electricity sector of a secure electricity supply. In contrast to the ARIMA model, PLSR is a popular statistical tool that can deal with missing or highly correlated data [26]. However, PLSR has only recently been discussed in the field of energy demand estimation [26,27]. For instance, Zhang et al. [26] employed the PLSR model to estimate the transportation energy demand in China on the basis of GDP, urbanization rate, passenger turnover, and freight turnover. Their results indicate that the transport energy demand for 2020 will reach 4.3313 billion tons of coal equivalent (BTCE) and 4.6826 BTCE under different scenarios.

ML Method.
Any optimization technique requires information on future scenarios and a search for the best solutions against a test criterion. ML techniques are well suited to both tasks and are frequently used to solve them. The ML models include several tools, such as the AI, SVR, and Grey forecasting methods. To motivate our research, we focus particularly on the AI-based model.
The concept of SVR is developed from the computation of a linear regression function in a high-dimensional feature space where the input data are mapped via a nonlinear function, as described in Vapnik [28] and Vapnik et al. [29]. Dong et al. [19] were the first to employ SVR to predict the monthly energy use of buildings in tropical regions. Local weather data, including monthly average outdoor dry-bulb temperature, relative humidity, and global solar radiation, were selected as the factors affecting energy demand. Their results demonstrate that the relative error rate is less than 4%. Wang et al. [30] applied SVR to predict hourly electricity use in residences and compared the results with other AI-based methods. They report that SVR improves the prediction accuracy.
The energy Grey forecasting model adopts the essential part of Grey system theory. The basic Grey model, GM(1,1), has been employed in energy demand forecasting [18]. Recently, Kang and Zhao [31] combined the moving average method and the Markov model with GM to improve the accuracy of forecasting results. The improved Grey forecasting model demonstrates better performance than the conventional GM(1,1). Xu et al. [32] combined GM with the autoregressive moving average model. The result indicates that the improved energy forecasting model has excellent accuracy and a high level of reliability for the case study of Guangdong Province.
The AI-based prediction method predicts energy use according to its correlated variables, such as population growth, economic growth, and economic structure [2][3][4][5][6][15][16][17]. For instance, Haldenbilen and Ceylan [10] proposed an AI model based on GA using population, GDP, and vehicle-km as affecting factors to forecast the transport energy demand in Turkey. Recently, Günay [23] modeled an electricity demand function for Turkey using data on population, GDP per capita, inflation percentage, unemployment percentage, average summer temperature, and average winter temperature. ANN was then employed to determine the optimal weights that maximize the accuracy of the function. The aforementioned algorithms can be called single AI-based methods. To eliminate several essential limitations of these algorithms, researchers have also proposed hybrid methods that integrate at least two AI algorithms, such as the GA-ANN [33] and PSO-GA models [12][13][14][15][16], to improve the prediction accuracy. Hybrid combinations of single AI algorithms show better performance than other methods.
The current AI-based prediction method is generally composed of four main steps: data collection, data preprocessing, model training, and model testing. With its superiority in time series processing, the AI-based model performs well in predicting future energy demand. However, spurious regression arises widely in econometric time series analysis owing to the nonstationarity of the data, and the current AI-based model cannot avoid this problem. If the selected variables do not satisfy the basic requirements for constructing a cointegration relationship over the sample period, the AI-based forecasting models cannot be employed to make energy demand projections because the nexus between energy demand and its factors will change in the medium and long term. Therefore, the mechanism for predicting energy demand should be reformulated.

Methodology
3.1. Introduction to AI-Based Energy Demand Model.
In previous AI-based models, the commonly employed independent variables were population, GDP, urbanization, industrialization, energy price, and energy mix. Three forms of the estimation model, namely, the linear, quadratic, and exponential forms, were then adopted for data training [10][11][15][16][17]; these are referred to as models (1), (2), and (3), respectively. In each model, $X_i$ is the $i$th energy demand-affecting factor, $n$ is the number of energy demand-affecting factors, and $w_i$ and $w_{i,j}$ are the corresponding weights.
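For orientation, the linear and quadratic forms commonly used in the GA-based demand literature can be sketched as below. The notation is an assumption: $Y_{\text{lin}}$ and $Y_{\text{qua}}$ stand for the modeled (log) demand, $w_0$ is an intercept weight, and the cross-term index range $j \ge i$ is one common convention rather than a confirmed detail of models (1) and (2).

```latex
\begin{align}
Y_{\text{lin}} &= w_0 + \sum_{i=1}^{n} w_i X_i, \\
Y_{\text{qua}} &= w_0 + \sum_{i=1}^{n} w_i X_i
                 + \sum_{i=1}^{n} \sum_{j=i}^{n} w_{i,j} X_i X_j .
\end{align}
```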
The "fittest" weights are finally searched through different AI tools, such as GA, ACO, and hybrid algorithms, based on a fitness function that monitors the forecasting accuracy by minimizing the sum of squared errors between the actual and estimated values:
$$\min f = \sum_{k=1}^{m} \left( E_k^{\text{actual}} - E_k^{\text{predicted}} \right)^2,$$
where $E^{\text{actual}}$ and $E^{\text{predicted}}$ denote the actual and predicted energy demand values, respectively, and $m$ is the number of observations. After the optimal weights are obtained, the model is applied to forecast future energy demand under different scenarios. Compared with the traditional econometric energy demand forecasting model, such AI-based models frequently demonstrate higher prediction accuracy. However, according to economic theory, the periodical characteristics of economic variables do not change in the medium and long term only when an economy remains in a consistent state. Consequently, the historical relationship between energy demand and its factors over the sampling period should be stable, and only when this condition is satisfied can the relationship be used for forecasting energy demand. The current AI-based energy demand forecasting model does not verify this historical relationship through econometric and statistical analysis. It can therefore be regarded as a "black box" that operates without knowing the internal relationship between energy demand and its affecting factors [33]. Accordingly, the model cannot be adopted for energy demand prediction when the historical relationship estimated through it changes over time. An improved AI-based model framework is therefore needed to improve reliability.
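The weight search described above can be illustrated with a minimal Python sketch. The function names, the dictionary layout for cross-term weights, and the intercept argument `w0` are assumptions made for illustration; they are not taken from the cited models.

```python
def linear_form(w0, weights, factors):
    """Hypothetical linear form: w0 + sum_i w_i * X_i."""
    return w0 + sum(w * x for w, x in zip(weights, factors))


def quadratic_form(w0, weights, cross_weights, factors):
    """Hypothetical quadratic form: linear part plus sum of w_ij * X_i * X_j.

    cross_weights maps index pairs (i, j) to the weight w_ij.
    """
    cross = sum(w * factors[i] * factors[j] for (i, j), w in cross_weights.items())
    return linear_form(w0, weights, factors) + cross


def sse_fitness(actual, predicted):
    """Sum of squared errors between actual and predicted demand (to be minimized)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))
```

An optimizer such as GA would repeatedly propose candidate weights, evaluate one of the forms on the training observations, and keep the candidates with the smallest `sse_fitness`.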

Improved AI-Based Model.
In the conventional AI-based model described above, the AI tool is applied directly to obtain the optimal weights for the model after the original data are preprocessed. Then, the model is employed to forecast future energy demand. However, the prediction results are not reliable when the variables cannot build a stable long-run relationship or when the parameters change over time. Therefore, model stability tests should be performed before the fittest weights are obtained through the AI tools. Cointegration analysis is widely employed as a key econometric method to forecast mid- and long-run energy demand because it can establish a long-run relationship among variables [3]. Cointegration theory and operations are employed here to determine whether a long-run relationship exists between energy demand and its factors. For comparison, the original framework described in previous literature is presented in Figure 1(a), and our new framework for energy demand forecasting is shown in Figure 1(b). As shown in Figure 1, if the energy demand and its factors do not satisfy the cointegration relationship over the sample period, then the current AI-based model cannot be adopted to predict future energy demand because a stable relationship between them does not exist in the medium and long term.

Cointegration Test.
According to cointegration theory, a long-run equilibrium relationship among economic variables exists when a linear combination of the time series is stationary. The cointegration relationship over the sampling period can be tested when the economic variables are all integrated of the same order, $I(d)$, or are each either $I(0)$ or $I(1)$. Hence, the first step in cointegration analysis is to employ a unit root test to check the stationarity of the variables. In empirical studies, the augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) tests are commonly employed for this purpose. Cointegration tests such as the Engle-Granger [34], Johansen-Juselius [35], and autoregressive distributed lag (ARDL) bounds testing approaches [36,37] can be adopted once the possibility of building a cointegration relationship among the variables is verified. The Engle-Granger method is feasible for testing single-equation cointegration when the economic variables are all $I(d)$. Compared with the Engle-Granger method, the Johansen-Juselius method can determine the number of cointegration vectors as well as test the existence of cointegration among variables. However, the Johansen-Juselius method can likewise be employed only when the variables are all $I(d)$. Compared with the Engle-Granger and Johansen-Juselius methods, the ARDL bounds approach has several advantages for testing the existence of a long-run equilibrium among the time series. First, it can establish both the long- and short-run relationships among the variables. Second, it relaxes the requirements on the variables and can be applied even when the variables are $I(0)$, $I(1)$, or fractionally cointegrated. Third, the ARDL procedure is a more powerful approach to determine the cointegration relationship in small samples than the Johansen-Juselius technique. Finally, the problems of serial correlation and endogeneity are not difficult to tackle within the ARDL model.
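The idea behind a unit root test can be illustrated with a minimal sketch of the basic Dickey-Fuller regression with an intercept and no lagged differences. This is a simplified stand-in, not the ADF, PP, or Ng-Perron procedures discussed in this paper; the function names are hypothetical, and the cutoff of -2.86 is the approximate asymptotic 5% critical value for the intercept-only case, which is somewhat more negative in small samples.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic on rho in the regression: dy[t] = alpha + rho * y[t-1] + e[t]."""
    y_lag = series[:-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(dy)
    mean_x = sum(y_lag) / n
    mean_d = sum(dy) / n
    sxx = sum((x - mean_x) ** 2 for x in y_lag)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(y_lag, dy))
    rho = sxy / sxx
    alpha = mean_d - rho * mean_x
    residuals = [d - alpha - rho * x for x, d in zip(y_lag, dy)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # residual variance
    return rho / math.sqrt(sigma2 / sxx)              # rho divided by its standard error


def looks_stationary(series, critical_value=-2.86):
    """Reject the unit root null when the statistic is below the critical value."""
    return dickey_fuller_t(series) < critical_value


# Illustrative data only: white noise is stationary by construction.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
noise_is_stationary = looks_stationary(noise)
```

A series that fails this check in levels but passes it after first differencing would be treated as $I(1)$, which is the situation cointegration tests are designed for.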

Performance Test.
After the existence of a long-run relationship among the variables is verified, the next step is to test the prediction accuracy of the model and determine whether the estimated parameters change over time. Parameter instability may distort inferences and lead to wrong conclusions. For the Johansen-Juselius cointegration test technique, the stability test for the vector autoregressive model should be conducted through unit root analysis. Meanwhile, for the ARDL bounds approach, the cumulative sum (CUSUM) and cumulative sum of squares (CUSUMSQ) tests are suitable for the stability test because their statistics are updated recursively and plotted against the break points.

Forecasting Electricity Demand in China
In this section, electricity demand forecasting in China is shown as an example based on our new framework. First, we list the electric energy demand-affecting factors and the proxy variables based on existing electricity demand prediction literature. Second, given that the AI-based model does not require many explanatory variables, we employ cointegration analysis to select the variables that can contribute to building a long-run relationship. Third, we employ the adaptive genetic algorithm (AGA), which is superior to conventional AI algorithms, to optimize the model. Based on the above estimation, three scenarios for economic growth, namely, high (Scenario A), business as usual (Scenario B), and low (Scenario C), are set. Finally, the electricity demand projections for the three scenarios are conducted.

Electricity Demand-Affecting Factors.
Electricity demand can be viewed as a causal function of several affecting factors, such as population, GDP, electricity prices, economic structure, urbanization, and lifestyles [15][16][17]. First, in line with Lin [2] and Yu et al. [15], total population is employed to reflect the effect of population growth on electricity demand. Second, economic growth influences electricity demand in various ways. Consistent with most existing literature, GDP is adopted as a key factor of electric energy demand. Third, in China, the amount of energy required to produce a unit of GDP differs significantly among the three industries. Numerous researchers have noted that secondary industries are the main electricity consumers [2]. Therefore, a shift in economic structure from the secondary industry (e.g., heavy industry) to the tertiary industry (e.g., high-technology industry) may directly reduce electricity consumption [2]. In line with Feng et al. [38] and Li et al. [39], we take the ratio of the tertiary sector in GDP to capture the effect of relative change in economic structure on electricity demand. Fourth, urbanization in China is at a key stage and exerts an important but complicated impact on electricity demand. Each year, millions of migrants move from rural areas to big cities in search of good job opportunities, which exerts a great influence on electric energy demand. This study utilizes the urbanization rate to control for the effect of urbanization on electricity demand. Lastly, according to economic theory, price can affect demand through income and substitution effects. However, adjusting the electricity price in China is complicated because it has been under the full control of the government for years and is mainly determined by the production cost. In addition, the price differs significantly among regions, and no practical method can estimate the electricity price in China [2]. Therefore, we employ the price index for fuel and power to denote the electricity price index.
Economic structure is denoted by the ratio of output in the tertiary industry to GDP (%). The share of the urban population in the total population is used as the urbanization rate (%). The price index for the base year (1985) is set to 1.00, and the price index for other years is adjusted to the constant price index of 1985. The definition of the variables is shown in Table 1. The trends for electricity demand and its factors in the period of 1985-2015 are shown in Figure 2.

AGA.
As mentioned in Section 2, conventional single AI algorithms can suffer from low prediction accuracy, whereas hybrid AI algorithms (e.g., PSO-GA and ANN-GA) are complicated. In this study, we employ AGA, which has a more intelligent search mechanism and is efficient in globally optimizing coefficients. The major difference between traditional GA and AGA is the selection of the crossover probability $P_c$ and the mutation probability $P_m$. In conventional GA, the two probabilities are determined randomly or based on an inadequate reference, whereas AGA relies on the fitness function to select the optimal $P_c$ and $P_m$. The flowchart for AGA is shown in Figure 3. As shown in this figure, AGA contains four major operations: initialization, judgement and selection, crossover, and mutation. The detailed descriptions of the operations are given as follows.
(A) Initialization. The parameters, including the population size (Pop size) and the number of generations (Num generations), are first set. The fitness function, crossover probability, and mutation probability should also be determined.
(B) Judgement and Selection. The population is evaluated and ranked according to the values of the fitness function. When the number of generations reaches the maximal number of generations, the individual with a low fitness value is selected as the optimal solution for the problem; otherwise, the individual with a low fitness value is selected and the rest of the individuals are replaced by the selected individuals.
(C) Crossover. To achieve good performance in AGA, the crossover probability is determined by a function rather than by a constant as in conventional GA. A high crossover probability is set when the fitness value is less than the average value; otherwise, a higher fitness value leads to a lower crossover probability. In the crossover probability function employed in our study, $P_c$ denotes the crossover probability, $f_{\max}$ and $f_{\text{avg}}$ represent the maximum and average fitness values, respectively, and the two constants of the function, 0.9 and 0.6, are set in step (A).
(D) Mutation. The mutation probability is selected in the same way as the crossover probability. It is set to 0.1 when the fitness value is lower than the average value; otherwise, a higher fitness value results in a lower mutation probability. In the corresponding mutation probability function, $P_m$ refers to the mutation probability; the two constants, 0.1 and 0.01, are determined in the first step; and $f_{\max}$ and $f_{\text{avg}}$ represent the maximum and average fitness values for each generation, respectively.
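The adaptive rule in steps (C) and (D) can be sketched as follows. The linear interpolation between the two constants is an assumption (one common adaptive-GA form), as are the function and argument names; only the constants 0.9, 0.6, 0.1, and 0.01 and the qualitative behavior come from the description above.

```python
def adaptive_probability(fitness, f_avg, f_max, high, low):
    """Return `high` below the average fitness, sliding linearly to `low` at f_max.

    Assumed form: high - (high - low) * (fitness - f_avg) / (f_max - f_avg).
    """
    if fitness < f_avg or f_max <= f_avg:
        # Below-average individuals (or a degenerate population) get the high value.
        return high
    scale = (fitness - f_avg) / (f_max - f_avg)
    return high - (high - low) * scale


def crossover_probability(fitness, f_avg, f_max):
    """Crossover probability with the constants 0.9 and 0.6 from step (A)."""
    return adaptive_probability(fitness, f_avg, f_max, high=0.9, low=0.6)


def mutation_probability(fitness, f_avg, f_max):
    """Mutation probability with the constants 0.1 and 0.01 from step (A)."""
    return adaptive_probability(fitness, f_avg, f_max, high=0.1, low=0.01)
```

In a full AGA loop these functions would be evaluated for each individual (or parent pair) in every generation, using that generation's average and maximum fitness.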

Unit Root Test.
After data collection and preprocessing, the first step is to perform the unit root test to determine whether the time series satisfy the basic conditions for constructing the cointegration relationship. Considering that the ADF and PP tests are distorted in small samples, the Ng and Perron [40] unit root test is adopted, first with only an intercept term and then with both intercept and trend terms in the unit root estimating equation. The corresponding unit root test results are shown in Table 2.
The first six rows in Table 2 are the unit root test results with an intercept term, while the last six rows are the results with both intercept and trend terms. The results indicate that all the time series are first-difference stationary at the 10% significance level with an intercept term, which satisfies the necessary requirements for building the cointegration relationship under both the Johansen-Juselius technique [35] and the ARDL bounds testing approach [36].

Cointegration Test.
Next, we conduct the test to determine the presence of a long-run relationship using the ARDL bounds testing approach. The ordinary least squares (OLS) procedure is first employed to estimate the ARDL equation, in which $Y$ denotes the natural logarithm (ln) of electricity demand, and $X_i$ ($i = 1, \ldots, 5$) refers to the ln form of population, GDP, economic structure, urbanization rate, and price of electricity, respectively. The equation also contains a constant parameter and a white-noise error term, and $\Delta$ represents the first-difference operator.
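A standard unrestricted error-correction form of the ARDL bounds regression, following the general setup of Pesaran et al. [36], can be sketched as follows. The coefficient symbols ($\alpha_0$, $\beta_k$, $\gamma_{i,k}$, $\lambda_i$), the common lag length $p$, and the error symbol $\varepsilon_t$ are assumptions chosen for illustration.

```latex
\begin{equation}
\Delta Y_t = \alpha_0
  + \sum_{k=1}^{p} \beta_k \, \Delta Y_{t-k}
  + \sum_{i=1}^{5} \sum_{k=0}^{p} \gamma_{i,k} \, \Delta X_{i,t-k}
  + \lambda_0 \, Y_{t-1}
  + \sum_{i=1}^{5} \lambda_i \, X_{i,t-1}
  + \varepsilon_t .
\end{equation}
```

In this form, the bounds test examines whether the coefficients $\lambda_0, \ldots, \lambda_5$ on the lagged levels are jointly zero.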
To obtain the optimal lag length for the equation, the ARDL bounds approach estimates $(p + 1)^k$ regressions, where $p$ and $k$ denote the maximum number of lags and the number of variables, respectively. The Schwarz-Bayesian criterion (SBC) or the Akaike information criterion can then be adopted to determine the optimal lag for this regression. The bounds testing procedure based on the joint $F$-statistic and Wald statistic is as follows.
The null hypothesis of no cointegration, $H_0$, is that the coefficients on the lagged level variables all equal zero; the alternative, $H_1$, is that they are not all zero. Two sets of critical values are reported in Pesaran et al. [36] and Narayan [37]. The bounds statistics in Pesaran et al. [36] are only applicable for samples with more than 80 observations; otherwise, Narayan [37] is appropriate. Considering that our sample size is 31 (from 1985 to 2015), the critical values from Narayan [37] for the bounds $F$-test are more suitable than those from Pesaran et al. [36] for establishing reliable inferences on cointegration.
The null hypothesis should be rejected when the calculated $F$-statistic exceeds the upper bound, suggesting that a cointegration relationship exists between electricity demand and its factors. No cointegration is found when the calculated $F$-statistic is below the lower critical value. However, when the statistic lies between the bounds, no conclusive inference can be drawn without knowing the order of integration of the underlying regressors. The corresponding results are shown in Table 3. As mentioned by Canyurt and Öztürk [11], the AI-based model does not require many factors to estimate future energy demand. The cointegration tests (Table 3) are therefore presented using both five and four input variables.
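The three-way decision rule of the bounds test can be written as a short sketch. The function name is hypothetical, and the bound values used in the example are placeholders for illustration only, not critical values from Narayan [37] or Pesaran et al. [36].

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Classify an ARDL bounds F-statistic against lower and upper critical bounds."""
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat > upper_bound:
        return "cointegration"       # reject the null of no long-run relationship
    if f_stat < lower_bound:
        return "no cointegration"    # cannot reject the null
    return "inconclusive"            # depends on the integration order of the regressors


# Placeholder bounds, chosen only to exercise the three branches.
example = [bounds_test_decision(f, 3.0, 4.5) for f in (5.2, 2.1, 3.8)]
```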
As shown in Table 3, the existence of two cointegration relationships among the variables is confirmed. Hence, the two corresponding electricity demand functions (functions 2 and 6) can be applied to estimate future electricity demand. In the following, we employ electricity demand function 2, in which population, GDP, economic structure, and urbanization are used as the independent variables, to predict future electricity demand.

Stability Test.
CUSUM and CUSUMSQ are applied to show the stability of the model, as shown in Figure 4. In the figure, the plots of CUSUM and CUSUMSQ are located within the critical bounds at the 5% significance level (the straight lines), which suggests that the model is stable. Accordingly, the cointegration relationship between electric energy demand and its factors is reliable.

Estimating Results.
After the long-run equilibrium relationship among the variables is verified, AGA is employed to optimize the coefficients of models (1) and (2). Because we employ the ln form of the variables, once the weights of (1) and (2) are obtained, the electric energy demand can be recovered as $E = e^{Y_{\text{lin}}}$ or $E = e^{Y_{\text{qua}}}$. The electricity demand function that uses four factors as the economic indicators, namely, GDP, population, economic structure, and urbanization, is selected in the model to predict the future energy demand in China.
To estimate the coefficients for the linear and quadratic forms, the observed data from 1985 to 2015 are used. The linear form of the optimal model is estimated first (for simplicity, we do not present the results in exponential form). In this form, $Y$ denotes the "ln" form of electricity demand and $X_i$ ($i = 1, 2, 3, 4$) stands for the "ln" form of population, GDP, economic structure, and urbanization. Population growth, economic growth, and urbanization are the leading forces contributing to the increase of electricity demand, a finding that is consistent with our expectations. The elasticity coefficients show that a 1% increase in population, economic growth, and urbanization will produce respective increases of 1.3754%, 0.3273%, and 2.254% in electricity consumption. By contrast, a 1% increase in the ratio of the tertiary sector to GDP will produce a 1.1367% decline in electric energy consumption.
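Because the model is estimated in logarithms, each coefficient acts as an elasticity. The sketch below, with hypothetical names, converts a percentage change in one factor into the implied percentage change in electricity demand while holding the other factors fixed; it uses only the four elasticities reported above and assumes the log-linear form holds exactly.

```python
# Elasticities reported for the linear (log-log) form.
ELASTICITIES = {
    "population": 1.3754,
    "gdp": 0.3273,
    "economic_structure": -1.1367,
    "urbanization": 2.254,
}


def demand_change_pct(factor, factor_change_pct):
    """Percentage change in demand implied by a percentage change in one factor."""
    beta = ELASTICITIES[factor]
    ratio = (1.0 + factor_change_pct / 100.0) ** beta  # multiplicative effect on demand
    return (ratio - 1.0) * 100.0


gdp_effect = demand_change_pct("gdp", 1.0)
structure_effect = demand_change_pct("economic_structure", 1.0)
```

For small changes the result is close to the elasticity itself, which is why a 1% rise in GDP maps to roughly a 0.33% rise in demand.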
In addition, the quadratic form of the optimal model is estimated. Forecasting accuracy is assessed with error criteria such as MAPE, in which $Y_i$ and $\hat{Y}_i$ represent the actual and fitted values, respectively, and $n$ is the number of observations. All the compared approaches use the population, GDP, industrial proportion, and urbanization rate of China as the independent variables. The training data (1985-2010) are employed to fit the historical relationship between electricity demand and its factors, and the testing data (2011-2015) are adopted to test the performance. The comparison of these criteria for the testing data among the various optimal electricity demand models is reported in Table 4. The table demonstrates that the MAPE of the quadratic form is the lowest compared with the linear forms optimized by AGA and other methods (GA, ACO, GM, and OLS). In addition, the proposed linear form optimized by AGA achieves better prediction accuracy than the other optimal forecasting models. The actual and simulated data for the optimal model from 1985 to 2015 are also shown in Figure 5, which reveals that the proposed model fits the historical data well. Thus, the AGA algorithm effectively enhances the estimating precision of the model.
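The accuracy criteria named in this paper (MAE, MSE, RMSE, and MAPE) follow their standard definitions, sketched below. The function names and the sample numbers are illustrative only and are not drawn from Table 4.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up values used only to exercise the functions.
sample_actual = [100.0, 200.0]
sample_fitted = [110.0, 190.0]
```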

Future Projection.
The abovementioned framework is applied to forecast electricity demand from 2016 to 2030 based on three scenarios.The trends of the affecting factors are described as follows: GDP.A declining trend is observed for the GDP growth of China in recent years.For instance, the annual GDP average growth rates of China during 2005-2010 and 2010-2015 are 11.36% and 7.86%, respectively.Today, China has entered a stage of development called "new normal," which indicates that significant uncertainties may occur in the economic development of China.Hence, the possible impacts of different economic growth rates on electricity consumption should be considered.We set three scenarios for economic growth similar to Lin and Ouyang [3]: high (Scenario A), business as usual (Scenario B), and low (Scenario C).In Scenario A, the average growth rate of GDP between 2016 and 2020 is set as 7%.In Scenario B, the average growth rate is 6.5% because China has to fulfill its national goal established in the 13th Five-Year Plan.In Scenario C, the average growth rate is assigned as 6% in 2016-2020, the lowest among the three scenarios.Additional details can be found in Table 5.
Population.The annual growth rate of population in 2010-2015 was approximately 0.5%.We assume that the implementation of the current "two-child policy" will promote population growth.Therefore, we hypothesize that the annual growth rates of population for the periods 2016-2020, 2021-2025, and 2026-2030 are 0.60%, 0.70%, and 0.75%, respectively.
Economic Structure.According to the 13th Five-Year Plan of China, the share of tertiary industries to GDP will be over 56% in 2020, indicating that the annual growth rate of tertiary industries to GDP will be at least 2.09% during 2016-2020.Currently, the economy of China is transitioning from primary to secondary to tertiary industries.Therefore, the annual growth rates of tertiary industries to GDP for the periods of 2021-2025 and 2026-2030 are assumed to be 2.2% and 2.5%, respectively.
Urbanization. The rapid urbanization process of China is expected to end in 2020 [41]. Additionally, the urbanization process of China will follow an S-curve track, which is similar to the historical experience of most developed countries [3]. We assume that the average annual growth rate of urbanization will decelerate over the forecast horizon, falling to 1.5% in 2016-2020, 0.8% in 2021-2025, and 0.4% in 2026-2030. In summary, the projected growth rates of GDP, population, economic structure, and urbanization for the three scenarios are shown in Table 5.
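The piecewise growth assumptions above can be turned into year-by-year paths by simple compounding. The following is a minimal sketch rather than the procedure used in this study: the names `PERIODS`, `RATES`, and `index_path` are hypothetical, each factor is expressed as an index with 2015 = 1.0 because the 2015 levels are not stated in this section, and the 2.09% rate for the tertiary share is the lower bound quoted in the text. GDP is omitted because its growth rates after 2020 are given only in Table 5.

```python
# Piecewise annual growth rates (in %) quoted in the text above.
# Each factor is tracked as an index with 2015 = 1.0; absolute 2015
# levels are not given in this section, so none are assumed here.
PERIODS = [(2016, 2020), (2021, 2025), (2026, 2030)]

RATES = {
    "population": [0.60, 0.70, 0.75],
    "tertiary_share": [2.09, 2.20, 2.50],  # 2.09 is a lower bound in the text
    "urbanization": [1.50, 0.80, 0.40],
}


def index_path(rates_pct, periods=PERIODS, base=1.0):
    """Return {year: index} by compounding each period's annual rate."""
    path = {}
    level = base
    for (start, end), rate in zip(periods, rates_pct):
        for year in range(start, end + 1):
            level *= 1.0 + rate / 100.0
            path[year] = level
    return path


paths = {name: index_path(rates) for name, rates in RATES.items()}

for name, path in paths.items():
    print(f"{name:15s} 2020={path[2020]:.3f} "
          f"2025={path[2025]:.3f} 2030={path[2030]:.3f}")

# Back out the 2015 tertiary share implied by the 56% target for 2020
# and the 2.09% annual growth rate quoted in the text.
implied_2015_share = 56.0 / paths["tertiary_share"][2020]
print(f"implied 2015 tertiary share: {implied_2015_share:.1f}%")
```

Working with indices keeps the sketch independent of the base-year data. Multiplying each path by the observed 2015 value of the corresponding variable would give the inputs needed by the linear and quadratic forms of the model.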
The electricity demand of China can be forecasted after the assumptions of the factors are established. Figures 6, 7, and 8 show the national electricity demand estimates obtained with the linear and quadratic forms of the model under Scenarios A, B, and C, respectively.
The electricity demand of China will continue growing in the medium and long term regardless of the adjustments in economic structure. Under the high-growth scenario (Scenario A), the electricity demand will still increase rapidly because of the economic growth, urbanization process, and population growth of China. However, the annual growth rate of electricity demand will decrease to 5.8% during the 2016-2030 period in Scenario A owing to the decline in the economic growth rate and the adjustment in economic structure. This value is much lower than that in the period of 2000-2015. In 2020, 2025, and 2030, the electricity demand of China will be 8.2585, 11.139, and 13.821 trillion KWH, respectively, according to the quadratic form of this model (Figure 6). The minimum predicted values for 2020, 2025, and 2030 under Scenario A are obtained with the linear form of the optimal model and are 8.0322, 10.758, and 13.302 trillion KWH, respectively. The maximal predicted electricity demand in Scenario B in 2020 is about 8 trillion KWH (Figure 7), whereas the minimum is 7.8806 trillion KWH according to the linear form of the model. By 2025, the electricity demand in Scenario B will be about 0.4 trillion KWH lower than that in Scenario A because of the lower GDP growth rate. In 2030, the electricity demand in Scenario B will be lower than that in Scenario A by less than 0.6 trillion KWH. As shown in Figure 8, the electricity demand of China is the lowest under Scenario C. In this scenario, the electricity demand will be about 7.9, 10.3, and 12.4 trillion KWH in 2020, 2025, and 2030, respectively.
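The slowdown described above can be checked directly from the point forecasts quoted for Scenario A. The sketch below is only an illustration, not part of the forecasting model: the helper names `cagr` and `period_growth` are hypothetical, and the only inputs are the six Scenario A values given in the text. It computes the compound annual growth rate implied between consecutive milestone years and the gap between the quadratic and linear forms.

```python
# Point forecasts (trillion KWH) quoted in the text for Scenario A.
SCENARIO_A = {
    "quadratic": {2020: 8.2585, 2025: 11.139, 2030: 13.821},
    "linear": {2020: 8.0322, 2025: 10.758, 2030: 13.302},
}


def cagr(start_value, end_value, years):
    """Compound annual growth rate linking two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def period_growth(forecast):
    """Map each pair of consecutive milestone years to its implied CAGR."""
    years = sorted(forecast)
    return {
        (a, b): cagr(forecast[a], forecast[b], b - a)
        for a, b in zip(years, years[1:])
    }


growth = {form: period_growth(values) for form, values in SCENARIO_A.items()}

for form, rates in growth.items():
    for (start, end), rate in rates.items():
        print(f"{form:9s} {start}-{end}: {rate * 100:.2f}% per year")

# Gap between the two functional forms in each milestone year.
spread = {
    year: SCENARIO_A["quadratic"][year] - SCENARIO_A["linear"][year]
    for year in (2020, 2025, 2030)
}
for year, gap in spread.items():
    print(f"{year}: quadratic exceeds linear by {gap:.4f} trillion KWH")
```

For both forms, the implied growth rate over 2025-2030 is lower than that over 2020-2025, which is consistent with the decelerating trend reported in this section. The same helpers could be applied to Scenarios B and C once their full forecast values are tabulated.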

Results and Policy Conclusions
In this study, we develop a new framework to predict energy demand based on the conventional AI models and cointegration theory. To develop energy forecasts, we emphasize the use of appropriate data and econometric techniques rather than the computer packages for demand estimation used in previous studies. In this new framework, the factors affecting energy demand, which are used as the independent variables in the prediction model, are determined based on theoretical analysis and selected by statistical and econometric tests. Finally, the future electricity demands of China from 2016 to 2030 are predicted with the modified AI-based model as an example of the new framework. Compared with several previous AI-based studies, the present forecasting model shows better performance in forecasting electricity demand.
The prediction results of electricity demand indicate that population growth, economic growth, and urbanization are the leading forces contributing to the increase of electricity demand, whereas economic structure adjustment is responsible for the decline of electricity consumption. Several specific results are as follows. Electricity demand in China is expected to grow in the coming years (i.e., 2016-2030). However, the future annual growth rate is lower than that of the last decades. Based on our analysis, electricity demand will continue to increase at an average annual rate of about 5.5% and will be about 13 trillion KWH in 2030. This value is roughly twice the 2015 level. The forecasts would be valuable for policy makers in China in planning future energy policies.

Figure 1: The frameworks for the conventional and new AI-based energy demand forecasting model.

Figure 3: The flowchart of adaptive genetic algorithm.

Figure 6: Electricity demand in Scenario A.

Figure 7: Electricity demand in Scenario B.

Figure 8: Electricity demand in Scenario C.

Table 1: Definition and description for the variables.
Management. To forecast the electricity demand, we use 31 years of observed data from 1985 to 2015. Electricity consumption in each year is measured in trillion (10^12) kilowatt hours (KWH), and population is measured in 100 million (10^8) persons. GDP data are measured in trillion (10^12) Yuan RMB and adjusted to constant 1985 prices.

Table 2: Results based on Ng-Perron unit root test.

Table 4: Prediction accuracy test for the optimal model.

Table 5: Hypothesis of variables for the three different scenarios (unit: %).