A Novel Robust Grey Model for Forecasting Chinese Electricity Demand



Introduction
Monitoring, forecasting, and early warning of electricity demand are important in both the energy and economic fields, because electricity demand is closely tied to industrial and human activities [1][2][3]. Electricity demand is also an important factor in electricity generation: knowing it helps system operators regulate the generation schedule [4][5][6]. In addition, electricity demand serves, to some degree, as an important indicator of macroeconomic performance for policymakers around the world, since it is highly associated with many economic variables, such as gross domestic product and gross national product [7][8][9]. Therefore, fluctuations in electricity demand can have a great effect on society [10]. Policymakers and electricity producers need an accurate and reliable approach to predicting future electricity demand.
However, predicting future electricity demand is not an easy task in practice [11][12][13]. On the one hand, electricity demand is affected by a large number of factors, including population, economic growth, and climate change [14,15]. These factors are themselves highly uncertain and volatile, which adds to the difficulty of forecasting future electricity demand [16]. On the other hand, the time series of electricity demand itself is not very stable, which makes its prediction challenging [17]. Thus, a robust and reliable forecasting approach is needed from a methodological perspective.
Hernandez et al., who reviewed the literature on electricity demand prediction techniques over the past 40 years, divided these techniques into three groups [18]. The first group applies machine learning techniques [19]. The second group uses statistical approaches [20]. The third group adopts grey models [21]. The machine learning techniques include artificial neural networks and support vector machines. Their disadvantage is that they require a large number of observations for learning. The statistical approaches include parametric, semi-parametric, and non-parametric regression. In the field of time series, statisticians often employ the autoregressive moving average model or the vector autoregressive model to fit the data in sample and to make predictions out of sample. Although the idea behind a statistical model is easy to understand, such models have a drawback: they rely heavily on data collection and parameter estimation. Across these three groups, the grey model is likely to be the most appropriate approach for predicting electricity demand, because it is specialized in coping with poor information and small samples.
The grey model was proposed by Professor Deng in 1982 [22]. It is popular in a great number of prediction applications because it has a strong capability of capturing the characteristics of a system with uncertainty. Besides, the grey model achieves high predictive accuracy when it is applied to a sample with few observations. Like machine learning techniques and statistical approaches, the grey model is in fact a family of models. Among them, the classical grey model, abbreviated as GM (1, 1) in the literature on grey theory, is the most popular and the most widely used. Previous studies have pointed out that the classical grey model deals efficiently with the problems faced by a grey system, particularly insufficient observations and uncertain circumstances. Thus, the grey model is a good choice for predicting future electricity demand.
The prediction of electricity demand should also be regarded as a grey system problem because it is affected by a great deal of uncertainty. Factors including the total population, the level of economic development, and weather conditions all affect the accuracy and reliability of prediction. Unfortunately, we do not know exactly how many factors affect electricity demand, or how they affect it. Besides, some emerging countries like China have only short time series of electricity demand, and at the same time, electricity demand in these countries is increasing rapidly, which makes it difficult to forecast future demand with predictive models such as machine learning techniques and statistical approaches. Therefore, the classical grey model provides a good alternative to machine learning techniques and statistical approaches for predicting future electricity demand.
The literature on predicting electricity demand with grey models is large and is expected to keep growing [23,24]. Hu used a neural network-based grey model to forecast future electricity demand [25]. Zhao et al. proposed a rolling grey model and provided predictions of electricity demand over a relatively long period [26]. Bahrami et al. developed a grey model with a microwave transformation and used it to forecast electricity demand in the relatively short run [27]. Xu considered an optimized algorithm to update the grey model for the projection of Chinese electricity demand [28].
Although the grey models proposed in previous studies have many obvious advantages and perform better than traditional models such as machine learning techniques and statistical approaches, they have some problems. For example, many existing grey models estimate the structural parameters using ordinary least squares estimation, which assumes that there are no outliers in the sample. Recent studies have demonstrated that, when outliers occur in the sample, the grey model suffers from poor robustness as well as low predictive accuracy [29,30]. To solve this problem, we introduce least trimmed squares estimation into the classical grey model, that is, GM (1, 1), and propose a novel robust grey model to predict Chinese electricity demand. In addition, we use the new information priority criterion to further improve the proposed robust grey model. We name the resulting model the novel robust grey model integrating the new information priority criterion, abbreviated as NIPC-RGM (1, 1).
The rest of this paper is organized as follows. Section 2 describes the classical grey model and the proposed robust grey model. Section 3 reports the results, which cover the robustness and the accuracy of the proposed robust model compared with the classical grey model. To test the robustness of the grey model, we introduce a novel approach, the bootstrapping test, whose implementation steps are explained in the corresponding part. Section 4 concludes this paper.

Methods
In this section, we first describe the existing grey model, which is also called GM (1, 1) in the literature on grey theory. Then, we illustrate how the novel robust grey model that we propose in this paper is implemented in practice. The novel model, which uses least trimmed squares to estimate the structural parameters and the new information priority criterion to improve its predictive capability, is abbreviated as NIPC-RGM (1, 1). Finally, we present the approaches used to test the robustness of models to outliers, as well as the indicators of predictive accuracy.

The Existing Grey Model, GM (1, 1).
In this section, we provide a brief introduction to the existing grey model, that is, GM (1, 1). Suppose that there is a time series whose entries are nonnegative. The time series is described as follows:

$$X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(m)\right). \tag{1}$$

Using the first-order accumulative generation operation, we obtain a new time series, which is described as follows:

$$X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(m)\right), \tag{2}$$

where $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$ and $k = 1, 2, \ldots, m$. The following equation:

$$x^{(0)}(k) + \alpha z^{(1)}(k) = \beta \tag{3}$$

is called the basic differential equation for the classical grey model, that is, GM (1, 1). $z^{(1)}(k)$ is calculated by the following formula:

$$z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, m, \tag{4}$$

which is known as the background value in the literature on the grey model. The following equation:

$$\frac{dx^{(1)}(t)}{dt} + \alpha x^{(1)}(t) = \beta \tag{5}$$

is defined as the whitening differential equation for the classical grey model, that is, GM (1, 1).
If we define

$$B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(m) & 1 \end{bmatrix}, \quad Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(m) \end{bmatrix}, \tag{6}$$

then the structural parameters, $[\alpha, \beta]^{T}$, in the classical grey model, that is, GM (1, 1), could be estimated using the least squares, which is described as follows:

$$[\hat{\alpha}, \hat{\beta}]^{T} = \left(B^{T} B\right)^{-1} B^{T} Y. \tag{7}$$

The general solution for the classical grey model, that is, GM (1, 1), could be written as

$$x^{(1)}(t) = C e^{-\alpha t} + \frac{\beta}{\alpha}, \tag{8}$$

where $C$ represents an arbitrary constant.
If we set $x^{(1)}(1)$ to be $x^{(0)}(1)$ and set $t$ to be 1, then the constant $C$ could be calculated by the following formula:

$$C = \left(x^{(0)}(1) - \frac{\beta}{\alpha}\right) e^{\alpha}. \tag{9}$$

We substitute equation (9) into equation (8) and obtain the time response for the classical grey model, that is, GM (1, 1). That is,

$$x^{(1)}(k) = \left(x^{(0)}(1) - \frac{\beta}{\alpha}\right) e^{-\alpha (k-1)} + \frac{\beta}{\alpha}. \tag{10}$$

Thus, using the estimates of the structural parameters obtained from the least squares, the prediction of $x^{(1)}(k)$ could be calculated using the following equation:

$$\hat{x}^{(1)}(k) = \left(x^{(0)}(1) - \frac{\hat{\beta}}{\hat{\alpha}}\right) e^{-\hat{\alpha} (k-1)} + \frac{\hat{\beta}}{\hat{\alpha}}. \tag{11}$$

The prediction of $x^{(0)}(k)$ is obtained by the inverse first-order accumulative generation operation, which is

$$\hat{x}^{(0)}(k) = \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1), \quad k = 2, 3, \ldots \tag{12}$$
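As a minimal sketch of the classical GM (1, 1) described above, the following Python code accumulates the series, builds the background values, estimates the structural parameters by ordinary least squares, and applies the time response and the inverse accumulation. The function names `gm11_fit` and `gm11_predict` and the example series are hypothetical and are not taken from this paper; the series is a synthetic geometric sequence, not electricity demand data.

```python
import numpy as np

def gm11_fit(x0):
    """Estimate (alpha, beta) of GM (1, 1) by ordinary least squares."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                       # accumulated series X^(1)
    z1 = 0.5 * (x1[1:] + x1[:-1])            # background values, k = 2..m
    B = np.column_stack([-z1, np.ones_like(z1)])
    Y = x0[1:]
    alpha, beta = np.linalg.lstsq(B, Y, rcond=None)[0]
    return alpha, beta

def gm11_predict(x0, alpha, beta, horizon=0):
    """Fitted and forecast values of x^(0)(k) for k = 1..m+horizon."""
    x0 = np.asarray(x0, dtype=float)
    k = np.arange(1, len(x0) + horizon + 1)
    # Time response with the oldest value x^(0)(1) as the initial condition.
    x1_hat = (x0[0] - beta / alpha) * np.exp(-alpha * (k - 1)) + beta / alpha
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)             # inverse accumulation
    return x0_hat

# Hypothetical series growing by 10% per step (not data from the paper).
series = [2.0 * 1.1 ** i for i in range(8)]
alpha, beta = gm11_fit(series)
forecast = gm11_predict(series, alpha, beta, horizon=3)
```

A negative `alpha` corresponds to a growing series, since the time response is proportional to `exp(-alpha * (k - 1))`.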

The Novel Robust Grey Model Integrating the New Information Priority Criterion.
Here, we illustrate the structure of the novel robust grey model proposed in this paper, which is expected to be robust to outliers. The novel model applies least trimmed squares to estimate the structural parameters and adopts the new information priority criterion to enhance the accuracy of prediction. We name the novel model the robust grey model integrating the new information priority criterion, abbreviated as NIPC-RGM (1, 1). In the following, we first explain the least trimmed squares method used to estimate the structural parameters. Then, we present the new information priority criterion used to optimize the initial condition of the grey differential equation. Finally, we present a complete algorithm for the novel robust grey model integrating the new information priority criterion, that is, NIPC-RGM (1, 1).

The Least Trimmed Squares.
Compared with the ordinary least squares estimation used to estimate the structural parameters in the classical grey model, the least trimmed squares estimation has two advantages. On the one hand, it ranks the squared residuals and fits only the observations with the smallest ones, which is likely to improve the accuracy of prediction. On the other hand, it reduces the influence of outliers and thus enhances robustness to outliers.

Definition 1.
Suppose that there is a series of points arranged in time order, that is, $\left(z^{(1)}(k), x^{(0)}(k)\right)$, $k = 2, 3, \ldots, m$. It satisfies a simple regression model, that is,

$$x^{(0)}(k) = -\alpha z^{(1)}(k) + \beta + \varepsilon(k), \tag{13}$$

where $\varepsilon(k)$ denotes the residual.

Definition 2.
The structural parameters in the classical grey model are obtained using the ordinary least squares estimation, that is, equation (7), which could be rewritten using the following formula:

$$\left(\alpha_{ols}, \beta_{ols}\right) = \arg\min_{\alpha, \beta} \sum_{k=2}^{m} \varepsilon^{2}(k). \tag{14}$$

Definition 3.
The structural parameters in the novel robust grey model are obtained from the least trimmed squares estimation. That is,

$$\left(\alpha_{lts}, \beta_{lts}\right) = \arg\min_{\alpha, \beta} \sum_{i=1}^{h} \varepsilon^{2}_{(i)}, \tag{15}$$

where $\varepsilon^{2}_{(1)} \le \varepsilon^{2}_{(2)} \le \cdots \le \varepsilon^{2}_{(m-1)}$ are the squared residuals sorted in ascending order and $h$ represents the trimming constant, indicating that the $h$ observations with the smallest residuals are used to estimate the structural parameters in the novel robust grey model. In this paper, we set the trimming constant to be $m/2$. That is, we keep the half of the observations with the smallest residuals to estimate the structural parameters in the novel grey model. Comparing Definition 2 and Definition 3, we see that if the trimming constant is equal to the total number of observations, then the estimates obtained from the least trimmed squares estimation are the same as the estimates obtained from the ordinary least squares estimation. By Definition 3, we also see that the least trimmed squares estimation eliminates the observations with relatively large residuals, which could be regarded as outliers.
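To make the trimming idea concrete, the following Python sketch computes a least trimmed squares fit by brute force: it fits ordinary least squares on every subset of h observations and keeps the fit with the smallest residual sum of squares on its own subset, which is equivalent to minimizing the sum of the h smallest squared residuals. This exhaustive search is an assumption made for illustration, since the paper does not state which algorithm it uses, and it is only practical for short series because the number of subsets grows combinatorially. The function name `lts_fit` and the toy data are hypothetical.

```python
from itertools import combinations

import numpy as np

def lts_fit(B, Y, h):
    """Exact least trimmed squares by enumerating every h-subset of rows."""
    B = np.asarray(B, dtype=float)
    Y = np.asarray(Y, dtype=float)
    best_rss, best_theta = np.inf, None
    for subset in combinations(range(len(Y)), h):
        idx = list(subset)
        # Ordinary least squares restricted to the current subset.
        theta = np.linalg.lstsq(B[idx], Y[idx], rcond=None)[0]
        rss = float(np.sum((Y[idx] - B[idx] @ theta) ** 2))
        if rss < best_rss:
            best_rss, best_theta = rss, theta
    return best_theta

# Hypothetical data: points on the line y = 2 + 3t with one gross outlier.
t = np.arange(8, dtype=float)
y = 2.0 + 3.0 * t
y[5] = 100.0
design = np.column_stack([t, np.ones_like(t)])

slope_lts, intercept_lts = lts_fit(design, y, h=4)
slope_ols, intercept_ols = np.linalg.lstsq(design, y, rcond=None)[0]
```

In this toy example the trimmed fit recovers the underlying line, while the ordinary least squares slope is pulled toward the outlier. Note that h must exceed the number of parameters, otherwise every subset fits perfectly and the criterion cannot discriminate between subsets.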

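The New Information Priority Criterion.
In the literature on the classical grey model, most articles use $x^{(0)}(1)$ as the initial value, that is, the oldest value in the original time series. According to the new information priority criterion, we instead use the newest value, that is, $x^{(1)}(m)$, as the initial value.

Theorem 1. Given $\alpha_{lts}$ and $\beta_{lts}$ obtained from equation (15), the following conclusion holds. The solution of the whitening grey differential equation for the novel robust grey model integrating the new information priority criterion, that is, NIPC-RGM (1, 1), could be written as

$$x^{(1)}(k) = \left(x^{(1)}(m) - \frac{\beta_{lts}}{\alpha_{lts}}\right) e^{-\alpha_{lts}(k-m)} + \frac{\beta_{lts}}{\alpha_{lts}}. \tag{16}$$

The prediction from the above time response is

$$\hat{x}^{(0)}(k) = \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1) = \left(1 - e^{\alpha_{lts}}\right)\left(x^{(1)}(m) - \frac{\beta_{lts}}{\alpha_{lts}}\right) e^{-\alpha_{lts}(k-m)}. \tag{17}$$

Proof. First, we consider a general solution for the novel robust grey model based on least trimmed squares estimation:

$$x^{(1)}(t) = C e^{-\alpha_{lts} t} + \frac{\beta_{lts}}{\alpha_{lts}}. \tag{18}$$

We set $t$ to be $m$ and obtain

$$x^{(1)}(m) = C e^{-\alpha_{lts} m} + \frac{\beta_{lts}}{\alpha_{lts}}. \tag{19}$$

Then, we have

$$C = \left(x^{(1)}(m) - \frac{\beta_{lts}}{\alpha_{lts}}\right) e^{\alpha_{lts} m}. \tag{20}$$

Therefore, the time response would be

$$x^{(1)}(k) = \left(x^{(1)}(m) - \frac{\beta_{lts}}{\alpha_{lts}}\right) e^{-\alpha_{lts}(k-m)} + \frac{\beta_{lts}}{\alpha_{lts}}. \tag{21}$$

So far, Theorem 1 is proved.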

The Novel Robust Grey Model Integrating the New Information Priority Criterion.
Now, we present the complete implementation steps, which could be described as follows:
Step 1. Obtain the raw time series, that is, $X^{(0)}$, as well as its first-order accumulative generating series, that is, $X^{(1)}$.
Step 2. Calculate the background values, that is, $Z^{(1)}$.
Step 3. Estimate the structural parameters using the least trimmed squares estimation.
Step 4. Calculate the predictions of the first-order accumulative generating series, that is, $X^{(1)}$.
Step 5. Obtain the prediction of the raw time series, that is, $X^{(0)}$.
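The five steps above can be sketched in Python as follows. This is an illustrative sketch under several assumptions rather than the authors' implementation: the trimmed fit is found by exhaustive search over subsets, the trimming constant defaults to the integer part of m/2, and the first fitted value is taken to be the first value of the predicted accumulated series. The function name `nipc_rgm11` and the example series are hypothetical.

```python
from itertools import combinations

import numpy as np

def nipc_rgm11(x0, horizon=0, h=None):
    """Sketch of NIPC-RGM (1, 1): trimmed fit plus newest-value anchoring."""
    x0 = np.asarray(x0, dtype=float)
    m = len(x0)
    # Step 1: first-order accumulated series X^(1).
    x1 = np.cumsum(x0)
    # Step 2: background values Z^(1) for k = 2..m.
    z1 = 0.5 * (x1[1:] + x1[:-1])
    B = np.column_stack([-z1, np.ones_like(z1)])
    Y = x0[1:]
    # Step 3: least trimmed squares by exhaustive search (assumption).
    if h is None:
        h = m // 2  # the paper sets h = m/2; integer division is assumed
    best_rss, theta = np.inf, None
    for subset in combinations(range(m - 1), h):
        idx = list(subset)
        cand = np.linalg.lstsq(B[idx], Y[idx], rcond=None)[0]
        rss = float(np.sum((Y[idx] - B[idx] @ cand) ** 2))
        if rss < best_rss:
            best_rss, theta = rss, cand
    alpha, beta = theta
    # Step 4: time response anchored at the newest accumulated value x^(1)(m).
    k = np.arange(1, m + horizon + 1)
    x1_hat = (x1[-1] - beta / alpha) * np.exp(-alpha * (k - m)) + beta / alpha
    # Step 5: inverse accumulation; the first entry is x1_hat(1) by assumption.
    x0_hat = np.concatenate([[x1_hat[0]], np.diff(x1_hat)])
    return alpha, beta, x0_hat

# Hypothetical series with one artificial outlier (not data from the paper).
series = [2.0 * 1.1 ** i for i in range(8)]
series[6] = 9.0
alpha, beta, fitted = nipc_rgm11(series, horizon=2)
```

Because the time response passes through $x^{(1)}(m)$, the first m fitted values sum to the observed total of the series. The same anchoring means that an outlier in the series still enters the prediction through $x^{(1)}(m)$, even when the trimmed fit excludes it from the parameter estimates.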

Tests for Robustness and Accuracy.
The evaluation of the novel robust grey model integrating the new information priority criterion covers two aspects, that is, the robustness and the accuracy of the proposed model.
To evaluate the robustness of the novel robust grey model integrating the new information priority criterion, we perform a series of bootstrapping tests. First, we randomly draw a number from the interval between the minimum and the maximum of the original time series. Second, we replace the original value in a particular year, such as 2020, with this random number, which forms a simulated time series with an outlier in that year. Third, we apply the novel robust grey model integrating the new information priority criterion to make predictions based on the simulated time series. Fourth, we calculate the mean absolute percentage error between the predictions and the values in the original time series. Fifth, we repeat the above steps 1000 times and obtain an empirical distribution of mean absolute percentage errors for the year in which the outlier occurs. Sixth, we perform the same bootstrapping test for the classical grey model. Seventh, we compare the distribution from the novel robust grey model integrating the new information priority criterion with the distribution from the classical grey model. If the range of the distribution from the novel model is smaller than the range of the distribution from the classical grey model, then we conclude that the novel robust grey model integrating the new information priority criterion is more robust than the classical grey model, GM (1, 1). If, on the other hand, the novel model has a larger range of the distribution than the classical grey model, then we conclude that it is not more robust than the classical grey model.
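The bootstrapping test can be sketched in Python as follows, here applied to the classical GM (1, 1) only. The sketch assumes that the random replacement value is drawn uniformly from the interval between the minimum and the maximum of the series and that the error is computed over the fitted in-sample values; the function names, the fixed random seed, the reduced number of repetitions, and the example series are hypothetical choices for illustration.

```python
import numpy as np

def gm11_fitted(x0):
    """In-sample fitted values of the classical GM (1, 1)."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)
    z1 = 0.5 * (x1[1:] + x1[:-1])
    B = np.column_stack([-z1, np.ones_like(z1)])
    alpha, beta = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    k = np.arange(1, len(x0) + 1)
    x1_hat = (x0[0] - beta / alpha) * np.exp(-alpha * (k - 1)) + beta / alpha
    return np.concatenate([[x0[0]], np.diff(x1_hat)])

def mape(actual, predicted):
    """Mean absolute percentage error, reported as a fraction."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)))

def bootstrap_mape(series, position, model, n_rep=1000, seed=0):
    """Empirical distribution of MAPE when series[position] is an outlier."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    lo, hi = series.min(), series.max()
    out = np.empty(n_rep)
    for r in range(n_rep):
        contaminated = series.copy()
        contaminated[position] = rng.uniform(lo, hi)  # artificial outlier
        # Errors are measured against the original, uncontaminated series.
        out[r] = mape(series, model(contaminated))
    return out

# Hypothetical series (not data from the paper); outlier placed at index 3.
series = np.array([2.0 * 1.1 ** i for i in range(8)])
dist = bootstrap_mape(series, position=3, model=gm11_fitted, n_rep=200)
spread = dist.max() - dist.min()
```

Passing a different `model` function with the same signature yields the distribution for another model, so that the two values of `spread` can be compared as described above.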
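To compare the accuracy of the novel robust grey model integrating the new information priority criterion with that of the classical grey model, we use two statistical indicators. They are the correlation coefficient and the mean absolute percentage error, which are defined as

$$R = \frac{\sum_{k=1}^{n}\left(x^{(0)}(k) - \bar{x}\right)\left(\hat{x}^{(0)}(k) - \bar{\hat{x}}\right)}{\sqrt{\sum_{k=1}^{n}\left(x^{(0)}(k) - \bar{x}\right)^{2}}\sqrt{\sum_{k=1}^{n}\left(\hat{x}^{(0)}(k) - \bar{\hat{x}}\right)^{2}}}, \qquad \mathrm{MAPE} = \frac{1}{n}\sum_{k=1}^{n}\left|\frac{\hat{x}^{(0)}(k) - x^{(0)}(k)}{x^{(0)}(k)}\right|,$$

where $n$ is the number of values compared and $\bar{x}$ and $\bar{\hat{x}}$ denote the means of the observed and predicted values, respectively.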
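A minimal Python sketch of the two indicators follows. It assumes that the correlation coefficient is the Pearson coefficient between observed and predicted values and that the mean absolute percentage error is reported as a fraction rather than multiplied by 100; the function names and the numbers in the example are hypothetical.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, reported as a fraction."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)))

def correlation(actual, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    return float(np.corrcoef(actual, predicted)[0, 1])

# Hypothetical values: each prediction is off by 10% of the observed value.
example_mape = mape([100.0, 200.0], [110.0, 180.0])
example_r = correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```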

Results
In this section, we first investigate the robustness of the novel robust grey model integrating the new information priority criterion, using the bootstrapping technique described in detail in the previous section. Then, we compare the predictive accuracy of the novel robust grey model integrating the new information priority criterion with that of the classical grey model. Finally, we forecast the future values of Chinese electricity demand during the years 2023 to 2025.

The Robustness to Outliers.
In Figure 1, we plot the empirical distribution of mean absolute percentage errors from the bootstrapping test for a particular year in which an outlier occurs, using the classical grey model over the period from 2011 to 2018. Figure 1 is divided into eight panels, one for each year: Panel A for 2011, Panel B for 2012, Panel C for 2013, Panel D for 2014, Panel E for 2015, Panel F for 2016, Panel G for 2017, and Panel H for 2018. In each panel, we set the value in the corresponding year as an outlier, which is repeatedly drawn using the bootstrapping technique from the interval between the minimum and the maximum of the original time series.
In Figure 2, we plot the empirical distribution of mean absolute percentage errors from the bootstrapping test for a particular year in which an outlier occurs, using the novel robust grey model integrating the new information priority criterion over the period from 2011 to 2018. Figure 2 is also divided into eight panels, A to H, for the years 2011 to 2018. In each panel, we set the value in the corresponding year as an outlier that is repeatedly drawn using the bootstrapping technique from the interval between the minimum and the maximum of the original time series, so that each panel corresponds to the panel with the same letter in Figure 1.
Comparing Figure 1 with Figure 2, we see that in all panels except Panel A, the range of the distribution from the novel robust grey model integrating the new information priority criterion is smaller than the range of the distribution from the classical grey model in the corresponding year when an outlier occurs. For example, the range of the distribution in Panel D of Figure 2 is about half the range of the distribution in Panel D of Figure 1, suggesting that the novel robust grey model integrating the new information priority criterion is more robust than the classical grey model.
Besides, we see that the means of the distributions from the novel robust grey model integrating the new information priority criterion are closer to zero than the means of the distributions from the classical grey model in the corresponding years when an outlier occurs. For example, the mean in Panel B of Figure 1 is larger than 0.04, while the mean in Panel B of Figure 2 is smaller than 0.04, indicating that the novel robust grey model integrating the new information priority criterion could have a higher predictive accuracy than the classical grey model.

The Accuracy of Prediction.
Here, we test the accuracy of prediction. We divide the total dataset into two datasets. One covers the period 2011 to 2018, while the other covers the period 2019 to 2021. The former is referred to as the training set, and the latter as the test set. To illustrate that the novel robust grey model integrating the new information priority criterion predicts better than the classical grey model whether or not an outlier occurs in the sample, we consider two settings: one with an outlier and one without. In the setting with an outlier, we replace the value in 2015, whose real value is 5.801, with an outlier of 9.180.
Table 1 reports the results for the period 2011 to 2021 when there is an outlier in the sample. In columns (1) and (2), we provide the predicted values and absolute percentage errors, respectively, obtained from the classical grey model, that is, GM (1, 1). In columns (3) and (4), we provide the predicted values and absolute percentage errors, respectively, from the novel robust grey model integrating the new information priority criterion, that is, NIPC-RGM (1, 1). At the bottom of Table 1, we report the mean absolute percentage error and the correlation coefficient, which are the indicators of predictive accuracy. From Table 1, we see that the correlation coefficients are the same for the two models, while the novel robust grey model integrating the new information priority criterion has a lower mean absolute percentage error in the test set than the classical grey model, indicating that the former has a higher predictive accuracy than the latter when an outlier occurs in the sample.
Table 2 reports the results for the period 2011 to 2021 when there is no outlier in the sample. In columns (1) and (2), we provide the predicted values and absolute percentage errors, respectively, obtained from the classical grey model, that is, GM (1, 1). In columns (3) and (4), we provide the predicted values and absolute percentage errors, respectively, from the novel robust grey model integrating the new information priority criterion, that is, NIPC-RGM (1, 1). At the bottom of Table 2, we report the mean absolute percentage error and the correlation coefficient, which are the indicators of predictive accuracy. From Table 2, we find that the novel robust grey model integrating the new information priority criterion has the same correlation coefficient as the classical grey model, while the former has a lower mean absolute percentage error in the test set than the latter, suggesting that the novel robust grey model integrating the new information priority criterion has a higher predictive accuracy than the classical grey model when there is no outlier in the sample.

The Forecasts of Chinese Electricity Demand during the Period 2023 to 2025.
Previously, we illustrated the robustness and the predictive accuracy of our proposed novel robust grey model integrating the new information priority criterion. Here, we apply this model to forecast the future values of Chinese electricity demand from 2022 to 2025, that is, the values in the next four years. Table 3 reports the results. From Table 3, we see that Chinese electricity demand would continue to rise at a quicker pace in the next four years.

Conclusion
In this paper, we propose a novel robust grey model based on the least trimmed squares estimation. The novel grey model also integrates the new information priority criterion. We refer to it as the novel robust grey model integrating the new information priority criterion, abbreviated as NIPC-RGM (1, 1). We describe the implementation steps for the novel robust grey model integrating the new information priority criterion. We also provide evidence that it performs better than the classical grey model in predicting Chinese electricity demand.
Our work contributes to the literature on grey models by focusing on issues related to outliers, which often occur in practice because of an incorrect record or an accidental equipment failure. These issues have been little explored in the literature on grey models, although recent studies have pointed out that, when outliers occur in the sample, the grey model suffers from poor robustness and low predictive accuracy. In this paper, we try to solve this problem. We introduce least trimmed squares estimation to estimate the structural parameters in the classical grey model. Our study also proposes a novel approach to test and illustrate the robustness of grey models, which uses the bootstrapping technique to form new samples that include artificial outliers. This approach could also be generalized to compare robustness across a set of grey models, and between grey models and other predictive models such as autoregressive integrated moving average models and machine learning models. In addition, we apply our novel robust grey model to predict Chinese electricity demand, which is a time series with large uncertainty. We find that the robustness to outliers is better when the series is modeled by the novel robust grey model than when it is modeled by the classical grey model. Finally, we see that the accuracy of prediction is also better when the series is modeled by the novel robust grey model than when it is modeled by the classical grey model.
Of course, our work has limitations. For example, in this paper, we set the trimming constant to be half of the number of observations and do not consider other values of the trimming constant, under which the novel robust grey model integrating the new information priority criterion could have a higher predictive accuracy. Future research is needed to investigate whether the value of the trimming constant affects the predictive accuracy of the novel robust grey model integrating the new information priority criterion. Besides, future work should compare the novel robust grey model integrating the new information priority criterion with other robust grey models.

Data Availability
The data used to support the findings of this study are included within the article.

Disclosure
This paper does not reflect an official statement or opinion from the organizations.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
All authors equally contributed to this paper. Cong Wei conceptualized the study and provided supervision. Jiayang Kong collected the data, conducted the statistical analysis, and drafted the manuscript. Riquan Yao and Shaojun Jin contributed to the interpretation of the results. All authors provided critical feedback on drafts and approved the final manuscript.

Note. The unit of electricity demand is $10^{5}$ million kW·h.
Discrete Dynamics in Nature and Society