Grey prediction models have been widely used in many fields owing to their high prediction accuracy. A large number of grey models exist for equidistant sequences, but research on nonequidistant sequences remains limited, and the development of nonequidistant grey prediction models has been slow because of their complex modeling mechanism. To further extend grey system theory, this paper establishes a new nonequidistant grey prediction model, NEGM (1, 1, t2). To improve its prediction accuracy, the background values of the model are optimized with the Simpson formula; the improved model is abbreviated as INEGM (1, 1, t2). To verify its validity, the proposed model is applied to two real-world cases and compared with three benchmark models, and the modeling results are evaluated with several commonly used indicators. In both cases, the INEGM (1, 1, t2) model gives the best prediction performance among the competing models.
1. Introduction
Time-series forecasting has received extensive attention in the past decades, and plenty of approaches exist for time-series analysis and forecasting. Following [1, 2] and the references therein, these methods can be divided into three categories: statistical methods (e.g., regression analysis [3], functional state space model [4], logistic regression [5], spatial-temporal model [6], and Markov chain model [7]), machine learning methods [8–11], and grey modeling techniques [12–14]. Each method has its own advantages and limitations [15]. Statistical and machine learning methods require a large body of data in the modeling procedure; however, in practical applications it is often difficult to collect enough data for model calibration [16–18]. This is the principal reason that grey models have been widely used in various disciplines [19, 20].
Grey prediction models, proposed by Deng [21], have been widely used in many fields owing to their high prediction accuracy. In particular, the GM (1, 1, t2) model, pioneered by Qian et al. [22], is an important branch of grey prediction modeling. Many scholars have worked to improve its prediction accuracy and applicability. For example, Luo and Wei [23] constructed the expectation function from the sum of squared errors and solved for the optimal constant in the time response function. Considering that the background values are a key factor affecting the prediction accuracy of the GM (1, 1, t2) model, Wei et al. [24] used linear interpolation to reconstruct them. Although these studies have improved the prediction accuracy of the GM (1, 1, t2) model for equidistant time series, to the best of our knowledge, research on nonequidistant time series is scarce. In the real world, however, nonequidistant time series are common, for example, samplings in reaction furnaces and building settlements. On this basis, this paper develops a novel nonequidistant grey model by combining the existing GM (1, 1, t2) model with the concept of nonequidistant time series. To further enhance prediction performance, the Simpson formula is applied to optimize the background value of the nonequidistant GM (1, 1, t2) model; the resulting improved model is denoted as INEGM (1, 1, t2). The main contributions of this paper are summarized as follows:
A conventional nonequidistant GM (1, 1, t2) model is constructed by incorporating the concept of nonequidistant time series into the existing GM (1, 1, t2) model
The Simpson formula is applied to optimize the background value of the conventional nonequidistant GM (1, 1, t2) model and thereby improve prediction performance
Two real-world cases are used to verify the validity and superiority of the proposed model in comparison with other benchmark models
The rest of this paper is organized as follows. Section 2 introduces the conventional nonequidistant GM (1, 1, t2) model and analyzes its discretization error. Section 3 optimizes the nonequidistant GM (1, 1, t2) model by using the Simpson formula. Section 4 verifies the applicability of the proposed model, and Section 5 concludes the paper.
2. Classic Nonequidistant GM (1, 1, t2) Model and Its Error Analysis
2.1. Nonequidistant GM (1, 1, t2) Model
In accordance with the description in [25], it is easy to establish the nonequidistant GM (1, 1, t2) model (denoted as NEGM (1, 1, t2)), whose modeling process can be outlined as follows.
Step 1.
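Suppose that the original time series is $X^{(0)}=\{x^{(0)}(k_1),x^{(0)}(k_2),\ldots,x^{(0)}(k_n)\}$, where $\Delta k_i=k_i-k_{i-1}\neq\text{const}$, $i=2,3,\ldots,n$; then $X^{(1)}=\{x^{(1)}(k_1),x^{(1)}(k_2),\ldots,x^{(1)}(k_n)\}$ denotes the first-order accumulated generating operation sequence of $X^{(0)}$, where
$$x^{(1)}(k_i)=\sum_{j=1}^{i}x^{(0)}(k_j)\,\Delta k_j,\quad i=1,2,\ldots,n, \tag{1}$$
with the convention $\Delta k_1=1$, so that $x^{(1)}(k_i)-x^{(1)}(k_{i-1})=x^{(0)}(k_i)\,\Delta k_i$, as used in equations (5) and (7).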
Step 2.
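The differential equation of the NEGM (1, 1, t2) model is expressed as
$$\frac{dx^{(1)}(t)}{dt}+ax^{(1)}(t)=bt^{2}+ct+d. \tag{2}$$
Then, the discrete form of equation (2) is
$$x^{(1)}(k_i)-x^{(1)}(k_{i-1})+a\,z^{(1)}(k_i)\,\Delta k_i=\frac{b}{3}\left(k_i^{3}-k_{i-1}^{3}\right)+\frac{c}{2}\left(k_i^{2}-k_{i-1}^{2}\right)+d\,\Delta k_i, \tag{3}$$
where $z^{(1)}(k_i)=0.5\left(x^{(1)}(k_i)+x^{(1)}(k_{i-1})\right)$, $i=2,3,\ldots,n$, is the background value.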
Step 3.
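The model parameters can be calculated as
$$(a,b,c,d)^{T}=\left(B^{T}B\right)^{-1}B^{T}Y, \tag{4}$$
where
$$B=\begin{pmatrix}-z^{(1)}(k_2)\,\Delta k_2 & \frac{1}{3}\left(k_2^{3}-k_1^{3}\right) & \frac{1}{2}\left(k_2^{2}-k_1^{2}\right) & \Delta k_2\\ -z^{(1)}(k_3)\,\Delta k_3 & \frac{1}{3}\left(k_3^{3}-k_2^{3}\right) & \frac{1}{2}\left(k_3^{2}-k_2^{2}\right) & \Delta k_3\\ \vdots & \vdots & \vdots & \vdots\\ -z^{(1)}(k_n)\,\Delta k_n & \frac{1}{3}\left(k_n^{3}-k_{n-1}^{3}\right) & \frac{1}{2}\left(k_n^{2}-k_{n-1}^{2}\right) & \Delta k_n\end{pmatrix},\quad Y=\left(x^{(0)}(k_2)\,\Delta k_2,\;x^{(0)}(k_3)\,\Delta k_3,\;\ldots,\;x^{(0)}(k_n)\,\Delta k_n\right)^{T}. \tag{5}$$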
Step 4.
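The time response function of equation (2) is expressed as
$$\tilde{x}^{(1)}(k_i)=e^{-a(k_i-k_1)}\left[x^{(1)}(k_1)-\left(\frac{b}{a}k_1^{2}+\frac{ac-2b}{a^{2}}k_1+\frac{2b-ac+a^{2}d}{a^{3}}\right)\right]+\frac{b}{a}k_i^{2}+\frac{ac-2b}{a^{2}}k_i+\frac{2b-ac+a^{2}d}{a^{3}}. \tag{6}$$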
Step 5.
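The predicted values of the original sequence are expressed as
$$\tilde{x}^{(0)}(k_i)=\begin{cases}\dfrac{\tilde{x}^{(1)}(k_i)-\tilde{x}^{(1)}(k_{i-1})}{\Delta k_i}, & i=2,3,\ldots,\\[2mm] x^{(0)}(k_i), & i=1.\end{cases} \tag{7}$$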
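Steps 1 to 5 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `fit_negm` and `predict_negm` are hypothetical, the accumulation is assumed to be weighted by $\Delta k_j$ with $\Delta k_1=1$, and the parameters are obtained with NumPy's least-squares solver instead of forming $(B^{T}B)^{-1}$ explicitly.

```python
import numpy as np

def fit_negm(k, x0):
    """Estimate (a, b, c, d) by least squares, following equations (3)-(5)."""
    k = np.asarray(k, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    dk = np.diff(k, prepend=k[0] - 1.0)      # assumption: Delta k_1 = 1
    x1 = np.cumsum(x0 * dk)                  # accumulated sequence X^(1)
    z = 0.5 * (x1[1:] + x1[:-1])             # trapezoid background values
    B = np.column_stack([
        -z * dk[1:],
        (k[1:] ** 3 - k[:-1] ** 3) / 3.0,
        (k[1:] ** 2 - k[:-1] ** 2) / 2.0,
        dk[1:],
    ])
    Y = x0[1:] * dk[1:]
    params, *_ = np.linalg.lstsq(B, Y, rcond=None)
    return params, x1[0]

def predict_negm(params, x1_first, k):
    """Evaluate equations (6) and (7); k[0] must be the first node k_1."""
    a, b, c, d = params
    k = np.asarray(k, dtype=float)

    def particular(t):
        # Polynomial part of the time response function in equation (6).
        return (b / a) * t ** 2 + (a * c - 2 * b) / a ** 2 * t \
            + (2 * b - a * c + a ** 2 * d) / a ** 3

    x1_hat = np.exp(-a * (k - k[0])) * (x1_first - particular(k[0])) + particular(k)
    x0_hat = np.empty_like(k)
    x0_hat[0] = x1_first                     # equals x0(k_1) because Delta k_1 = 1
    x0_hat[1:] = np.diff(x1_hat) / np.diff(k)
    return x0_hat

# Calibration part of the Case 1 settlement series (first seven points).
k_fit = [1, 3, 5, 8, 10, 12, 15]
x0_fit = [78.60, 78.40, 80.30, 80.70, 81.10, 81.50, 82.10]
params, x1_first = fit_negm(k_fit, x0_fit)
fitted = predict_negm(params, x1_first, k_fit)
print(params)
print(fitted)
```

Passing a longer node vector to `predict_negm` (still starting at $k_1$) would give out-of-sample values. The numbers produced by this sketch need not coincide with those reported in the paper, since implementation details such as the solver may differ.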
2.2. Error Analysis
A review of the above modeling procedure shows that the prediction accuracy of the NEGM (1, 1, t2) model depends on the model parameters, which are closely tied to the background value; the background value is therefore essential for prediction precision. In the classic NEGM (1, 1, t2) model, both sides of equation (2) are integrated over the interval $[k_{i-1},k_i]$:
$$\int_{k_{i-1}}^{k_i}\frac{dx^{(1)}}{dt}\,dt+a\int_{k_{i-1}}^{k_i}x^{(1)}\,dt=\int_{k_{i-1}}^{k_i}\left(bt^{2}+ct+d\right)dt\;\Rightarrow\;x^{(1)}(k_i)-x^{(1)}(k_{i-1})+a\int_{k_{i-1}}^{k_i}x^{(1)}\,dt=\frac{b}{3}\left(k_i^{3}-k_{i-1}^{3}\right)+\frac{c}{2}\left(k_i^{2}-k_{i-1}^{2}\right)+d\,\Delta k_i. \tag{8}$$
Comparing equations (3) and (8) shows that the trapezoid formula is used to approximate the integral $\int_{k_{i-1}}^{k_i}x^{(1)}\,dt$, which introduces a discretization error. Specifically, when the integrand $x^{(1)}(t)$ is concave upward (convex) on the interval, the approximate value is larger than the actual one; when it is convex upward (concave), the approximate value is smaller than the actual one.
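The sign of the trapezoid error can be checked numerically. The sketch below uses two illustrative integrands that are not taken from the paper, $t^{2}$ (concave upward) and $\sqrt{t}$ (convex upward), on the arbitrary interval $[1,3]$; the helper `trapezoid` is a hypothetical name.

```python
import math

def trapezoid(f, lo, hi):
    """Single-interval trapezoid rule, as used for the background value."""
    return 0.5 * (f(lo) + f(hi)) * (hi - lo)

lo, hi = 1.0, 3.0

# Concave-upward integrand t^2: exact integral is (hi^3 - lo^3) / 3.
exact_convex = (hi ** 3 - lo ** 3) / 3.0
err_convex = trapezoid(lambda t: t ** 2, lo, hi) - exact_convex

# Convex-upward integrand sqrt(t): exact integral is (2/3)(hi^1.5 - lo^1.5).
exact_concave = (2.0 / 3.0) * (hi ** 1.5 - lo ** 1.5)
err_concave = trapezoid(math.sqrt, lo, hi) - exact_concave

print(err_convex > 0)    # trapezoid overestimates
print(err_concave < 0)   # trapezoid underestimates
```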
3. Presentation of INEGM (1, 1, t2)
3.1. Nonequidistant Simpson Numerical Integral
To reduce the discretization error discussed in Section 2.2, this section applies the concept of function approximation to calculate the area of the curved trapezoid. Since an integrable function is readily approximated by a Lagrange interpolation polynomial, we obtain a Lagrange polynomial $\psi(x)$ on the nodes $a\le x_1<x_2<\cdots<x_{n+1}\le b$, denoted as $L_n(x)$, where
$$L_n(x)=\sum_{j=1}^{n+1}l_j(x)\,f(x_j), \tag{9}$$
and $l_j(x)$ is the fundamental polynomial.
Definition 1.
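(see [26]). If the $n$th-degree polynomials
$$l_j(x),\quad j=0,1,\ldots,n, \tag{10}$$
satisfy
$$l_j(x_k)=\begin{cases}1, & k=j,\\ 0, & k\neq j,\end{cases}\quad j,k=0,1,\ldots,n, \tag{11}$$
on the nodes $x_0<x_1<x_2<\cdots<x_n$, then $l_0(x),l_1(x),\ldots,l_n(x)$ are the interpolation basis functions on the nodes $x_0,x_1,x_2,\ldots,x_n$. The Lagrange interpolation polynomial is expressed as
$$L_n(x)=\sum_{k=0}^{n}f(x_k)\,\frac{w_{n+1}(x)}{(x-x_k)\,w_{n+1}'(x_k)}, \tag{12}$$
where
$$w_{n+1}(x)=(x-x_0)(x-x_1)\cdots(x-x_n),\quad w_{n+1}'(x_k)=(x_k-x_0)\cdots(x_k-x_{k-1})(x_k-x_{k+1})\cdots(x_k-x_n). \tag{13}$$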
Therefore,
$$f(x)\approx\sum_{j=1}^{n+1}l_j(x)\,f(x_j). \tag{14}$$
By integrating both sides of equation (14), we obtain
$$I_n(f)=\sum_{j=1}^{n+1}A_j\,f(x_j), \tag{15}$$
where $A_j=\int_a^b l_j(x)\,dx$ is the interpolation coefficient and $I_n(f)$ denotes the interpolation quadrature formula.
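The coefficients $A_j$ in equation (15) can be computed numerically by integrating each Lagrange basis polynomial. The following is a small sketch under that reading; the function name `quadrature_weights` is hypothetical, and NumPy's `poly1d` class is used only as a convenient polynomial container.

```python
import numpy as np

def quadrature_weights(nodes, lo, hi):
    """Return A_j = integral of l_j(x) over [lo, hi] for each node x_j."""
    nodes = np.asarray(nodes, dtype=float)
    weights = []
    for j, xj in enumerate(nodes):
        others = np.delete(nodes, j)
        # l_j(x) = prod_{m != j} (x - x_m) / (x_j - x_m)
        basis = np.poly1d(others, r=True) / np.prod(xj - others)
        antiderivative = basis.integ()
        weights.append(antiderivative(hi) - antiderivative(lo))
    return np.array(weights)

# Three equally spaced nodes recover the classical Simpson weights h/3, 4h/3, h/3.
equal = quadrature_weights([0.0, 1.0, 2.0], 0.0, 2.0)
# Unequally spaced nodes give the nonequidistant analogue.
unequal = quadrature_weights([1.0, 3.0, 8.0], 1.0, 8.0)
print(equal, unequal)
```

With three nodes the weights always sum to the interval length, because the rule integrates constants exactly.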
Theorem 1.
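Assume that
$$\int_{k_{i-1}}^{k_{i+1}}x^{(1)}(t)\,dt=I_{i-1,i+1}\left(x^{(1)}\right); \tag{16}$$
then the nonequidistant Simpson numerical integral formula is given as
$$I_{i-1,i+1}\left(x^{(1)}\right)\approx\frac{\Delta k_i+\Delta k_{i+1}}{6}\left[\frac{2\Delta k_i-\Delta k_{i+1}}{\Delta k_i}\,x_{i-1}^{(1)}+\frac{\left(\Delta k_i+\Delta k_{i+1}\right)^{2}}{\Delta k_i\,\Delta k_{i+1}}\,x_i^{(1)}+\frac{2\Delta k_{i+1}-\Delta k_i}{\Delta k_{i+1}}\,x_{i+1}^{(1)}\right]. \tag{17}$$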
Proof.
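By integrating $x^{(1)}(t)$ over the interval $[k_{i-1},k_{i+1}]$, $i=2,3,\ldots,n-1$, we have
$$I_{i-1,i+1}\left(x^{(1)}\right)\approx\sum_{m=i-1}^{i+1}A_m\,x^{(1)}(k_m)=\sum_{m=i-1}^{i+1}A_m\,x_m^{(1)}. \tag{18}$$
Then, we get
$$l_m(k)=\prod_{\substack{j=i-1\\ j\neq m}}^{i+1}\frac{k-k_j}{k_m-k_j},\quad A_m=\int_{k_{i-1}}^{k_{i+1}}l_m(k)\,dk. \tag{19}$$
When $m=i-1$, we have
$$A_{i-1}=\int_{k_{i-1}}^{k_{i+1}}l_{i-1}(k)\,dk=\int_{k_{i-1}}^{k_{i+1}}\frac{(k-k_i)(k-k_{i+1})}{(k_{i-1}-k_i)(k_{i-1}-k_{i+1})}\,dk=\frac{\left(\Delta k_i+\Delta k_{i+1}\right)\left(2\Delta k_i-\Delta k_{i+1}\right)}{6\,\Delta k_i}. \tag{20}$$
When $m=i$ and $m=i+1$, we have
$$A_i=\frac{\left(\Delta k_i+\Delta k_{i+1}\right)^{3}}{6\,\Delta k_i\,\Delta k_{i+1}},\quad A_{i+1}=\frac{\left(\Delta k_i+\Delta k_{i+1}\right)\left(2\Delta k_{i+1}-\Delta k_i\right)}{6\,\Delta k_{i+1}}. \tag{21}$$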
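The closed-form coefficients in equations (20) and (21) can be sanity-checked numerically. The sketch below (the function name `simpson_weights` is hypothetical) verifies three properties that follow from the derivation: the weights sum to the interval length, they reduce to the classical Simpson weights for equal spacing, and they integrate a quadratic exactly.

```python
import numpy as np

def simpson_weights(dk_i, dk_next):
    """Coefficients (A_{i-1}, A_i, A_{i+1}) from equations (20) and (21)."""
    total = dk_i + dk_next
    a_prev = total * (2 * dk_i - dk_next) / (6 * dk_i)
    a_mid = total ** 3 / (6 * dk_i * dk_next)
    a_next = total * (2 * dk_next - dk_i) / (6 * dk_next)
    return np.array([a_prev, a_mid, a_next])

# Illustrative nodes 1, 3, 8, so Delta k_i = 2 and Delta k_{i+1} = 5.
nodes = np.array([1.0, 3.0, 8.0])
w = simpson_weights(2.0, 5.0)

length_ok = np.isclose(w.sum(), 7.0)                           # sums to k_{i+1} - k_{i-1}
quadratic_ok = np.isclose(w @ nodes ** 2, (8.0 ** 3 - 1.0) / 3.0)
classical_ok = np.allclose(simpson_weights(1.0, 1.0), [1 / 3, 4 / 3, 1 / 3])
print(length_ok, quadratic_ok, classical_ok)
```

Note that one of the outer weights becomes negative when one spacing is more than twice the other (here $A_{i-1}=-7/12$), which is expected for interpolatory rules on strongly uneven nodes.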
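Then, integrating both sides of equation (2) over the interval $[k_{i-1},k_{i+1}]$ and replacing $\int_{k_{i-1}}^{k_{i+1}}x^{(1)}(t)\,dt$ with the nonequidistant Simpson numerical integral formula yields
$$x^{(0)}(k_{i+1})\,\Delta k_{i+1}+x^{(0)}(k_i)\,\Delta k_i+a\,\frac{\Delta k_i+\Delta k_{i+1}}{6}\left[\frac{2\Delta k_i-\Delta k_{i+1}}{\Delta k_i}\,x_{i-1}^{(1)}+\frac{\left(\Delta k_i+\Delta k_{i+1}\right)^{2}}{\Delta k_i\,\Delta k_{i+1}}\,x_i^{(1)}+\frac{2\Delta k_{i+1}-\Delta k_i}{\Delta k_{i+1}}\,x_{i+1}^{(1)}\right]=\frac{b}{3}\left(k_{i+1}^{3}-k_{i-1}^{3}\right)+\frac{c}{2}\left(k_{i+1}^{2}-k_{i-1}^{2}\right)+d\left(\Delta k_{i+1}+\Delta k_i\right). \tag{26}$$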
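Then, the model parameters can be obtained by the least-squares method, expressed as
$$(a,b,c,d)^{T}=\left(\vartheta^{T}\vartheta\right)^{-1}\vartheta^{T}\omega, \tag{27}$$
where
$$\vartheta=\begin{pmatrix}-I_{1,3}\left(x^{(1)}\right) & \frac{1}{3}\left(k_3^{3}-k_1^{3}\right) & \frac{1}{2}\left(k_3^{2}-k_1^{2}\right) & \Delta k_2+\Delta k_3\\ -I_{2,4}\left(x^{(1)}\right) & \frac{1}{3}\left(k_4^{3}-k_2^{3}\right) & \frac{1}{2}\left(k_4^{2}-k_2^{2}\right) & \Delta k_3+\Delta k_4\\ \vdots & \vdots & \vdots & \vdots\\ -I_{n-2,n}\left(x^{(1)}\right) & \frac{1}{3}\left(k_n^{3}-k_{n-2}^{3}\right) & \frac{1}{2}\left(k_n^{2}-k_{n-2}^{2}\right) & \Delta k_{n-1}+\Delta k_n\end{pmatrix},\quad \omega=\begin{pmatrix}x^{(0)}(k_2)\,\Delta k_2+x^{(0)}(k_3)\,\Delta k_3\\ x^{(0)}(k_3)\,\Delta k_3+x^{(0)}(k_4)\,\Delta k_4\\ \vdots\\ x^{(0)}(k_{n-1})\,\Delta k_{n-1}+x^{(0)}(k_n)\,\Delta k_n\end{pmatrix}. \tag{28}$$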
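The estimation in equations (26) to (28) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `simpson_integral` and `fit_inegm` are hypothetical names, the accumulated sequence is assumed to be weighted by $\Delta k_j$ with $\Delta k_1=1$, each row uses the window $[k_{i-1},k_{i+1}]$ so that the polynomial terms match equation (26), and NumPy's least-squares solver replaces the explicit normal equations.

```python
import numpy as np

def simpson_integral(dk_i, dk_next, x_prev, x_mid, x_next):
    """Nonequidistant Simpson approximation of equation (17)."""
    total = dk_i + dk_next
    return total / 6.0 * (
        (2 * dk_i - dk_next) / dk_i * x_prev
        + total ** 2 / (dk_i * dk_next) * x_mid
        + (2 * dk_next - dk_i) / dk_next * x_next
    )

def fit_inegm(k, x0):
    """Least-squares estimate of (a, b, c, d) from equations (26)-(28)."""
    k = np.asarray(k, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    dk = np.diff(k, prepend=k[0] - 1.0)      # assumption: Delta k_1 = 1
    x1 = np.cumsum(x0 * dk)                  # accumulated sequence X^(1)
    # One row per window [k_{i-1}, k_{i+1}] centred on an interior node.
    left, mid, right = slice(0, -2), slice(1, -1), slice(2, None)
    integral = simpson_integral(dk[mid], dk[right], x1[left], x1[mid], x1[right])
    theta = np.column_stack([
        -integral,
        (k[right] ** 3 - k[left] ** 3) / 3.0,
        (k[right] ** 2 - k[left] ** 2) / 2.0,
        dk[mid] + dk[right],
    ])
    omega = x0[mid] * dk[mid] + x0[right] * dk[right]
    params, *_ = np.linalg.lstsq(theta, omega, rcond=None)
    return params

# Calibration part of the Case 1 settlement series (first seven points).
k_fit = [1, 3, 5, 8, 10, 12, 15]
x0_fit = [78.60, 78.40, 80.30, 80.70, 81.10, 81.50, 82.10]
params = fit_inegm(k_fit, x0_fit)
print(params)
```

With $n$ calibration points there are $n-2$ overlapping windows, so at least six points are needed for the four parameters to be overdetermined.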
The time response function and predicted values of the improved model are obtained in the same way as for the classic nonequidistant nonhomogeneous grey model (see Section 2.1).
3.3. Evaluation Model Indicator
In this section, seven statistical metrics are introduced to examine the prediction performance of the proposed model, as shown in Table 1.
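Table 1: Error-value metrics of the prediction model.

| Index | Formula |
| --- | --- |
| Absolute percentage error | $\mathrm{APE}=\left\lvert\dfrac{\hat{x}^{(0)}(k)-x^{(0)}(k)}{x^{(0)}(k)}\right\rvert\times100\%$ |
| Mean absolute percentage error | $\mathrm{MAPE}=\dfrac{1}{n}\sum_{k=1}^{n}\left\lvert\dfrac{\hat{x}^{(0)}(k)-x^{(0)}(k)}{x^{(0)}(k)}\right\rvert\times100\%$ |
| Mean absolute error | $\mathrm{MAE}=\dfrac{1}{n}\sum_{k=1}^{n}\left\lvert\hat{x}^{(0)}(k)-x^{(0)}(k)\right\rvert$ |
| Mean square error | $\mathrm{MSE}=\dfrac{1}{n}\sum_{k=1}^{n}\left(\hat{x}^{(0)}(k)-x^{(0)}(k)\right)^{2}$ |
| Root mean square percentage error | $\mathrm{RMSPE}=\sqrt{\dfrac{1}{n}\sum_{k=1}^{n}\left(\dfrac{\hat{x}^{(0)}(k)-x^{(0)}(k)}{x^{(0)}(k)}\right)^{2}}\times100\%$ |
| Index of agreement | $\mathrm{IA}=1-\dfrac{\sum_{k=1}^{n}\left(\hat{x}^{(0)}(k)-x^{(0)}(k)\right)^{2}}{\sum_{k=1}^{n}\left(\left\lvert\bar{x}-x^{(0)}(k)\right\rvert+\left\lvert\hat{x}^{(0)}(k)-\bar{x}\right\rvert\right)^{2}}$ |
| Correlation coefficient | $R=\dfrac{\mathrm{Cov}\left(\hat{x}^{(0)}(k),x^{(0)}(k)\right)}{\sqrt{\mathrm{Var}\left(\hat{x}^{(0)}(k)\right)\mathrm{Var}\left(x^{(0)}(k)\right)}}$ |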
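The indicators above translate directly into NumPy. The sketch below is one possible reading of the formulas (the function name `error_metrics` is hypothetical, and the square roots in RMSPE and $R$ are assumed from the metric names). The usage example takes the out-of-sample INEGM (1, 1, t2) values reported for Case 1; because those values are rounded to two decimals, the results can only be expected to match the reported indicators approximately.

```python
import numpy as np

def error_metrics(actual, predicted):
    """Compute the indicators of Table 1 for one model."""
    x = np.asarray(actual, dtype=float)
    xh = np.asarray(predicted, dtype=float)
    rel = (xh - x) / x
    xbar = x.mean()
    return {
        "APE": np.abs(rel) * 100,
        "MAPE": np.mean(np.abs(rel)) * 100,
        "MAE": np.mean(np.abs(xh - x)),
        "MSE": np.mean((xh - x) ** 2),
        "RMSPE": np.sqrt(np.mean(rel ** 2)) * 100,
        "IA": 1 - np.sum((xh - x) ** 2)
              / np.sum((np.abs(xbar - x) + np.abs(xh - xbar)) ** 2),
        "R": np.corrcoef(xh, x)[0, 1],
    }

# Case 1, out-of-sample period: observed data and rounded INEGM (1, 1, t2) values.
actual = [82.60, 82.80, 83.70, 83.80, 84.00, 84.70]
inegm = [82.47, 82.84, 83.31, 83.88, 84.54, 85.11]
metrics = error_metrics(actual, inegm)
print({name: np.round(value, 4) for name, value in metrics.items()})
```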
4. Numerical Experiment
In this section, two real-world examples are used to demonstrate the superiority of the proposed model in comparison with other benchmark models including the GM (1, 1), NGM (1, 1, k), and NEGM (1, 1, t2) models.
Case 1.
In this case, the monitoring data of a building's settlement point, listed in Table 2, are used to examine the prediction accuracy of the proposed model. The first seven data points are used to calibrate the competing models, and the remaining six are used to validate prediction performance.
In accordance with the current study and references herein, the time response functions of the competitive models are calculated as follows:
Tables 3 and 4 and Figure 1 show that the predicted values of the NGM (1, 1, k) model are lower than the actual values, while those of the GM (1, 1) model are higher. Meanwhile, the six indicators of the proposed model are, on the whole, better than those of the other benchmarks; therefore, the proposed model has better prediction performance than the other competing models in this experiment.
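Table 2: Monitoring data of building's settlement point [27].

| Time | Data |
| --- | --- |
| 1 | 78.60 |
| 3 | 78.40 |
| 5 | 80.30 |
| 8 | 80.70 |
| 10 | 81.10 |
| 12 | 81.50 |
| 15 | 82.10 |
| 17 | 82.60 |
| 19 | 82.80 |
| 22 | 83.70 |
| 25 | 83.80 |
| 29 | 84.00 |
| 31 | 84.70 |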
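Table 3: Simulative and predictive values by the competitive models in Case 1.

| Period | Time | Data | GM (1, 1) Value | GM (1, 1) APE (%) | NGM (1, 1, k) Value | NGM (1, 1, k) APE (%) | NEGM (1, 1, t2) Value | NEGM (1, 1, t2) APE (%) | INEGM (1, 1, t2) Value | INEGM (1, 1, t2) APE (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| In-sample | 1 | 78.60 | 78.60 | 0.00 | 78.60 | 0.00 | 78.60 | 0.00 | 78.60 | 0.00 |
| In-sample | 3 | 78.40 | 79.06 | 0.84 | 78.67 | 0.34 | 78.67 | 0.35 | 78.52 | 0.16 |
| In-sample | 5 | 80.30 | 79.65 | 0.81 | 79.83 | 0.58 | 80.08 | 0.27 | 80.19 | 0.14 |
| In-sample | 8 | 80.70 | 80.39 | 0.38 | 80.77 | 0.09 | 80.67 | 0.04 | 80.67 | 0.03 |
| In-sample | 10 | 81.10 | 81.15 | 0.06 | 81.37 | 0.33 | 81.66 | 0.08 | 81.15 | 0.03 |
| In-sample | 12 | 81.50 | 81.75 | 0.31 | 81.67 | 0.21 | 81.55 | 0.06 | 81.52 | 0.03 |
| In-sample | 15 | 82.10 | 82.03 | 0.08 | 81.91 | 0.23 | 82.03 | 0.08 | 81.99 | 0.13 |
| Out-of-sample | 17 | 82.60 | 82.51 | 0.10 | 82.07 | 0.64 | 82.52 | 0.10 | 82.47 | 0.16 |
| Out-of-sample | 19 | 82.80 | 83.29 | 0.59 | 82.15 | 0.79 | 82.90 | 0.13 | 82.84 | 0.05 |
| Out-of-sample | 22 | 83.70 | 83.92 | 0.26 | 82.21 | 1.78 | 83.39 | 0.37 | 83.31 | 0.46 |
| Out-of-sample | 25 | 83.80 | 84.70 | 1.08 | 82.26 | 1.84 | 83.97 | 0.20 | 83.88 | 0.10 |
| Out-of-sample | 29 | 84.00 | 85.65 | 1.97 | 82.28 | 2.04 | 84.64 | 0.77 | 84.54 | 0.64 |
| Out-of-sample | 31 | 84.70 | 86.78 | 2.46 | 82.30 | 2.84 | 85.22 | 0.62 | 85.11 | 0.48 |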
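Table 4: Errors by the competitive models in Case 1.

| Period | Index | GM (1, 1) | NGM (1, 1, k) | NEGM (1, 1, t2) | INEGM (1, 1, t2) |
| --- | --- | --- | --- | --- | --- |
| In-sample | MAPE (%) | 0.4129 | 0.2960 | 0.1456 | 0.0904 |
| In-sample | RMSPE (%) | 0.0027 | 0.0011 | 0.0035 | 0.0001 |
| In-sample | MAE | 0.3301 | 0.2382 | 0.1163 | 0.0726 |
| In-sample | MSE | 0.1701 | 0.0716 | 0.0222 | 0.0070 |
| In-sample | IA | 0.9675 | 0.9873 | 0.9960 | 0.9987 |
| In-sample | R | 0.9564 | 0.9819 | 0.9951 | 0.9986 |
| Out-of-sample | MAPE (%) | 1.0750 | 1.6552 | 0.3643 | 0.3152 |
| Out-of-sample | RMSPE (%) | 0.0192 | 0.0330 | 0.0020 | 0.0015 |
| Out-of-sample | MAE | 0.9039 | 1.3891 | 0.3060 | 0.2646 |
| Out-of-sample | MSE | 1.3616 | 2.3374 | 0.1390 | 0.1051 |
| Out-of-sample | IA | 0.7427 | 0.4344 | 0.9495 | 0.9604 |
| Out-of-sample | R | 0.9618 | 0.9437 | 0.9599 | 0.9599 |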
Figure 1: Errors by the competitive models in Case 1: (a) MAPE, (b) RMSPE, (c) MAE, (d) MSE, (e) IA, and (f) R.
Case 2.
This case uses data collected from the literature [28], shown in Table 5, to illustrate the applicability of the proposed model. As in Case 1, the data are divided into two groups: the first six data points are used for model calibration, and the remaining four are used to assess the prediction performance of the proposed model.
The time response functions of these competitors are calculated as follows:
The predicted values of the original sequence by the four competitive models are tabulated in Table 6, and the relevant error-value metrics are listed in Table 7; for ease of comparison, the values in Table 7 are plotted in Figure 2.
Table 7 and Figure 2 show that, in the simulation period, the indicators of the proposed model, except the correlation coefficient, are better than those of the other benchmarks, namely the GM (1, 1), NGM (1, 1, k), and NEGM (1, 1, t2) models. In the prediction period, the MAPE, RMSPE, MAE, MSE, and IA of the proposed model are 1.6697%, 0.0440%, 0.2208, 0.0758, and 0.9065, respectively, which are better than those of the other models. This indicates that the proposed model has better prediction performance in this experiment.
Table 5: Building's settlement data in Case 2.

| Time | Data |
| --- | --- |
| 1 | 9.28 |
| 25 | 10.71 |
| 53 | 11.31 |
| 83 | 11.64 |
| 116 | 12.00 |
| 147 | 12.23 |
| 177 | 13.05 |
| 237 | 13.16 |
| 269 | 13.61 |
| 355 | 13.94 |
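Table 6: Simulative and predictive values by the competitive models in Case 2.

| Period | Time | Data | GM (1, 1) Value | GM (1, 1) APE (%) | NGM (1, 1, k) Value | NGM (1, 1, k) APE (%) | NEGM (1, 1, t2) Value | NEGM (1, 1, t2) APE (%) | INEGM (1, 1, t2) Value | INEGM (1, 1, t2) APE (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| In-sample | 1 | 9.28 | 9.28 | 0.00 | 9.28 | 0.00 | 9.28 | 0.00 | 9.28 | 0.00 |
| In-sample | 25 | 10.71 | 10.88 | 1.59 | 10.75 | 0.37 | 10.75 | 0.37 | 10.72 | 0.56 |
| In-sample | 53 | 11.31 | 11.19 | 1.06 | 11.27 | 0.35 | 11.30 | 0.09 | 11.30 | 0.11 |
| In-sample | 83 | 11.64 | 11.54 | 0.86 | 11.68 | 0.34 | 11.67 | 0.26 | 11.66 | 0.16 |
| In-sample | 116 | 12.00 | 11.93 | 0.58 | 12.00 | 0.00 | 11.97 | 0.25 | 12.28 | 0.39 |
| In-sample | 147 | 12.23 | 12.34 | 0.90 | 12.22 | 0.08 | 12.24 | 0.08 | 12.28 | 0.39 |
| Out-of-sample | 177 | 13.05 | 12.75 | 2.30 | 12.36 | 5.29 | 12.48 | 5.29 | 12.56 | 3.72 |
| Out-of-sample | 237 | 13.16 | 13.38 | 1.67 | 12.50 | 5.02 | 12.83 | 5.02 | 12.99 | 1.31 |
| Out-of-sample | 269 | 13.61 | 14.04 | 3.16 | 12.58 | 7.57 | 13.18 | 7.57 | 13.42 | 1.40 |
| Out-of-sample | 355 | 13.94 | 14.96 | 7.32 | 12.63 | 9.40 | 13.63 | 9.40 | 13.97 | 0.24 |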
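Table 7: Errors by the competitive models in Case 2.

| Period | Index | GM (1, 1) | NGM (1, 1, k) | NEGM (1, 1, t2) | INEGM (1, 1, t2) |
| --- | --- | --- | --- | --- | --- |
| In-sample | MAPE (%) | 0.9980 | 0.2305 | 0.2103 | 0.1861 |
| In-sample | RMSPE (%) | 0.0111 | 0.0008 | 0.0006 | 0.0005 |
| In-sample | MAE | 0.1140 | 0.0260 | 0.0240 | 0.0221 |
| In-sample | MSE | 0.0141 | 0.0010 | 0.0007 | 0.0007 |
| In-sample | IA | 0.9916 | 0.9994 | 0.9996 | 0.9996 |
| In-sample | R | 0.9940 | 0.9996 | 0.9997 | 0.9997 |
| Out-of-sample | MAPE (%) | 3.6118 | 6.8170 | 3.0647 | 1.6697 |
| Out-of-sample | RMSPE (%) | 0.1790 | 0.4967 | 0.1007 | 0.0440 |
| Out-of-sample | MAE | 0.4925 | 0.9225 | 0.4100 | 0.2208 |
| Out-of-sample | MSE | 0.3409 | 0.9222 | 0.7818 | 0.0758 |
| Out-of-sample | IA | 0.7710 | 0.4260 | 0.7818 | 0.9065 |
| Out-of-sample | R | 0.9823 | 0.9142 | 0.9807 | 0.9815 |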
Figure 2: Errors by the competitive models in Case 2: (a) MAPE, (b) RMSPE, (c) MAE, (d) MSE, (e) IA, and (f) R.
5. Conclusion
Establishing a proper model for nonequidistant time-series forecasting has long been a challenge for scholars. To further the development of nonequidistant grey prediction models, this paper develops a nonequidistant GM (1, 1, t2) model (abbreviated as NEGM (1, 1, t2)) based on the previous literature.
To further improve the prediction accuracy of the NEGM (1, 1, t2) model, its modeling mechanism is analyzed in depth. In the NEGM (1, 1, t2) model, the trapezoidal formula is used to calculate the area of the curved trapezoid, which introduces a discretization error. To reduce this error, we use the Simpson formula to optimize the NEGM (1, 1, t2) model based on the idea of function approximation and establish an improved model (abbreviated as INEGM (1, 1, t2)). To verify its feasibility and validity, the INEGM (1, 1, t2) model and three other grey prediction models are applied to two published cases. The results show that the accuracy of the INEGM (1, 1, t2) model is higher than that of the NEGM (1, 1, t2) model and the two other prediction models, which supports the feasibility and validity of the proposed model.
Although the advantages of the proposed model have been discussed, it has limitations that should be addressed in future work. For example, the model is univariate and may therefore neglect relevant factors in practical applications. Combining the existing nonequidistant grey model with intelligent techniques also merits further research.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by Zhejiang College of Shanghai University of Finance and Economics for scientific research projects at the provincial and above levels and by the Academic Program of the Association for Science and Technology of Jinhua (201920).
References
[1] W. Zhou, X. Wu, S. Ding, and J. Pan, "Application of a novel discrete grey model for forecasting natural gas consumption: a case study of Jiangsu Province in China," vol. 200, Article ID 117443, 2020, doi: 10.1016/j.energy.2020.117443.
[2] W. Xie, W. Z. Wu, C. Liu, and J. Zhao, "Forecasting annual electricity consumption in China by employing a conformable fractional grey model in opposite direction," vol. 202, Article ID 117682, 2020, doi: 10.1016/j.energy.2020.117682.
[3] F. Yao, H. G. Müller, and J. L. Wang, "Functional linear regression analysis for longitudinal data," vol. 33, pp. 2873–2903, 2005.
[4] I. Škrjanc and D. Matko, "Fuzzy predictive functional control in the state space domain," vol. 31, no. 1, pp. 283–297, 2001.
[5] D. W. Hosmer Jr., S. Lemeshow, and R. X. Sturdivant, vol. 398, John Wiley & Sons, Hoboken, NY, USA, 2013.
[6] D. R. Cox and V. Isham, "A simple spatial-temporal model of rainfall," vol. 415, no. 1849, pp. 317–328, 1988.
[7] K. R. Gabriel and J. Neumann, "A Markov chain model for daily rainfall occurrence at Tel Aviv," vol. 88, no. 375, pp. 90–95, 1962, doi: 10.1002/qj.49708837511.
[8] S. Gao, M. Zhou, Y. Wang, J. Cheng, H. Yachi, and J. Wang, "Dendritic neuron model with effective learning algorithms for classification approximation and prediction," vol. 30, no. 2, pp. 601–614, 2018.
[9] T. Zhou, S. Gao, J. Wang, C. Chu, Y. Todo, and Z. Tang, "Financial time series prediction using a dendritic neuron model," vol. 105, pp. 214–224, 2016, doi: 10.1016/j.knosys.2016.05.031.
[10] J. A. K. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," vol. 9, no. 3, pp. 293–300, 1999, doi: 10.1023/a:1018628609742.
[11] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, "Quantum machine learning," vol. 549, no. 7671, pp. 195–202, 2017, doi: 10.1038/nature23474.
[12] C. Zheng, W. Z. Wu, W. Xie, Q. Li, and T. Zhang, "Forecasting the hydroelectricity consumption of China by using a novel unbiased nonlinear grey Bernoulli model," vol. 278, Article ID 123903, 2021, doi: 10.1016/j.jclepro.2020.123903.
[13] W. Xie, W.-Z. Wu, T. Zhang, and Q. Li, "An optimized conformable fractional non-homogeneous gray model and its application," pp. 1–16, 2020, doi: 10.1080/03610918.2020.1788588.
[14] Z.-X. Wang and D.-J. Ye, "Forecasting Chinese carbon emissions from fossil energy consumption using non-linear grey multivariable models," vol. 142, pp. 600–612, 2017, doi: 10.1016/j.jclepro.2016.08.067.
[15] B.-l. Wei, N.-m. Xie, and Y.-j. Yang, "Data-based structure selection for unified discrete grey prediction model," vol. 136, pp. 264–275, 2019, doi: 10.1016/j.eswa.2019.06.053.
[16] X. Ma, X. Mei, W. Wu, X. Wu, and B. Zeng, "A novel fractional time delayed grey model with Grey Wolf Optimizer and its applications in forecasting the natural gas and coal consumption in Chongqing China," vol. 178, pp. 487–507, 2019, doi: 10.1016/j.energy.2019.04.096.
[17] W. Wu, X. Ma, Y. Zhang, W. Li, and Y. Wang, "A novel conformable fractional non-homogeneous grey model for forecasting carbon dioxide emissions of BRICS countries," vol. 707, Article ID 135447, 2020, doi: 10.1016/j.scitotenv.2019.135447.
[18] M. Xie, L. Wu, B. Li, and Z. Li, "A novel hybrid multivariate nonlinear grey model for forecasting the traffic-related emissions," vol. 77, pp. 1242–1254, 2020, doi: 10.1016/j.apm.2019.09.013.
[19] U. Şahin, "Forecasting of Turkey's greenhouse gas emissions using linear and nonlinear rolling metabolic grey model based on optimization," vol. 239, Article ID 118079, 2019.
[20] L. Liu, Y. Chen, and L. Wu, "The damping accumulated grey model and its application," vol. 95, Article ID 105665, 2021, doi: 10.1016/j.cnsns.2020.105665.
[21] D. Ju-Long, "Control problems of grey systems," vol. 1, no. 5, pp. 288–294, 1982, doi: 10.1016/s0167-6911(82)80025-x.
[22] W. Y. Qian, Y. G. Dang, and S. F. Liu, "Grey GM (1 1 tα) model with time power and its application," vol. 32, no. 10, pp. 2247–2252, 2012.
[23] L. Dang and W. Baolei, "Grey forecasting model with polynomial term and its optimization," vol. 29, no. 3, pp. 58–69, 2017.
[24] B. Wei, N. Xie, and A. Hu, "Optimal solution for novel grey polynomial prediction model," vol. 62, pp. 717–727, 2018, doi: 10.1016/j.apm.2018.06.035.
[25] J. Cui, Y. G. Dang, and S. F. Liu, "Novel grey forecasting model and its modeling mechanism," vol. 24, no. 11, pp. 1702–1706, 2009.
[26] F. B. Hildebrand, Courier Corporation, Chelmsford, MA, USA, 1987.
[27] Z. Yonglei and H. Xiufeng, "Improved discrete grey prediction model of ground settlement around excavation," vol. 2, 2013.
[28] L. Xi, D. Song, N. Xu, and P. P. Xiong, "Research on optimization of non-equidistant GM (1 1) model based on the principle of new information priority," vol. 34, pp. 2221–2228, 2019.