The grey dynamic model by convolution integral with the first-order derivative of the 1-AGO data and n series related, abbreviated as GDMC(1,n), performs well in modelling and forecasting grey systems. To improve the modelling accuracy of GDMC(1,n), n interpolation coefficients (taken as unknown parameters) are introduced into the background values of the n variables. Parameter optimization is formulated as a combinatorial optimization problem and solved collectively using the particle swarm optimization algorithm. The optimized result is verified by a case study of the economic output of high-tech industry in China. Comparison of the modelling results of the optimized GDMC(1,n) model with those of the traditional one demonstrates that the optimization algorithm is a good alternative for parameter optimization of the GDMC(1,n) model. The modelling results can assist the government in developing future policies regarding high-tech industry management.
1. Introduction
Grey system theory [1, 2], an emerging interdisciplinary approach to uncertainty, takes as its research object uncertain systems with “small samples and incomplete information”, in which some information is known and some is unknown. The theory extracts valuable information mainly by generating and developing the partially known information, so that the operating behavior of the system and its law of evolution can be described accurately and monitored effectively. Grey system theory can therefore be used to deal with practical problems in which the sample is too small to reveal the statistical or deterministic law of the system behavior.
At present, the most widely used grey model, GM(1,1) [1, 2], a first-order one-variable grey differential equation, is based on this principle. Its modelling does not depend on distributional information in the raw data but on a first-order accumulative generation operation (1-AGO), which makes the sequence as a whole display grey exponential law behavior. On this basis, a first-order grey differential equation is constructed and solved, and the forecast values are then derived by the first-order inverse accumulative generating operation (1-IAGO). Because GM(1,1) does not require a large sample and is easy to build and calculate, it and its improved variants have been widely used [3–6].
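As an illustration of the procedure just described, the following minimal Python sketch fits GM(1,1) by 1-AGO, least squares on mean background values, and 1-IAGO. The function names `gm11_fit` and `gm11_predict` are hypothetical, and the closed-form two-parameter least squares is one standard way to carry out the estimation; it is not code taken from the cited works.

```python
import math


def gm11_fit(x0):
    """Estimate (a, b) of the grey equation x0(k) + a*z(k) = b."""
    # 1-AGO: running sums of the raw series.
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: mean of consecutive 1-AGO points.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    # Ordinary least squares for y = -a*z + b (two unknowns, closed form).
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zk * yk for zk, yk in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b


def gm11_predict(x0, a, b, horizon=0):
    """Fitted values for the sample plus `horizon` forecasts."""
    n = len(x0) + horizon
    # Time response of the whitening equation dx1/dt + a*x1 = b.
    x1_hat = [(x0[0] - b / a) * math.exp(-a * k) + b / a for k in range(n)]
    # 1-IAGO: difference the fitted 1-AGO series back to the raw scale.
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n)]


if __name__ == "__main__":
    series = [100.0 * 1.1 ** k for k in range(6)]  # illustrative data only
    a, b = gm11_fit(series)
    print(a, b)
    print(gm11_predict(series, a, b, horizon=2))
```

For a geometric series the grey equation holds exactly, so the estimated development coefficient depends only on the growth ratio; the fitted values differ slightly from the data because the continuous time response is used for reconstruction.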
The GM(1,n) model [1], in which n-1 relative variables act as associated series alongside the predicted series, is a basic grey multivariable model. The forecasts of a time series X1(0) may be considerably improved by using information from associated series Xi(0), i=2,3,…,n. This is particularly true if changes in X1(0) tend to be anticipated by changes in Xi(0), i=2,3,…,n. Nevertheless, the solution of the whitening differential equation for GM(1,n) is rough [7] and can easily result in large errors in actual forecasting applications. For a long time, therefore, this model has been little used, and the existing research applications have proceeded on the basis of improved models [8–10]. The grey model with convolution integral, GMC(1,n), was proposed by Tien [11] to improve the traditional GM(1,n) model. Theoretically, the modelling values produced by GMC(1,n) are the exact solution of the traditional GM(1,n) model with a grey control parameter (similar to that of the GM(1,1) model) introduced into the model. When the number of variables is n=1, the GMC(1,n) model reduces to GM(1,1). GMC(1,n) greatly improves the forecasting accuracy of the multivariable grey model and has been successfully applied in various fields [11–13]. The multivariable grey model GDMC(1,n) [14] is based on GMC(1,n). Because the higher-frequency components of the 1-AGO data of the associated series were found to be significant for the grey prediction system, the first-order derivatives of the 1-AGO data of the associated series were added to form the GDMC(1,n) model.
In this study, n interpolation coefficients are introduced into the background values of the variables in the GDMC(1,n) model. The optimal coefficients are then found with the particle swarm optimization algorithm by minimizing the modelling error, which enhances the adaptability and forecasting accuracy of GDMC(1,n) on real data. The remainder of this paper is organized as follows. Section 2 explains the modelling methods, including the traditional GDMC(1,n) and the optimization method. In Section 3, the traditional GDMC(1,n) and the optimized GDMC(1,n) are applied to forecast the output of high-tech industry in China. Finally, conclusions are presented in Section 4.
2. Methodology
2.1. The Representation of the Grey Dynamic Model with Convolution Integral
Suppose that pairs of observations X(0)=(X1(0),X2(0),…,Xn(0)) are available at equispaced time intervals consisting of n-1 inputs X2(0),X3(0),…,Xn(0) and an output X1(0) from some dynamic system. The existing GDMC(1,n) modelling process [14] is carried out as follows. Consider the original predicted series
$$X_1^{(0)}=\{X_1^{(0)}(r_p+1),X_1^{(0)}(r_p+2),\ldots,X_1^{(0)}(r_p+r)\} \tag{1}$$
and the original associated series
$$X_i^{(0)}=\{X_i^{(0)}(1),X_i^{(0)}(2),\ldots,X_i^{(0)}(r)\},\quad i=2,3,\ldots,n. \tag{2}$$
Then the 1-AGO data for X1(0),X2(0),…,Xn(0) are given by the following equations, respectively:
$$X_1^{(1)}(r_p+t)=\sum_{j=1}^{t}X_1^{(0)}(r_p+j),\quad t=1,2,\ldots,r, \tag{3}$$
$$X_i^{(1)}(t)=\sum_{j=1}^{t}X_i^{(0)}(j),\quad t=1,2,\ldots,r,\; i=2,3,\ldots,n. \tag{4}$$
The grey forecasting model based on the predicted 1-AGO series
$$X_1^{(1)}=\{X_1^{(1)}(r_p+1),X_1^{(1)}(r_p+2),\ldots,X_1^{(1)}(r_p+r)\} \tag{5}$$
and the associated 1-AGO series
$$X_i^{(1)}=\{X_i^{(1)}(1),X_i^{(1)}(2),\ldots,X_i^{(1)}(r)\},\quad i=2,3,\ldots,n, \tag{6}$$
is given by the following differential equation:
$$\frac{dX_1^{(1)}(r_p+t)}{dt}+b_1X_1^{(1)}(r_p+t)=b_2+\sum_{i=2}^{n}\left[b_{2i-1}\frac{dX_i^{(1)}(t)}{dt}+b_{2i}X_i^{(1)}(t)\right], \tag{7}$$
where b1 and b2 are the developmental coefficient and the grey control parameter, respectively, and (b2i-1,b2i) are the associated coefficients corresponding to the associated series Xi(0), i=2,3,…,n, respectively, r is the number of entries for model building, rp is a period of delay, and b1,b2,…,b2n are model parameters to be estimated.
Equation (7) is called the grey dynamic model by convolution integral with the first-order derivative of the 1-AGO data and n series related, abbreviated as GDMC(1,n)—the “1” represents the first-order derivative of the 1-AGO series of X1(1) and the “n” represents the total of n relative series introduced into the grey differential equation.
The grey derivative for the first-order grey differential equation with 1-AGO is conventionally represented as
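$$\frac{dX_1^{(1)}(r_p+t)}{dt}=\lim_{\Delta t\to 0}\frac{X_1^{(1)}(r_p+t)-X_1^{(1)}(r_p+t-\Delta t)}{\Delta t}. \tag{8}$$
When Δt→1,
$$\frac{dX_1^{(1)}(r_p+t)}{dt}=\frac{\Delta X_1^{(1)}(r_p+t)}{\Delta t}=X_1^{(1)}(r_p+t)-X_1^{(1)}(r_p+t-1)=X_1^{(0)}(r_p+t). \tag{9}$$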
The background value of the grey derivative dX1(1)(rp+t)/dt is taken as the mean of X1(1)(rp+t) and X1(1)(rp+t-1). Those of the associated series Xi(1)(t) are also taken as the mean of Xi(1)(t) and Xi(1)(t-1) for i=2,3,…,n, respectively, for the determination of the model parameters in GDMC(1,n).
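The least-squares solution for the model parameters of the GDMC(1,n) model in (7) for t from 1 to r is given by
$$(b_1,b_2,\ldots,b_{2n})^{T}=(B^{T}B)^{-1}B^{T}Y_N, \tag{10}$$
where
$$B=\begin{bmatrix}
-W_1(r_p+1) & 1 & X_2^{(0)}(1) & W_2(1) & X_3^{(0)}(1) & W_3(1) & \cdots & X_n^{(0)}(1) & W_n(1)\\
-W_1(r_p+2) & 1 & X_2^{(0)}(2) & W_2(2) & X_3^{(0)}(2) & W_3(2) & \cdots & X_n^{(0)}(2) & W_n(2)\\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots\\
-W_1(r_p+r) & 1 & X_2^{(0)}(r) & W_2(r) & X_3^{(0)}(r) & W_3(r) & \cdots & X_n^{(0)}(r) & W_n(r)
\end{bmatrix},$$
$$W_i(t)=\frac{1}{2}\left(X_i^{(1)}(t)+X_i^{(1)}(t-1)\right)\ \text{for } i=1,2,\ldots,n,$$
$$Y_N=\left(X_1^{(0)}(r_p+2),X_1^{(0)}(r_p+3),\ldots,X_1^{(0)}(r_p+r)\right)^{T}. \tag{11}$$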
Collecting the terms on the right-hand side of (7), the discrete function f(t) is obtained as
$$f(t)=b_2+\sum_{i=2}^{n}\left[b_{2i-1}\frac{dX_i^{(1)}(t)}{dt}+b_{2i}X_i^{(1)}(t)\right],\quad t=0,1,\ldots,r+r_f. \tag{12}$$
The 1-AGO modelling values of the predicted series can be derived using the initial condition X^1(1)(rp+1)=X1(1)(rp+1) as
$$\hat{X}_1^{(1)}(r_p+t)=X_1^{(1)}(r_p+1)e^{-b_1(t-1)}+\int_{1}^{t}e^{-b_1(t-\tau)}f(\tau)\,d\tau. \tag{13}$$
The second term of the right-hand side in (13) can be evaluated approximately by the two-point Gauss numerical integration [14] with the linear assumption on f(t) between any two neighboring times; we have
$$\hat{X}_1^{(1)}(r_p+1)=X_1^{(1)}(r_p+1)=X_1^{(0)}(r_p+1), \tag{14}$$
$$\begin{aligned}
\hat{X}_1^{(1)}(r_p+t)={}&X_1^{(0)}(r_p+1)e^{-b_1(t-1)}+u(t-2)\\
&\times\sum_{\tau=2}^{t}\Big\{\tfrac{1}{2}\omega_1e^{-b_1(t+0.5-\tau+0.5\lambda_1)}\big[0.5(f(\tau)+f(\tau-1))+0.5\lambda_1(f(\tau)-f(\tau-1))\big]\\
&\quad+\tfrac{1}{2}\omega_2e^{-b_1(t+0.5-\tau+0.5\lambda_2)}\big[0.5(f(\tau)+f(\tau-1))+0.5\lambda_2(f(\tau)-f(\tau-1))\big]\Big\},\quad t=2,3,\ldots,r+r_f,
\end{aligned} \tag{15}$$
where the coefficients ω1 and ω2 are both equal to 1, the nodes λ1 and λ2 are equal to -1/√3 and 1/√3, respectively, and u(t-2) is the unit step function. Applying 1-IAGO to (15) yields the following modelled values together with the forecasts:
$$\hat{X}_1^{(1)}(r_p+1)=X_1^{(1)}(r_p+1)=X_1^{(0)}(r_p+1),\qquad \hat{X}_1^{(0)}(r_p+t)=\hat{X}_1^{(1)}(r_p+t)-\hat{X}_1^{(1)}(r_p+t-1),\quad t=2,3,\ldots,r+r_f, \tag{16}$$
where rf is the number of entries to be forecasted or indirectly measured.
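The evaluation of the convolution term and the 1-IAGO step can be sketched in Python as follows. This is a minimal sketch written directly from the standard two-point Gauss-Legendre rule on each unit interval (nodes ±1/√3, weights 1) with f assumed linear between neighbouring times; it is not a line-by-line transcription of (15). The function names `gdmc_ago_response` and `iago`, and the convention that `f[t - 1]` stores f(t), are assumptions made for the example.

```python
import math


def gdmc_ago_response(x1_first, b1, f, steps):
    """Approximate the 1-AGO response of dX/dt + b1*X = f(t) for t = 1..steps.

    x1_first is the initial value at t = 1 and f[t - 1] holds f(t).
    """
    nodes = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
    weights = (1.0, 1.0)
    out = [x1_first]
    for t in range(2, steps + 1):
        total = x1_first * math.exp(-b1 * (t - 1))
        for tau in range(2, t + 1):
            f_hi, f_lo = f[tau - 1], f[tau - 2]
            for lam, w in zip(nodes, weights):
                # Quadrature point inside the interval [tau - 1, tau].
                s = tau - 0.5 + 0.5 * lam
                # Linear interpolation of f at s.
                f_s = 0.5 * (f_hi + f_lo) + 0.5 * lam * (f_hi - f_lo)
                # Factor 0.5 maps the rule from [-1, 1] to a unit interval.
                total += 0.5 * w * math.exp(-b1 * (t - s)) * f_s
        out.append(total)
    return out


def iago(x1_hat):
    """1-IAGO: first differences, keeping the first entry unchanged."""
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, len(x1_hat))]


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the case study.
    response = gdmc_ago_response(2.0, 0.1, [5.0] * 6, 6)
    print(iago(response))
```

With a constant f the integral has a closed form, which gives a convenient check of the quadrature.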
Assuming the system parameters in (7) to be constants in the postsampling period, then, by using the postsampling data combined with the data given for the corresponding associated series as a new input series, the corresponding forecasts or values of indirect measurement for the predicted series can be derived.
It is obvious that, when the number of associated series is zero, that is, when n=1, (7) reduces to the grey single variable forecasting model GM(1,1).
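To make this reduction explicit, the short derivation below sets n=1 in (7) and evaluates (13); it uses only the symbols already defined above and is offered as a worked check rather than as part of the original derivation.

```latex
% With n = 1 the sum over associated series is empty, so f(t) = b_2 and
% (7) becomes the GM(1,1) whitening equation.
\frac{dX_1^{(1)}(r_p+t)}{dt} + b_1 X_1^{(1)}(r_p+t) = b_2 .
% Substituting f(\tau) = b_2 into (13):
\hat{X}_1^{(1)}(r_p+t)
  = X_1^{(1)}(r_p+1)\,e^{-b_1(t-1)} + b_2\int_{1}^{t} e^{-b_1(t-\tau)}\,d\tau
  = \left(X_1^{(0)}(r_p+1) - \frac{b_2}{b_1}\right)e^{-b_1(t-1)} + \frac{b_2}{b_1}.
```

The last expression is the familiar time response function of GM(1,1), with b1 as the development coefficient and b2 as the grey control parameter.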
2.4. Evaluation of the Modelling and Forecasting Accuracy
To evaluate the accuracy of the grey models, Tien [14], who proposed the GDMC(1,n) model, applied the root mean squared percentage error (RMSPE) to the prior-sample period (RMSPEPR) and the post-sample period (RMSPEPO). These are defined, respectively, as
$$\mathrm{RMSPEPR}=\sqrt{\frac{1}{r}\sum_{t=r_p+1}^{r_p+r}\frac{\left[\hat{X}_1^{(0)}(t)-X_1^{(0)}(t)\right]^2}{\left[X_1^{(0)}(t)\right]^2}}\times 100\%, \tag{17}$$
$$\mathrm{RMSPEPO}=\sqrt{\frac{1}{r_f}\sum_{t=r_p+r+1}^{r_p+r+r_f}\frac{\left[\hat{X}_1^{(0)}(t)-X_1^{(0)}(t)\right]^2}{\left[X_1^{(0)}(t)\right]^2}}\times 100\%. \tag{18}$$
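As a quick numerical check of this definition, the sketch below computes the RMSPE in Python. The function name `rmspe` is hypothetical; the three data lists are the actual values and the modelling values reported in Table 7 of this paper.

```python
import math


def rmspe(actual, fitted):
    """Root mean squared percentage error, returned in percent."""
    terms = [((fh - a) / a) ** 2 for a, fh in zip(actual, fitted)]
    return math.sqrt(sum(terms) / len(terms)) * 100.0


# Actual output and modelling values for 2003-2010, copied from Table 7.
actual = [20556, 27769, 34367, 41996, 50461, 57087, 60430, 74709]
gdmc = [20556, 24823.94, 32072.61, 39183.89, 46668.65, 53891.31, 58551.09, 67230.30]
ogdmc = [20556, 27308.16, 34970.11, 42101.73, 49775.32, 57470.03, 62549.36, 72731.92]

if __name__ == "__main__":
    print(rmspe(actual, gdmc))
    print(rmspe(actual, ogdmc))
```

Applied to the tabulated modelling values, the function returns figures in line with the RMSPEPR values reported in Tables 2 and 4 (about 7.07% and 1.86%), up to rounding of the tabulated data.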
2.5. Optimization of the GDMC(1,n) Model Based on PSO Algorithm
In this study, to enhance the modelling and forecasting accuracy of the GDMC(1,n) model, n interpolation coefficients ρi∈[0,1] (i=1,2,…,n) are introduced into the background values of each of the variables in GDMC(1,n). Then the optimal ρi, i=1,2,…,n, are calculated with the objective of minimizing RMSPEPR.
Based on the above method, the background value of the grey derivative dX1(1)(rp+t)/dt is taken as the weighted mean of X1(1)(rp+t) and X1(1)(rp+t-1); namely, W1(t)=ρ1X1(1)(rp+t)+(1-ρ1)X1(1)(rp+t-1). Those of the associated series Xi(1)(t) are also taken as the weighted means of Xi(1)(t) and Xi(1)(t-1); that is, Wi(t)=ρiXi(1)(t)+(1-ρi)Xi(1)(t-1) for i=2,3,…,n in the determination of the model parameters in GDMC(1,n). When ρ1=ρ2=⋯=ρn=1/2, the optimized GDMC(1,n) model is reduced to a traditional GDMC(1,n) model.
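The effect of the interpolation coefficients on parameter estimation can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the rows of B are built for t=2,…,r so that every background value has a predecessor and the row count matches Y_N in (11); the normal equations in (10) are solved with a small Gaussian elimination routine; and all function names (`ago`, `background`, `solve`, `estimate_parameters`) are hypothetical.

```python
def ago(x0):
    """1-AGO: running sums of a raw series."""
    out, total = [], 0.0
    for v in x0:
        total += v
        out.append(total)
    return out


def background(x1_ago, rho):
    """W(t) = rho*X(t) + (1 - rho)*X(t-1) for t = 2..r."""
    return [rho * x1_ago[t] + (1.0 - rho) * x1_ago[t - 1]
            for t in range(1, len(x1_ago))]


def solve(a, y):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda idx: abs(m[idx][c]))
        m[c], m[p] = m[p], m[c]
        for idx in range(c + 1, n):
            k = m[idx][c] / m[c][c]
            for j in range(c, n + 1):
                m[idx][j] -= k * m[c][j]
    x = [0.0] * n
    for idx in range(n - 1, -1, -1):
        tail = sum(m[idx][j] * x[j] for j in range(idx + 1, n))
        x[idx] = (m[idx][n] - tail) / m[idx][idx]
    return x


def estimate_parameters(x1_raw, assoc_raw, rho):
    """Least-squares estimate of (b1, ..., b2n) for given coefficients rho."""
    series = [x1_raw] + list(assoc_raw)
    w = [background(ago(s), coef) for s, coef in zip(series, rho)]
    rows = []
    for k in range(len(x1_raw) - 1):  # k = 0 corresponds to t = 2
        row = [-w[0][k], 1.0]
        for i in range(1, len(series)):
            row += [series[i][k + 1], w[i][k]]
        rows.append(row)
    y = x1_raw[1:]
    p = len(rows[0])
    # Normal equations (B^T B) b = B^T Y_N.
    btb = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    bty = [sum(r[i] * yk for r, yk in zip(rows, y)) for i in range(p)]
    return solve(btb, bty)
```

Passing `rho = [0.5] * n` reproduces the mean background values of the traditional model, while other values in [0,1] shift each background value toward the current or the previous 1-AGO point.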
The particle swarm optimization (PSO) algorithm [15] is a population-based heuristic algorithm that simulates social behavior, such as birds flocking toward a promising position, to achieve precise objectives in a multidimensional space. Each particle is a potential solution to the optimization problem. A particle represents a point in an n-dimensional space, and its status is characterized by its position and velocity. The position of particle i can be represented as Xi=(Xi1,Xi2,…,Xin), and its velocity by another n-dimensional vector vi=(vi1,vi2,…,vin). The best previous position of the ith particle is denoted as Pi=(Pi1,Pi2,…,Pin), and the best position in the entire swarm as Pg=(Pg1,Pg2,…,Pgn). To search for the optimal solution, each particle changes its velocity and position according to the following two formulas:
$$v_{ij}(t+1)=\omega v_{ij}(t)+c_1\,\mathrm{rand}(\cdot)\left[P_{ij}(t)-X_{ij}(t)\right]+c_2\,\mathrm{rand}(\cdot)\left[P_{gj}(t)-X_{ij}(t)\right],\qquad X_{ij}(t+1)=X_{ij}(t)+v_{ij}(t+1), \tag{19}$$
where t is the iteration number and ω is the inertia weight, which controls the influence of a particle's previous velocity on its current one. c1 and c2 are positive constants called acceleration factors, which control the maximum step size; typical values are c1=c2=2. rand(·) is a random number between zero and one.
To avoid oscillation (“shock”) as particles approach the global optimal solution, a linearly decreasing strategy can be employed for the inertia weight:
$$\omega=\omega_{\max}-\frac{\omega_{\max}-\omega_{\min}}{t_{\max}}\cdot t, \tag{20}$$
where tmax is the total number of iterations and ωmax and ωmin are the maximum and minimum inertia weights; in general, ωmin=0.4 and ωmax=0.9.
This study uses the PSO algorithm to find the optimal background value coefficients for the GDMC(1,n) model. The specific steps are as follows.
Step 1.
Generate m particles randomly in the n-dimensional space. The position and velocity for the particle i can be represented as
$$\rho_i=(\rho_{i1},\rho_{i2},\ldots,\rho_{in}),\qquad v_i=(v_{i1},v_{i2},\ldots,v_{in}),\qquad i=1,2,\ldots,m. \tag{21}$$
Step 2.
Set the initial position of the particles as ρ1=(0,0,…,0) and generate an initial velocity vector v1 randomly in the interval (0,1.1].
Step 3.
According to (10), substitute ρi into the matrix B and establish the fitness function
$$F(\rho_i)=\sqrt{\frac{1}{r}\sum_{t=r_p+1}^{r_p+r}\frac{\left[\hat{X}_1^{(0)}(t)-X_1^{(0)}(t)\right]^2}{\left[X_1^{(0)}(t)\right]^2}}. \tag{22}$$
Step 4.
Calculate the fitness value for each particle.
Step 5.
Compare each particle’s fitness value with that of the best position found so far; if the current value is better, take the current position as the new best position; otherwise, go to Step 6.
Step 6.
Update each particle’s velocity and position according to the evolution equations in (19).
Step 7.
If stopping criteria are met, show the output and its fitness, which corresponds to the optimal parameters and the RMSPEPR, or else go back to Step 3.
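The steps above can be sketched in Python as a generic PSO loop over [0,1]^n. This is a minimal sketch, not the implementation used in this study: particles start at random positions (as in Step 1), positions are clamped to [0,1] because each ρi lies in that interval, the initial velocity range, swarm size, and iteration count are arbitrary choices, and the name `pso_minimize` is hypothetical. In the setting of this paper, `fitness` would evaluate (22) for a candidate vector of interpolation coefficients.

```python
import random


def pso_minimize(fitness, dim, particles=20, iters=100,
                 w_max=0.9, w_min=0.4, c1=2.0, c2=2.0, seed=0):
    """Minimize fitness over [0, 1]^dim with linearly decreasing inertia."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(particles)]
    vel = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for t in range(1, iters + 1):
        # Inertia weight schedule of (20).
        w = w_max - (w_max - w_min) * t / iters
        for i in range(particles):
            for j in range(dim):
                # Velocity and position update of (19).
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                # Keep each coordinate inside [0, 1].
                pos[i][j] = min(1.0, max(0.0, pos[i][j] + vel[i][j]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    # Toy objective with a known minimum at (0.3, 0.3); illustrative only.
    best, value = pso_minimize(lambda p: sum((x - 0.3) ** 2 for x in p), dim=2)
    print(best, value)
```

The fixed seed makes runs repeatable; in practice several seeds would normally be tried, since PSO is a stochastic search.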
3. Forecasting the Output of High-Tech Industry in China
Forecasting the output of high-tech industry is essential for project development and policy making. Because Chinese high-tech industry is still young, the data on existing economic indices are limited, and it is difficult to apply conventional statistical methods of analysis and forecasting to them. As a result, little research has been conducted on quantitative forecasting of Chinese high-tech industry. In this section, the advantage of the optimized GDMC(1,n) model (abbreviated as OGDMC(1,n)) over the traditional one is demonstrated by a real case study of high-tech industry in China.
3.1. Variables and Data
In economic theory, human resources and capital investment are the crucial factors of the economic system output [16]. The economic output of high-tech industry is taken as the predicted series and is denoted as X1. The annual average employment and investment are adopted as the relative variables of the output of high-tech industry and are denoted as X2 and X3, respectively. The original data relating to the output, average employment, and investment of the high-tech industry in China for 2003–2010 are shown in Table 1. These data are all collected from the China Statistics Yearbook on High Technology Industry (2004–2011).
Table 1: Industrial output value, average employment, and investment of high-tech industry in China.
In the following, the industrial output value X1 is used as the forecasting variable, with the average employment X2 and the investment X3 as relative variables, and the multivariable forecasting models GDMC(1,3) and OGDMC(1,3) are established. The data of X1, X2, and X3 from 2003 to 2010 are employed as the original modelling series to construct the GDMC(1,3) and OGDMC(1,3) models.
Applying the GDMC(1,3) model given by (7)–(16), the values of the parameters n, r, and rp in (7) and the estimates of the model parameters b1, b2, and (b2i-1,b2i) in (10) can be obtained (Table 2). The resulting GDMC(1,3) model from (7) has the form
$$\frac{dX_1^{(1)}(t)}{dt}+0.10822X_1^{(1)}(t)=-5083.77+23.28\frac{dX_2^{(1)}(t)}{dt}+17.78X_2^{(1)}(t)+9.30\frac{dX_3^{(1)}(t)}{dt}-3.20X_3^{(1)}(t),\quad t=1,2,\ldots,8. \tag{23}$$
Table 2: The values of parameters n, r, and rp in (7), the estimates of model parameters b1, b2, and (b2i-1,b2i) in (10), and the value of RMSPEPR in (17).

| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| n | 3 | b3 | 23.28 |
| r | 8 | b4 | 17.78 |
| rp | 0 | b5 | 9.30 |
| b1 | 0.10822 | b6 | −3.20 |
| b2 | −5083.77 | RMSPEPR (%) | 7.07 |
Summing up the right-hand side of (7), the discrete function f(t) in (12) for the GDMC(1,3) model is obtained and tabulated (Table 3). The value of RMSPEPR in (17) is also listed in Table 2.
Applying the OGDMC(1,3) model given by (7)–(16) and the optimized method, the values of the parameters n, r, and rp in (7), the estimates of the model parameters b1, b2, and (b2i-1,b2i) in (10), and the optimized parameters ρ1, ρ2, and ρ3 can be obtained (listed in Table 4). The resulting OGDMC(1,3) model from (7) has the form
$$\frac{dX_1^{(1)}(t)}{dt}+0.22310X_1^{(1)}(t)=-7021.86+34.90\frac{dX_2^{(1)}(t)}{dt}+18.76X_2^{(1)}(t)+8.23\frac{dX_3^{(1)}(t)}{dt}-1.68X_3^{(1)}(t),\quad t=1,2,\ldots,8. \tag{24}$$
Table 4: The values of parameters n, r, rf, and rp in (7), the estimates of model parameters b1, b2, and (b2i-1,b2i) in (10), the optimized parameters ρ1, ρ2, and ρ3, and the values of RMSPEPR in (17).

| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| n | 3 | b5 | 8.23 |
| r | 8 | b6 | −1.68 |
| rp | 0 | ρ1 | 0.17624 |
| b1 | −0.34721 | ρ2 | 0.14669 |
| b2 | −0.45756 | ρ3 | 1 |
| b3 | 34.90 | RMSPEPR (%) | 1.86 |
| b4 | 18.76 | | |
Summing up the right-hand side of (7), the discrete function f(t) in (12) for the OGDMC(1,3) model is obtained and tabulated (Table 5). The value of RMSPEPR in (17) is also listed in Table 4.
Table 5: The discrete function f(t) for OGDMC(1,3) in (12).

| t | f(t) | t | f(t) |
| --- | --- | --- | --- |
| 1 | 27896.61 | 5 | 93126.25 |
| 2 | 42762.85 | 6 | 113838.20 |
| 3 | 57161.09 | 7 | 129932.20 |
| 4 | 74385.06 | 8 | 160408.60 |
3.3. Evaluation of the Grey Forecasting Models
In this study, the four most commonly used measures in grey theory, namely, mean relative error (MRE), absolute degree of grey incidence (ADGI), ratio of standard deviation (RSD), and probability of small error (PSE) [2, 17], are used to evaluate the accuracy of the models involved. The levels of accuracy and their critical values are given in Table 6 [17]. When all the measured values of a forecasting model meet the critical values listed in the table, the model is applicable to prediction. If they fall within the ranges of levels 1 and 2, the model has high forecasting accuracy.
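Table 6: Levels of accuracy and their critical values.

| Level | MRE | ADGI | RSD | PSE |
| --- | --- | --- | --- | --- |
| 1 | 0.01 | 0.90 | 0.35 | 0.95 |
| 2 | 0.05 | 0.80 | 0.50 | 0.80 |
| 3 | 0.10 | 0.70 | 0.65 | 0.70 |
| 4 | 0.20 | 0.60 | 0.80 | 0.60 |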
3.4. Empirical Results and Discussion
The modelling results for China’s high-tech industry using the two grey models described above are shown in Table 7 together with the four accuracy measurements. The results show that the GDMC(1,3) model and the OGDMC(1,3) model fall within levels 2 and 1, respectively. The OGDMC(1,3) model reduces the MRE of the GDMC(1,3) model from 0.0217 to 0.0045, increases the ADGI of the GDMC(1,3) model from 0.94 to 1.00, and also reduces the RSD of the GDMC(1,3) model from 0.12 to 0.06. This indicates that the OGDMC(1,n) model can improve the modelling accuracy of the traditional model GDMC(1,n) significantly.
Table 7: The modelling results obtained using the GDMC(1,3) and OGDMC(1,3) models.

| Time (Year) | Actual value | GDMC(1,3) modelling value | GDMC(1,3) relative error | OGDMC(1,3) modelling value | OGDMC(1,3) relative error |
| --- | --- | --- | --- | --- | --- |
| 1 (2003) | 20556 | 20556 | 0 | 20556 | 0 |
| 2 (2004) | 27769 | 24823.94 | 0.0609 | 27308.16 | 0.0095 |
| 3 (2005) | 34367 | 32072.61 | 0.0277 | 34970.11 | 0.0073 |
| 4 (2006) | 41996 | 39183.89 | 0.0226 | 42101.73 | 0.0008 |
| 5 (2007) | 50461 | 46668.65 | 0.0217 | 49775.32 | 0.0039 |
| 6 (2008) | 57087 | 53891.31 | 0.0138 | 57470.03 | 0.0016 |
| 7 (2009) | 60430 | 58551.09 | 0.0064 | 62549.36 | 0.0072 |
| 8 (2010) | 74709 | 67230.30 | 0.0204 | 72731.92 | 0.0054 |

| Measure | GDMC(1,3) | OGDMC(1,3) |
| --- | --- | --- |
| MRE | 0.0217 | 0.0045 |
| ADGI | 0.94 | 1.00 |
| RSD | 0.12 | 0.06 |
| PSE | 1.00 | 1.00 |
Figure 1 demonstrates the degree of closeness between the modelling results of the models and the real values of the output of high-tech industry in China. Clearly, the model OGDMC(1,3) is closer to the actual data than the traditional one. Moreover, as can be seen from a histogram of the modelling residual errors (Figure 2), the residual error from the OGDMC(1,3) model is notably smaller than that of the GDMC(1,3). The reason for this lies in the fact that, through optimization of the background value, the modelling ability of the GDMC(1,3) model can be further improved in the OGDMC(1,3) model. The model coefficients b3, b4, b5, and b6 can reveal the role and importance of employment and investment and their impact on the output of high-tech industry.
Figure 1: The output value and the modelling curves obtained using the GDMC(1,3) and OGDMC(1,3) models.
Figure 2: The residual errors occurring in the GDMC(1,3) and OGDMC(1,3) models.
4. Conclusions
This study presents a PSO-algorithm-based optimization method for the grey dynamic model by convolution integral with the first-order derivative of the 1-AGO data and n series related. According to the empirical modelling results for the high-tech industry in China, the modelling accuracy of the traditional GDMC(1,n) model can be effectively increased using the background value optimization method proposed in this study. This is important in practice, because government bodies responsible for high-tech industry would otherwise have to base decisions on traditional parameter solutions, which may be constrained by an incorrect local optimal solution.
Because no additional data have yet been published by China’s Statistic Department, the out-of-sample and time-delayed forecasting results of this study have not been fully validated. Therefore, in future work, out-of-sample and time-delayed forecasting using both the GDMC(1,n) and the OGDMC(1,n) models needs to be examined once more data are released. In addition, the grey forecasting model established in this study considers only two basic input factors, capital and labor, and ignores the technical level. The main reason is that sample data are rather limited given the short history of China’s high-tech industries. Moreover, it is challenging to estimate the technology level of high-tech industries from small samples. Further research is needed to address this problem.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors are grateful to the editors and the anonymous reviewers for their insightful comments and suggestions. The authors also thank the National Natural Science Foundation of China (Grant no. 71101132), the Philosophy and Social Science Foundation of Zhejiang Province, China (Grant no. 13ZJQN029YB), and the Academic Climbing Project for Young and Middle-Aged Leading Academic in the Universities of Zhejiang Province, China (Grant no. PD2013275), for financially supporting this study.
References
[1] Deng, J. L.
[2] Liu, S. F., Lin, Y.
[3] Li, D., Chang, C., Chen, C., Chen, W. Forecasting short-term electricity consumption using the adaptive grey-based approach - an Asian case.
[4] Wang, C.-H., Hsu, L.-C. Using genetic algorithms grey theory to forecast high technology industrial output.
[5] Lu, I. J., Lewis, C., Lin, S. J. The forecast of motor vehicle, energy demand and CO2 emission from Taiwan's road transportation sector.
[6] Chang, C. J., Li, D. C., Dai, W. L., Chen, C. C. Utilizing an adaptive grey model for short-term time series forecasting: a case study of wafer-level packaging.
[7] Tien, T.-L. A research on the grey prediction model GM(1,n).
[8] Hsu, L. Forecasting the output of integrated circuit industry using genetic algorithm based multivariable grey optimization models.
[9] Hsu, L., Wang, C. Forecasting integrated circuit output using multivariate grey model and grey relational analysis.
[10] Luo, Y. X., Wu, X., Li, M., Cai, A. H. Grey dynamic model GM(1,N) for the relationship of cost and variability.
[11] Tien, T. The indirect measurement of tensile strength of material by the grey prediction model GMC(1,n).
[12] Wu, W.-Y., Chen, S.-P. A prediction method using the grey model GMC(1,n) combined with the grey relational analysis a case study on internet access population forecast.
[13] Su, C., Chen, C., Liu, W., Lai, H. Estimation for inner surface geometry of furnace wall using inverse process combined with grey prediction model.
[14] Tien, T. The indirect measurement of tensile strength for a higher temperature by the new model IGDMC(1,n).
[15] Gao, S., Yang, J. Y.
[16] Li, Z. N.
[17] Li, D., Chang, C., Chen, W., Chen, C. An extended grey forecasting model for omnidirectional forecasting considering data gap difference.