Application of Generalized Space-Time Autoregressive Model on GDP Data in West European Countries

This paper applies the generalized space-time autoregressive (GSTAR) model to GDP data from West European countries. A preliminary model is identified from the sample space-time ACF and space-time PACF, and the model parameters are estimated by the least squares method. Forecast performance is evaluated using the mean of squared forecast errors (MSFE) based on the last ten actual observations. The preliminary model is found to be GSTAR(2;1,1). For comparison, the estimation and the forecast evaluation are also carried out for the GSTAR(1;1) model, which has fewer parameters. The results show that the MSFE of GSTAR(2;1,1) is smaller than that of GSTAR(1;1). However, the t-test shows that the difference in performance is not statistically significant. Thus, by the parsimony principle, the GSTAR(1;1) model might be preferred as a forecasting model.


Introduction
Space-time data are frequently found in many areas of research, for example, monthly tea production from several plants, yearly housing prices in capital cities, and yearly per capita gross domestic product (GDP) of several countries in a region. The generalized space-time autoregressive model of order $(p; \lambda_1, \ldots, \lambda_p)$, abbreviated GSTAR($p; \lambda_1, \ldots, \lambda_p$), is a space-time model characterized by autoregressive terms lagged up to order $p$ in time and up to orders $\lambda_1, \ldots, \lambda_p$ in space [1].
The term generalization refers to the model parameters. When the diagonal elements of each parameter matrix are all equal, so that each matrix reduces to a scalar parameter, the GSTAR model coincides with the space-time autoregressive (STAR) model of Martin and Oeppen [2] and Pfeifer and Deutsch [3]. The notion of generalization has also been used by Terzi [4], who generalized STAR models in a different context: he extended STAR(1;1) by adding contemporaneous spatial correlation while retaining the scalar parameters.
When $p = 1$ and $\lambda_1 = 1$, GSTAR(1;1) is called the first-order GSTAR model. Under this model, the current observation at a location depends only on the immediate past observations recorded at that location and at its nearest neighbours [5]. Order (1;1) is the simplest natural assumption for forecasting future observations at a given location. The STAR(1;1) model is another simple space-time model with the same interpretation. However, in contrast to the generalized model, its parameters for each spatial order are assumed to be the same for all locations, although there is no a priori justification for this assumption [1]. The parameters of the GSTAR model can be estimated by the method of least squares. This method has been used to model monthly oil production [5] and monthly tea production [1]. However, neither of these papers describes how to assess the model on the basis of forecasting performance, which is an important step when the purpose of modelling is to build a forecasting model. In this paper, we attempt to incorporate the idea of optimizing goodness of fit into model selection.
This paper is organized as follows. In Section 2, the GSTAR model and the least squares estimation of its parameters are reviewed, with examples given for GSTAR(1;1) and GSTAR(2;1,1). To illustrate the properties of the estimator for finite sample sizes, a simulation study is discussed in Section 3. In the last section, the model is applied to the per capita GDP ratio data of West European countries for the period 1956-1996. One-step-ahead forecasting is performed with each model for the period 1997-2006. As a comparative performance measure, we use the empirical mean of squared forecast errors (MSFE), where the forecast error is defined as the difference between the actual value and the forecast value.

The Model
Let $Z(t) = (Z_1(t), \ldots, Z_N(t))'$ be an $N$-dimensional vector process with zero mean, where $N$ is a fixed positive integer. A GSTAR($p; \lambda_1, \ldots, \lambda_p$) process is a space-time process $Z(t)$ which satisfies
\[ Z(t) = \sum_{k=1}^{p} \sum_{\ell=0}^{\lambda_k} \Phi_{k\ell} W^{(\ell)} Z(t-k) + e(t), \tag{2.1} \]
where $p$ is the autoregressive order, $\lambda_k$ is the spatial order of the $k$th autoregressive term, and $W^{(\ell)} = (w_{ij}^{(\ell)})$ is an $N \times N$ matrix of spatial weights for spatial order $\ell$, which has a zero diagonal and rows that each sum to one; the matrix $W^{(0)}$ is defined as the identity matrix $I$. The $N \times N$ matrix $\Phi_{k\ell} = \operatorname{diag}(\phi_{k\ell}^{(1)}, \ldots, \phi_{k\ell}^{(N)})$ is the diagonal parameter matrix for temporal lag $k$ and spatial lag $\ell$. Finally, $e(t)$ is the error vector at time $t$, which is assumed to be independent and normally distributed with zero mean and constant variance.
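Example 2.2 (LS estimation for the GSTAR(2;1,1) model). For order $p = 2$, $\lambda_1 = 1$, and $\lambda_2 = 1$, model (2.1) is called GSTAR(2;1,1), which can be written as
\[ Z(t) = \Phi_{10} Z(t-1) + \Phi_{11} W Z(t-1) + \Phi_{20} Z(t-2) + \Phi_{21} W Z(t-2) + e(t). \tag{2.6} \]
By rearranging the components of $Z(t)$ for $t = 2, 3, \ldots, T$, by location and then by time, model (2.6) can also be expressed as a linear model
\[ Z = X\Phi + e, \]
where $Z$ and $e$ are defined as in Example 2.1, $X = \operatorname{diag}(X_1, \ldots, X_N)$ with
\[ X_i = \begin{pmatrix} Z_i(1) & V_i(1) & Z_i(0) & V_i(0) \\ \vdots & \vdots & \vdots & \vdots \\ Z_i(T-1) & V_i(T-1) & Z_i(T-2) & V_i(T-2) \end{pmatrix}, \qquad V_i(t) = \sum_{j \neq i} w_{ij} Z_j(t), \]
and $\Phi = (\phi_{10}^{(1)}, \phi_{11}^{(1)}, \phi_{20}^{(1)}, \phi_{21}^{(1)}, \ldots, \phi_{10}^{(N)}, \phi_{11}^{(N)}, \phi_{20}^{(N)}, \phi_{21}^{(N)})'$.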

Journal of Probability and Statistics
Then, the LS estimator $\hat{\Phi}$ of the parameter $\Phi$ is a solution of
\[ X'X\hat{\Phi} = X'Z. \tag{2.9} \]
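Because $X$ is block diagonal, the normal equations separate into one small regression per location. The following is a minimal sketch of that idea for a GSTAR($p$;1,...,1) model, not the authors' implementation; the function names `spatial_lag`, `solve`, and `gstar_ls` and the two-site demonstration data are assumptions made for illustration.

```python
def spatial_lag(W, z):
    """Return the vector W z for a weight matrix W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, z)) for row in W]


def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * p for a, p in zip(M[r], M[c])]
    return [M[r][n] / M[r][r] for r in range(n)]


def gstar_ls(Z, W, p):
    """LS estimates for GSTAR(p;1,...,1), solved location by location.

    Z is a list of T zero-mean observation vectors of length N.
    Entry i of the result is [phi_10, phi_11, ..., phi_p0, phi_p1] for site i.
    """
    T, N = len(Z), len(Z[0])
    WZ = [spatial_lag(W, z) for z in Z]
    estimates = []
    for i in range(N):
        # Regressors at time t: own value and neighbour average at lags 1..p.
        X = [[v for k in range(1, p + 1) for v in (Z[t - k][i], WZ[t - k][i])]
             for t in range(p, T)]
        y = [Z[t][i] for t in range(p, T)]
        m = 2 * p
        XtX = [[sum(r[a] * r[b] for r in X) for b in range(m)] for a in range(m)]
        Xty = [sum(r[a] * yt for r, yt in zip(X, y)) for a in range(m)]
        estimates.append(solve(XtX, Xty))
    return estimates


# Hypothetical noise-free GSTAR(1;1) data on two sites, used only as a check.
W = [[0.0, 1.0], [1.0, 0.0]]
phi10, phi11 = [0.5, 0.2], [0.3, -0.4]
Z = [[1.0, 2.0]]
for _ in range(5):
    wz = spatial_lag(W, Z[-1])
    Z.append([phi10[i] * Z[-1][i] + phi11[i] * wz[i] for i in range(2)])
estimates = gstar_ls(Z, W, p=1)
```

With noise-free data the estimates should reproduce the generating parameters up to rounding, which is a convenient sanity check before fitting real data.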

Simulation Study
Under the stationarity assumption, the LS estimator of the GSTAR parameters is consistent [1]. To gain insight into the properties of the LS estimator in finite samples, we performed a Monte Carlo simulation with 1000 replications. Artificial data were generated from the GSTAR(1;1) model, where the error $e(t)$ is normally distributed with mean 0 and covariance matrix $I_4$.
The spatial weight matrix and the model parameters used in the simulation were, respectively, [...]. The results are presented in Table 1. On average, the parameter estimates approach the true parameters as $T$ increases, while the empirical MSE decreases towards 0. This indicates that the LS estimator behaves consistently in the simulation. In general, LS estimation gives fair estimates even for moderate sample sizes such as $T = 40$ and $T = 50$.
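A simulation of this kind can be sketched as follows. The weight matrix `W` and the parameter values `PHI10` and `PHI11` below are hypothetical stand-ins, since the values used in the paper are not available here; they satisfy $|\phi_{10}^{(i)}| + |\phi_{11}^{(i)}| < 1$, which is sufficient for stationarity. The sketch uses 200 replications rather than 1000 to keep the run short, and reports results for the first site only.

```python
import random

# Hypothetical four-site setup (assumed values, not those of the paper).
W = [[0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0]]
PHI10 = [0.2, 0.3, 0.4, 0.1]
PHI11 = [0.3, 0.2, 0.1, 0.4]


def simulate(T, rng, burn_in=50):
    """Generate T observations from GSTAR(1;1) with N(0, I) errors."""
    n = len(W)
    z = [0.0] * n
    out = []
    for t in range(T + burn_in):
        wz = [sum(w * v for w, v in zip(row, z)) for row in W]
        z = [PHI10[i] * z[i] + PHI11[i] * wz[i] + rng.gauss(0.0, 1.0)
             for i in range(n)]
        if t >= burn_in:
            out.append(z)
    return out


def ls_site(Z, i):
    """LS estimates (phi_10, phi_11) for site i via the 2x2 normal equations."""
    sxx = sxv = svv = sxy = svy = 0.0
    for t in range(1, len(Z)):
        x = Z[t - 1][i]
        v = sum(w * u for w, u in zip(W[i], Z[t - 1]))
        y = Z[t][i]
        sxx += x * x
        sxv += x * v
        svv += v * v
        sxy += x * y
        svy += v * y
    det = sxx * svv - sxv * sxv
    return ((svv * sxy - sxv * svy) / det, (sxx * svy - sxv * sxy) / det)


def monte_carlo(T, reps=200, seed=0):
    """Average estimate and empirical MSE of phi_10 at site 0."""
    rng = random.Random(seed)
    est = [ls_site(simulate(T, rng), 0)[0] for _ in range(reps)]
    mean = sum(est) / reps
    mse = sum((e - PHI10[0]) ** 2 for e in est) / reps
    return mean, mse


results = {T: monte_carlo(T) for T in (40, 200)}
```

Comparing `results[40]` with `results[200]` gives a rough picture of how the empirical MSE shrinks as the series lengthens.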

Application of GSTAR Model to the Ratio of Per Capita GDP Data
In this section, we apply the GSTAR model to the ratio of per capita GDP data for 16 West European countries. The data were kindly provided by Maddison [6] of the Faculty of Economics, University of Groningen, the Netherlands, who passed away on April 24, 2010. The per capita GDP of a country is the country's GDP divided by its population, and the per capita GDP of West Europe as a whole is the sum of the GDPs of the West European countries divided by the total population of West Europe. The per capita GDP ratio of a country is its per capita GDP divided by the per capita GDP of West Europe as a whole, multiplied by 100. Hence, the per capita GDP ratio is expressed as a percentage. In the data analysis in the following subsections, we use the per capita GDP ratio data, which for simplicity will be called the GDP ratio data.
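The ratio defined above can be illustrated with a short computation. This is only a sketch of the definition; the function name `gdp_ratio` and the two-country figures are made up for illustration and are not taken from the data set.

```python
def gdp_ratio(gdp, population):
    """Per capita GDP of each country as a percentage of the regional figure."""
    regional_per_capita = sum(gdp.values()) / sum(population.values())
    return {c: 100.0 * (gdp[c] / population[c]) / regional_per_capita
            for c in gdp}


# Hypothetical two-country region (illustrative numbers only).
gdp = {"A": 200.0, "B": 100.0}
population = {"A": 10.0, "B": 10.0}
ratios = gdp_ratio(gdp, population)
```

Here the regional per capita GDP is 300/20 = 15, so country A (20 per head) sits above 100% and country B (10 per head) sits below it.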

Dataset and Preliminary Model Building
The dataset is the GDP ratio data for the period 1955-2006. It consists of 52 observations of 16-dimensional vectors. For the purpose of forecasting, the data were split into a training set and a test set. The training set consists of the first 42 observations, which are used for model building, and the test set consists of the last ten observations, which are used to compare forecasting performance.
Clearly, the 42 observations in the training data, depicted in Figure 1(a), are not stationary, although they tend to converge to values between 50% and 150%. Therefore, to obtain zero-mean stationary data, first differencing and centring must be applied.
Suppose $Y_i(t)$ represents the per capita GDP ratio of country $i$, $i = 1, 2, \ldots, 16$, at time $t$, $t = -1, 0, 1, \ldots, 50$. The first difference of $Y_i(t)$, denoted by $D_i(t)$, is
\[ D_i(t) = Y_i(t) - Y_i(t-1), \]
and the centred differenced data are
\[ Z_i(t) = D_i(t) - \bar{D}_i, \]
where $\bar{D}_i$ is the average of $D_i(t)$ at location $i$, for $i = 1, 2, \ldots, 16$. A plot of the 16 transformed series is displayed in Figure 1(b); they appear to behave like stationary series.
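The two transformations, first differencing followed by centring, can be sketched for a single series as below. The function name `difference_and_centre` and the sample values are assumptions for illustration, and the mean is taken over whatever series is passed in.

```python
def difference_and_centre(y):
    """First-difference a series, then subtract the mean of the differences."""
    d = [b - a for a, b in zip(y, y[1:])]
    mean_d = sum(d) / len(d)
    return [v - mean_d for v in d]


# Hypothetical ratio series for one country (illustrative values).
y = [100.0, 102.0, 101.0, 105.0]
z = difference_and_centre(y)
```

The differences are 2, -1, and 4 with mean 5/3, so the centred series sums to zero by construction.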
As preparation for model building, we set the notation used in model (2.1). The length of the time period is $T = 40$, and the number of sites is $N = 16$. Time period $t = 0$ corresponds to 1956, $t = 1$ to 1957, $t = 2$ to 1958, and so on. Since time is one-dimensional, the time lags can be ordered naturally as $k = 1, 2, \ldots$. The spatial order, on the other hand, may be defined in different ways, because a two-dimensional space has no natural ordering of the kind a one-dimensional space has. For the GDP data, there are 16 countries. The first order and second order neighbours of the countries are given in Table 2. These are defined on the basis of the geographical location of the countries. The first order neighbours of a country are those that share a common border with it or lie within a short distance along a sea route. The second order neighbours of a country are the union of the first order neighbours of its first order neighbours, excluding the country itself. Suppose $n_i^{(\ell)}$, $\ell = 1, 2$, represents the number of countries that are $\ell$th order neighbours of country $i$. The spatial weight of order $\ell$ between countries $i$ and $j$ can be defined as
\[ w_{ij}^{(\ell)} = \begin{cases} 1/n_i^{(\ell)}, & \text{if $j$ is an $\ell$th order neighbour of $i$}, \\ 0, & \text{if $j$ is not an $\ell$th order neighbour of $i$}. \end{cases} \tag{4.3} \]
For example, from Table 2 it can be seen that Austria has 3 first order neighbours (Germany, Italy, and Switzerland) and 6 second order neighbours (Belgium, Denmark, France, Greece, the Netherlands, and Sweden). Then, the first order spatial weight between Austria and each of its first order neighbours is $1/3$.
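A uniform weight matrix of this kind, in which each neighbour of a site receives the reciprocal of the number of neighbours, can be built from neighbour lists as sketched below. The four-site map and the function name `weight_matrix` are hypothetical and stand in for Table 2.

```python
def weight_matrix(sites, neighbours):
    """Row-normalised uniform weights: 1/n_i for each neighbour of site i."""
    index = {s: k for k, s in enumerate(sites)}
    W = [[0.0] * len(sites) for _ in sites]
    for s, nbrs in neighbours.items():
        for n in nbrs:
            W[index[s]][index[n]] = 1.0 / len(nbrs)
    return W


# Hypothetical neighbourhood map (not the countries of Table 2).
sites = ["A", "B", "C", "D"]
neighbours = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A", "D"], "D": ["A", "C"]}
W1 = weight_matrix(sites, neighbours)
```

Site A has three neighbours, so each receives weight 1/3, mirroring the three-neighbour case described in the text; every row sums to one and the diagonal stays at zero.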

GSTAR Model Building for the GDP Data
After transforming the data and constructing the spatial weight matrices, the next step is to identify the model order. In STAR model building, Pfeifer and Deutsch [3] used the sample space-time autocorrelation function (STACF) and space-time partial autocorrelation function (STPACF) as the primary tools for model identification. A model of order $(p; \lambda_1, \ldots, \lambda_p)$ is characterized by autocorrelations that tail off and partial autocorrelations that cut off after time lag $p$ and spatial lag $\lambda_p$. Since the STAR model is a special case of the GSTAR model, these autocorrelation functions are adopted to identify the order of the GSTAR model. Figure 2 presents the sample STACF and STPACF of the differenced data $Z(t)$ up to time lag 10 and for spatial lags 0, 1, and 2. The pattern does not clearly suggest an exact order. However, since the sample STPACF cuts off after time lag 2 and spatial lag 1, order (2;1,1) can be considered a candidate space-time order. In addition, the space-time partial autocorrelations also cut off at time lag 5 and spatial lag 2, but fitting a GSTAR model of that order would involve too many parameters, since at least 160 parameters would have to be estimated.
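For readers who want to compute a sample STACF themselves, the sketch below uses one common form of the estimator, in which the space-time autocovariance between spatial lags $\ell$ and $k$ at time lag $s$ is the average of $[W^{(\ell)}Z(t)]'[W^{(k)}Z(t+s)]$ over time and sites. Normalisation conventions vary between references, so this should be read as an assumption rather than as the exact estimator behind Figure 2; the function name `stacf` and the toy data are hypothetical.

```python
def stacf(Z, W_list, l, s):
    """Sample space-time autocorrelation at spatial lag l and time lag s.

    Z is a list of T zero-mean vectors; W_list[k] is the weight matrix of
    spatial order k, with W_list[0] the identity matrix.
    """
    T, N = len(Z), len(Z[0])

    def lagged(Wm, z):
        return [sum(w * v for w, v in zip(row, z)) for row in Wm]

    def gamma(a, b, lag):
        total = 0.0
        for t in range(T - lag):
            u = lagged(W_list[a], Z[t])
            v = lagged(W_list[b], Z[t + lag])
            total += sum(x * y for x, y in zip(u, v))
        return total / (N * (T - lag))

    return gamma(l, 0, s) / (gamma(l, l, 0) * gamma(0, 0, 0)) ** 0.5


# Toy two-site data (illustrative values only).
identity = [[1.0, 0.0], [0.0, 1.0]]
W1 = [[0.0, 1.0], [1.0, 0.0]]
Z = [[1.0, -0.5], [0.4, 0.3], [-0.2, 0.6], [-0.7, -0.1], [0.3, 0.2]]
rho_00_0 = stacf(Z, [identity, W1], 0, 0)
rho_10_1 = stacf(Z, [identity, W1], 1, 1)
```

By construction the value at spatial lag 0 and time lag 0 equals one, which serves as a quick check of the normalisation.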

The 64 parameters and the error variance of this model were estimated by the least squares method, and the results are presented in Table 3. The empirical MSE of the GSTAR(2;1,1) model is 2.735, computed from $16 \times 39 = 624$ values. The residual histogram and normal probability plot in Figures 4(a) and 4(b) show that the GSTAR(2;1,1) residuals are approximately normally distributed with zero mean and constant variance. Meanwhile, the plot of residuals versus fitted values in Figure 4(c) shows no significant pattern in the residuals. From Figure 3 we can observe that the STACF of the residuals is not significantly different from zero except at time lags 5 and 10 and spatial lag 2. The exceptions at time lags 5 and 10 suggest that a seasonal difference of order 5 might be useful in further model analysis. Seasonal model analysis, however, is beyond the scope of this research and is not discussed here. For forecasting purposes, the estimated parameters in Table 3 can be used to compute the $h$-step-ahead forecast at forecast origin $T$, defined by
\[ \hat{Z}_T(h) = \hat{\Phi}_{10}\hat{Z}_T(h-1) + \hat{\Phi}_{11} W \hat{Z}_T(h-1) + \hat{\Phi}_{20}\hat{Z}_T(h-2) + \hat{\Phi}_{21} W \hat{Z}_T(h-2), \tag{4.5} \]
where $\hat{Z}_T(h) = Z(T+h)$ for $h \le 0$.
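The recursion behind an $h$-step-ahead forecast for GSTAR(2;1,1) can be sketched as follows: each new forecast is built from the two most recent values, which are observations for the first steps and earlier forecasts afterwards. The function name `forecast_gstar211`, the dictionary layout of `phi`, and all numbers are assumptions for illustration.

```python
def spatial_lag(W, z):
    """Return the vector W z for a weight matrix W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, z)) for row in W]


def forecast_gstar211(z_prev, z_last, W, phi, h):
    """Forecasts for horizons 1..h from origin T.

    z_prev and z_last are Z(T-1) and Z(T); phi[(k, l)] is the list of
    site-specific estimates for time lag k and spatial lag l.
    """
    path = [list(z_prev), list(z_last)]
    n = len(z_last)
    for _ in range(h):
        z2, z1 = path[-2], path[-1]
        wz1, wz2 = spatial_lag(W, z1), spatial_lag(W, z2)
        path.append([phi[1, 0][i] * z1[i] + phi[1, 1][i] * wz1[i]
                     + phi[2, 0][i] * z2[i] + phi[2, 1][i] * wz2[i]
                     for i in range(n)])
    return path[2:]


# Hypothetical two-site example (illustrative values only).
W = [[0.0, 1.0], [1.0, 0.0]]
phi = {(1, 0): [0.5, 0.5], (1, 1): [0.1, 0.1],
       (2, 0): [0.2, 0.2], (2, 1): [0.0, 0.0]}
forecasts = forecast_gstar211([1.0, 1.0], [4.0, 8.0], W, phi, h=2)
```

For these inputs the first forecast is (3.0, 4.6) and the second is (2.76, 4.2), each obtained by feeding the previous forecast back into the recursion.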

GSTAR(1;1) Modeling
The GSTAR(1;1) model is the simplest model in the GSTAR($p; \lambda_1, \ldots, \lambda_p$) class defined in (2.1), since it is characterized only by autoregressive terms of order one in both time and space. From (2.1) we can write GSTAR(1;1) as
\[ Z_i(t) = \phi_{10}^{(i)} Z_i(t-1) + \phi_{11}^{(i)} \sum_{j \neq i} w_{ij} Z_j(t-1) + e_i(t), \qquad i = 1, \ldots, N, \]
or in matrix notation
\[ Z(t) = \Phi_{10} Z(t-1) + \Phi_{11} W Z(t-1) + e(t). \]
The model has the interpretation that the current observation at a location depends only on the immediate past observations recorded at that location and at the nearest locations. The GSTAR(1;1) model has $2N$ parameters, since the parameters are allowed to differ between locations. For the per capita GDP data, the least squares estimates for the GSTAR(1;1) model are presented in Table 4. The distribution of the model residuals, presented in Figure 5, leads to the conclusion that the residuals are approximately normally distributed with zero mean and nearly constant variance. In general, the STACF plot in Figure 6 shows that the residuals are uncorrelated; a significant autocorrelation is found only at the last time lag.
For the GDP data, the GSTAR(2;1,1) model has 64 parameters, while GSTAR(1;1) has 32. Hence, it is no surprise that the empirical MSE of GSTAR(2;1,1) is smaller than that of GSTAR(1;1). However, although GSTAR(2;1,1) has twice as many parameters as GSTAR(1;1), its empirical MSE is only 0.257 lower.
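The parameter counts quoted for the two models follow from one parameter per site for every combination of time lag $k$ and spatial lag $0, \ldots, \lambda_k$. A small helper makes the arithmetic explicit; the function name `n_parameters` is an assumption for illustration.

```python
def n_parameters(n_sites, spatial_orders):
    """Number of autoregressive parameters in GSTAR(p; lambda_1..lambda_p)."""
    return n_sites * sum(lam + 1 for lam in spatial_orders)


count_211 = n_parameters(16, [1, 1])         # GSTAR(2;1,1)
count_11 = n_parameters(16, [1])             # GSTAR(1;1)
count_lag5 = n_parameters(16, [1, 1, 1, 1, 1])  # time order 5, spatial order 1
```

With 16 sites this gives 64 and 32 parameters for the two fitted models, and 160 for a model of time order 5 with first order spatial lags throughout, which is the lower bound mentioned for the larger candidate.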
The distribution of the MSE difference across countries is presented as a bubble plot in Figure 7. The centre of each bubble is the value of the MSE difference, and its radius is the absolute value of that difference. A bubble below the zero axis indicates that the GSTAR(2;1,1) MSE for the associated country is smaller than the GSTAR(1;1) MSE. From the figure, we can see that the decrease in MSE is mostly contributed by five countries: Belgium, Denmark, ...

Comparison of Forecast Performance
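For the purpose of building a forecasting model, it is useful to also consider forecast performance. Therefore, in this section we examine the one-step-ahead forecasting performance of each candidate model using the last ten actual data points of the per capita GDP ratio data set. The result of this section is expected to serve as a supplementary reference in finding the most parsimonious space-time model for the per capita GDP ratio. In (4.5) we defined the $h$-step-ahead forecast for the GSTAR(2;1,1) model. For the GSTAR(1;1) model, it can be computed by
\[ \hat{Z}_T(h) = \hat{\Phi}_{10}\hat{Z}_T(h-1) + \hat{\Phi}_{11} W \hat{Z}_T(h-1), \tag{6.1} \]
where $\hat{\Phi}_{10} = \operatorname{diag}(\hat{\phi}_{10}^{(1)}, \ldots, \hat{\phi}_{10}^{(N)})$, and for $h \le 0$,
\[ \hat{Z}_T(h) = Z(T+h). \tag{6.2} \]
The one-step-ahead forecast $\hat{Z}_{T+j-1}(1)$ of each data point can be calculated using (4.5), (6.1), and (6.2) with $h = 1$. Note that the forecast for each time $j = 1, 2, \ldots, 10$ is calculated without updating the parameter estimates or the data average. The one-step-ahead forecast for the GSTAR(2;1,1) model is
\[ \hat{Z}_{T+j-1}(1) = \hat{\Phi}_{10} Z(T+j-1) + \hat{\Phi}_{11} W Z(T+j-1) + \hat{\Phi}_{20} Z(T+j-2) + \hat{\Phi}_{21} W Z(T+j-2), \]
where $j = 1, 2, \ldots, 10$. Meanwhile, the one-step-ahead forecast for the GSTAR(1;1) model is
\[ \hat{Z}_{T+j-1}(1) = \hat{\Phi}_{10} Z(T+j-1) + \hat{\Phi}_{11} W Z(T+j-1). \]
To compare forecast performance, we use the mean of squared forecast errors (MSFE), which is defined for location $i$ by
\[ \mathrm{MSFE}_i = \frac{1}{10} \sum_{j=1}^{10} e_{i,T+j-1}^2(1), \]
where
\[ e_{i,T+j-1}(1) = Z_i(T+j) - \hat{Z}_{i,T+j-1}(1) \]
and $\hat{Z}_{i,T+j-1}(1)$ is the $i$th element of $\hat{Z}_{T+j-1}(1)$.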
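The one-step-ahead evaluation can be sketched for GSTAR(1;1) as below: each test observation is forecast from the actual value one step earlier with fixed parameter estimates, and the squared errors are averaged per site. The function name `msfe_gstar11` and the tiny data set are hypothetical.

```python
def spatial_lag(W, z):
    """Return the vector W z for a weight matrix W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, z)) for row in W]


def msfe_gstar11(Z, W, phi10, phi11, n_test):
    """Per-site MSFE of one-step-ahead forecasts over the last n_test rows.

    Parameters are held fixed; each forecast uses the actual previous row.
    """
    T, N = len(Z), len(Z[0])
    squared = [0.0] * N
    for t in range(T - n_test, T):
        prev = Z[t - 1]
        wz = spatial_lag(W, prev)
        for i in range(N):
            forecast = phi10[i] * prev[i] + phi11[i] * wz[i]
            squared[i] += (Z[t][i] - forecast) ** 2
    return [s / n_test for s in squared]


# Hypothetical two-site series with two test points (illustrative values).
W = [[0.0, 1.0], [1.0, 0.0]]
Z = [[1.0, 2.0], [1.0, 1.0], [2.0, 0.0]]
msfe = msfe_gstar11(Z, W, phi10=[0.5, 0.5], phi11=[0.5, 0.0], n_test=2)
```

In this toy case the squared errors are 0.25 and 1 at the first site and 0 and 0.25 at the second, giving MSFE values of 0.625 and 0.125.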
To measure how close the performance of two models is, we also calculate the MSFE difference between model $M_1$ and model $M_2$ for country $i$, which is defined by
\[ C_i = \mathrm{MSFE}_i(M_1) - \mathrm{MSFE}_i(M_2). \tag{6.7} \]
The performance of models $M_1$ and $M_2$ is said to be close if the value of $C_i$ is near zero. A negative value of $C_i$ indicates that, for location $i$, model $M_1$ performs better than model $M_2$. Since the residuals are approximately normally distributed, the paired t-test can also be applied to test the null hypothesis that models $M_1$ and $M_2$ have the same forecast performance.
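The per-country differences and the paired t statistic can be computed as sketched below, assuming the pairing is across countries. Only the statistic is computed here, since the standard library has no t distribution; a P value could be obtained, for instance, with `scipy.stats.ttest_rel` if SciPy is available. The function names and the four illustrative MSFE values are assumptions.

```python
import math
import statistics


def msfe_difference(msfe_m1, msfe_m2):
    """C_i = MSFE_i(M1) - MSFE_i(M2) for each location i."""
    return [a - b for a, b in zip(msfe_m1, msfe_m2)]


def paired_t_statistic(diffs):
    """t statistic for H0: mean difference is zero (len(diffs) - 1 d.f.)."""
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Hypothetical MSFE values for four locations (illustrative only).
msfe_m1 = [1.0, 2.0, 3.0, 4.0]
msfe_m2 = [1.5, 1.5, 3.5, 3.0]
C = msfe_difference(msfe_m1, msfe_m2)
t_stat = paired_t_statistic(C)
```

Here the differences are -0.5, 0.5, -0.5, and 1.0, with mean 0.125 and sample standard deviation 0.75, so the statistic is 0.125 / 0.375 = 1/3 on 3 degrees of freedom.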
Suppose $M_1$ and $M_2$ in (6.7) are the GSTAR(2;1,1) and GSTAR(1;1) models, respectively. The MSFE differences are displayed as a bubble plot in Figure 8. The centre of each bubble is the value of the MSFE difference, and its radius is the absolute value of that difference. A bubble below the zero axis indicates that the MSFE of model $M_1$ is smaller than that of model $M_2$. From the figure, it can be seen that the MSFE differences of most countries are scattered around the zero axis. The high P value (0.9858) of the paired t-test is evidence that the forecast performance of the two models is not significantly different.
The behaviour of the differences at each time point is displayed in Figure 9. From the figure, it can be observed that most points for Ireland (site 8) are negative, which means that in Ireland the GSTAR(2;1,1) model performs better than GSTAR(1;1). Conversely, most points for Norway (site 11) and Switzerland (site 15) are positive when the difference is computed for the pair GSTAR(2;1,1) and GSTAR(1;1). On the other hand, Figure 7 also shows that for Norway there is a point with a large jump in the difference, which means that the worst performance of GSTAR(2;1,1) occurred at this point. In addition, the numbers of negative and positive values are almost the same. This suggests that the forecast performance of the GSTAR(2;1,1) and GSTAR(1;1) models is similar.

Figure 1: (a) The ratio of per capita GDP in 16 West European countries for the period 1955-2006 and (b) plot of the centred first-differenced data.

Figure 4: Histogram (a) and normal probability plot (b) of the GSTAR(2;1,1) residuals, and plot of residuals versus fitted values (c).

Figure 7: Bubble plot of the difference in empirical MSE between GSTAR(2;1,1) and GSTAR(1;1). A bubble below the zero axis indicates that GSTAR(2;1,1) has a smaller MSE than the GSTAR(1;1) model.

Figure 8: Bubble plot of the difference in mean of squared forecast errors (MSFE) between GSTAR(2;1,1) and GSTAR(1;1). A bubble below the zero axis indicates that GSTAR(2;1,1) has a smaller MSFE than the GSTAR(1;1) model.
... can be estimated by least squares (LS) estimation. The estimation procedure and the asymptotic properties of the LS estimators are discussed extensively in Borovkova et al. [1]. The following examples illustrate how to find the LS estimator for the GSTAR(1;1) and GSTAR(2;1,1) models, respectively.

Table 1: Average LS estimates for data generated from the GSTAR(1;1) model with 1000 replications for various sample sizes T, compared with the theoretical parameters.

Table 2: Geographical neighbourhoods of order 1 and order 2.

Table 4: Least squares estimates of the GSTAR(1;1) parameters.

... Ireland, Sweden, and Switzerland. Although the MSE values appear different, the paired t-test on the results gives a P value of 0.678. This means that the MSE values of the two models are not significantly different.