Posterior Analysis of State Space Model with Spherical Symmetricity

The present work investigates the state space model with nonnormal disturbances when the departure from normality is observed only with respect to kurtosis and the disturbances continue to follow a symmetric family of distributions. A spherically symmetric distribution is used to approximate the behavior of symmetric nonnormal disturbances for discrete time series. The conditional posterior densities of the parameters involved are derived and then used in a Gibbs sampler scheme to estimate the marginal posterior densities. The state space model with disturbances following the multivariate t-distribution, which is a particular case of the spherically symmetric distribution, is discussed.


Introduction
In the analysis of time series, dynamic, or state space models, the distribution of disturbances is often assumed to be normal. However, this assumption may not be satisfied in many practical applications of these models. For example, deviations from normality have been observed in stock price data, where the behavior of disturbances appears to be leptokurtic; see, for example, Kendall [1] and Fama [2]. Praetz [3] suggested the scaled t-distribution to explain the leptokurtic behavior of stock price data. If the tails of the distribution are heavier than those of the normal distribution, the multivariate t-distribution may provide a more realistic model for the disturbances. Zellner [4] revealed that dependent but uncorrelated responses could be analyzed using a Student-t model. Fraser [5] demonstrated the robustness of the Student-t family relative to the normal distribution on the basis of empirical studies and further suggested that a Student-t model holds equally well for normal model responses. Another advantage of the Student-t family is that it handles outliers easily; see Sutradhar and Ali [6]. The Student-t model is also advocated by Haq and Khan [7], Khan and Haq [8], and Chib et al. [9]. Prucha and Kelejian [10] observed that normal model based analysis is heavily influenced by extreme observations and by deviations from assumptions, and that it ignores sample information beyond the first two moments. Robustness of Bayes prediction under the multivariate t-distribution, which is a particular member of the spherically symmetric family, has also been studied by Chib et al. [11]. Jammalamadaka et al. [12] have, however, shown that predictive inferences are completely uninfluenced by departures from the normality assumption in the direction of the spherically symmetric family. Properties of the spherical distribution have been studied by Kelker [13] and Chmielewski [14].
In carrying out the Bayesian analysis of dynamic models involving a multidimensional parameter vector, it is often difficult to derive explicit expressions for the marginal posterior distributions of the parameters. In such circumstances, Markov chain Monte Carlo (MCMC) techniques such as the Gibbs sampler provide a useful way of estimating the marginal posterior densities. Klock and Van Dijk [15], Naylor and Smith [16], and Geweke [17][18][19] applied numerical MCMC to the Bayesian analysis of various time series models. Tiwari et al. [20] considered a general state space or linear dynamic model with normal disturbances and carried out the Bayesian analysis of the model using the Gibbs sampler. In the present exposition, the assumption of normality of disturbances is relaxed, and the disturbances are assumed to follow a distribution belonging to the family of spherically symmetric distributions.

Model and Assumptions
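The following form of the general state space model (or linear dynamic model) is considered:
$$y_t = x_t'\beta_t + u_t, \qquad \beta_t = F_t\beta_{t-1} + \nu_t, \qquad t = 1, \ldots, T, \tag{1}$$
where $y_t$ is the observation on the response variable, $x_t$ is a $p \times 1$ regression vector, $\beta_t$ is a $p \times 1$ vector of parameters generated by the first-order autoregressive model in (1), and $F_t$ is a known $p \times p$ transition matrix. The initial value of the parameter at $t = 0$, that is, $\beta_0$, is assumed to be unknown, with its uncertainty expressed by the normal distribution $\beta_0 \mid \tau \sim N_p(\mu_0, \tau^{-1}\Sigma_0)$, $\tau \sim \mathcal{G}(n_0/2, d_0/2)$, where $\mathcal{G}(n_0/2, d_0/2)$ denotes a Gamma distribution with density function $f(\tau) = \frac{(d_0/2)^{n_0/2}}{\Gamma(n_0/2)}\,\tau^{n_0/2-1} e^{-\tau(d_0/2)}$ and $\Sigma_0$ is a $p \times p$ positive definite, symmetric matrix. The hyperparameters $\mu_0$, $n_0$, $d_0$, and $\Sigma_0$ are assumed to be known. The errors $u_t$ and $\nu_s$ are assumed to be independent for all $t$ and $s$. The joint distribution of $u = (u_1, \ldots, u_T)'$ is assumed to be spherically symmetric, with joint density function given by
$$f(u) = \int \left(\frac{\tau}{2\pi\sigma_0^2\psi^2(\lambda)}\right)^{T/2} \exp\left\{-\frac{\tau\, u'u}{2\sigma_0^2\psi^2(\lambda)}\right\} dG(\lambda),$$
where $\sigma_0 > 0$ is a scale constant, $\psi(\lambda)$ is a positive measurable function, and the pdf and cdf of $\lambda$ are represented by $g(\lambda)$ and $G(\lambda)$, respectively. We further assume that the errors $\nu_t$ follow the normal distribution with mean 0 and variance $\tau^{-1}\Sigma$, where $\Sigma$ is a known $p \times p$ positive definite symmetric matrix. Thus,
$$\beta_t \mid \beta_{t-1}, \tau \sim N_p(F_t\beta_{t-1}, \tau^{-1}\Sigma), \qquad t = 1, \ldots, T.$$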
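The model above can be illustrated by simulating from it. The following is a minimal sketch, not the author's code: it assumes the multivariate t special case written as a scale mixture with $\psi^2(\lambda) = \mathrm{df}/\lambda$ and $\lambda \sim \chi^2_{\mathrm{df}}$, a time-invariant transition matrix, and hypothetical names (`simulate_state_space`, `F`, `sigma0`, `df`) chosen only for illustration.

```python
import numpy as np

def simulate_state_space(T, x, F, Sigma, beta0, tau=1.0, sigma0=1.0, df=5.0, seed=0):
    """Simulate y_t = x_t' beta_t + u_t, beta_t = F beta_{t-1} + nu_t, t = 1..T.

    x is a (T, p) array of regressors, F a (p, p) transition matrix (assumed
    constant over time here), Sigma the (p, p) state error scale matrix.
    """
    rng = np.random.default_rng(seed)
    p = len(beta0)
    # A single mixing draw shared by all u_t makes (u_1, ..., u_T) jointly
    # multivariate t (assumed parametrization: psi^2(lam) = df / lam).
    lam = rng.chisquare(df)
    obs_sd = sigma0 * np.sqrt(df / (lam * tau))
    state_cov = Sigma / tau
    beta = np.empty((T + 1, p))
    beta[0] = beta0
    y = np.empty(T)
    for t in range(1, T + 1):
        nu_t = rng.multivariate_normal(np.zeros(p), state_cov)
        beta[t] = F @ beta[t - 1] + nu_t
        y[t - 1] = x[t - 1] @ beta[t] + obs_sd * rng.standard_normal()
    return y, beta, lam
```

With `x = np.ones((T, 1))` and `F = np.eye(1)` this reduces to a random walk plus noise model with a single state.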

Conditional Posterior Densities
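The density function of the observation vector $\tilde{y} = (y_1, \ldots, y_T)'$ is given by
$$f(\tilde{y} \mid \{\beta_t\}_{t=1}^{T}, \tau) = \int \left(\frac{\tau}{2\pi\sigma_0^2\psi^2(\lambda)}\right)^{T/2} \exp\left\{-\frac{\tau}{2\sigma_0^2\psi^2(\lambda)}\sum_{t=1}^{T}(y_t - x_t'\beta_t)^2\right\} dG(\lambda).$$
In other words, the conditional distribution of $\tilde{y}$ given $\lambda$ is normal with mean $(x_1'\beta_1, \ldots, x_T'\beta_T)'$ and covariance matrix $(\sigma_0^2/\tau)\psi^2(\lambda) I_T$. Since it is difficult to derive explicit expressions for the marginal posterior densities of the parameters of the model, the Gibbs sampler scheme can be used to estimate them. The conditional posterior densities of the different parameters derived in this section can be utilised in the Gibbs sampler scheme. The joint distribution of $(\tilde{y}, \{\beta_t\}_{t=0}^{T}, \tau)$ given $\lambda$ is obtained as
$$f(\tilde{y}, \{\beta_t\}_{t=0}^{T}, \tau \mid \lambda) \propto \tau^{(T + (T+1)p + n_0)/2 - 1}\, \psi(\lambda)^{-T} \exp\left\{-\frac{\tau}{2}\left[\frac{\sum_{t=1}^{T}(y_t - x_t'\beta_t)^2}{\sigma_0^2\psi^2(\lambda)} + \sum_{t=1}^{T}(\beta_t - F_t\beta_{t-1})'\Sigma^{-1}(\beta_t - F_t\beta_{t-1}) + (\beta_0 - \mu_0)'\Sigma_0^{-1}(\beta_0 - \mu_0) + d_0\right]\right\}. \tag{4}$$
Let us write [...]
Theorem 1. The conditional posterior density of $\beta_t$ is given by [...]
Proof. For $t = 0$, using expression (4), we get [...] or [...], where $c_0(\cdot)$ is the normalizing constant. For obtaining the value of $c_0(\cdot)$, we have [...]. For $t = j$, where $j = 1, 2, \ldots, T-1$, using expression (4), we get [...], where the normalizing constant $c_j(\cdot)$ is obtained as [...]. For $t = T$, utilizing expression (4), we get [...] or [...], where $c_T(\cdot)$ is the normalizing constant given by [...]
Remark 2. It is interesting to observe from the above theorem that the conditional posterior density of $\beta_0$ given $(\{\beta_t\}_{t=1}^{T}, \tau)$ depends only upon $(\beta_1, \tau)$, and the conditional posterior density of $\beta_j$ ($j = 1, \ldots, T-1$) given $(\{\beta_t\}_{t=0}^{T}\ (t \neq j), \tau)$ depends only upon $(\beta_{j-1}, \beta_{j+1}, \tau)$, whereas the conditional posterior density of $\beta_T$ given $(\{\beta_t\}_{t=0}^{T-1}, \tau)$ depends only upon $(\beta_{T-1}, \tau)$. Therefore, the conditional posterior densities for $\beta_t$ ($t = 0, 1, \ldots, T$) depend upon the (one or two) values adjacent to $\beta_t$ and upon $\tau$.
Now, we define [...]
Theorem 3. The conditional posterior of $\tau$ is obtained as [...]
Proof. Following (4), we observe that [...] The normalizing constant is given by [...]
Remark 4. The above derived expressions for the conditional posterior densities of $\beta_0$, $\beta_t$ ($t = 1, \ldots, T-1$), $\beta_T$, and $\tau$ involve integration with respect to $G(\lambda)$. Sometimes it may not be possible to obtain these integrals in closed form. In such a situation, we consider $\lambda$ as a parameter with prior density function $g(\lambda)$ and derive the conditional posterior density functions of $\beta_t$ given $(\beta^{(t)}, \tau, \lambda)$, $\tau$ given $(\{\beta_t\}_{t=0}^{T}, \lambda)$, and $\lambda$ given $(\{\beta_t\}_{t=0}^{T}, \tau)$. The posterior conditional density functions of $\beta_t$ and $\tau$ can be obtained directly as follows: [...] Hence, the conditional posterior density of $\beta_t$ is normal, $N_p(B_t(\lambda) b_t(\lambda), \tau^{-1}B_t(\lambda))$, and the conditional posterior density of $\tau$ is gamma.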
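For reference, one way to write out these full conditionals explicitly is sketched below. This is a derivation under assumed notation, not the paper's own display: it takes the model as $y_t = x_t'\beta_t + u_t$, $\beta_t = F_t\beta_{t-1} + \nu_t$, with $u \mid \lambda \sim N(0, (\sigma_0^2/\tau)\psi^2(\lambda) I_T)$, $\nu_t \sim N_p(0, \tau^{-1}\Sigma)$, $\beta_0 \mid \tau \sim N_p(\mu_0, \tau^{-1}\Sigma_0)$, and a Gamma prior on $\tau$ with shape $n_0/2$ and rate $d_0/2$. The symbols $F_t$, $\mu_0$, $n_0$, $d_0$, $S$, and $Q$ are labels introduced here for illustration.

```latex
% Completing the square in beta_t for each t (all symbols as assumed in the lead-in).
\begin{align*}
B_0^{-1} &= \Sigma_0^{-1} + F_1'\Sigma^{-1}F_1, &
b_0 &= \Sigma_0^{-1}\mu_0 + F_1'\Sigma^{-1}\beta_1, \\
B_t^{-1}(\lambda) &= \frac{x_t x_t'}{\sigma_0^2\psi^2(\lambda)} + \Sigma^{-1} + F_{t+1}'\Sigma^{-1}F_{t+1}, &
b_t(\lambda) &= \frac{x_t y_t}{\sigma_0^2\psi^2(\lambda)} + \Sigma^{-1}F_t\beta_{t-1} + F_{t+1}'\Sigma^{-1}\beta_{t+1},
\quad 1 \le t \le T-1, \\
B_T^{-1}(\lambda) &= \frac{x_T x_T'}{\sigma_0^2\psi^2(\lambda)} + \Sigma^{-1}, &
b_T(\lambda) &= \frac{x_T y_T}{\sigma_0^2\psi^2(\lambda)} + \Sigma^{-1}F_T\beta_{T-1}.
\end{align*}
% Shorthand for the two sums of squares that enter the Gamma rate.
\begin{align*}
S &= \sum_{t=1}^{T}(y_t - x_t'\beta_t)^2, \qquad
Q = \sum_{t=1}^{T}(\beta_t - F_t\beta_{t-1})'\Sigma^{-1}(\beta_t - F_t\beta_{t-1})
    + (\beta_0 - \mu_0)'\Sigma_0^{-1}(\beta_0 - \mu_0), \\
\beta_t \mid \text{rest} &\sim N_p\bigl(B_t b_t,\ \tau^{-1}B_t\bigr), \\
\tau \mid \text{rest} &\sim \mathrm{Gamma}\left(\text{shape} = \frac{T + (T+1)p + n_0}{2},\
   \text{rate} = \frac{1}{2}\left[d_0 + \frac{S}{\sigma_0^2\psi^2(\lambda)} + Q\right]\right).
\end{align*}
```

The $t = 0$ case does not involve $\lambda$ because $\beta_0$ does not enter any observation equation; each $\beta_t$ depends only on its neighbours, in agreement with Remark 2.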
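Theorem 5. The conditional posterior density of $\lambda$ given $(\{\beta_t\}_{t=0}^{T}, \tau)$ is given by
$$f(\lambda \mid \{\beta_t\}_{t=0}^{T}, \tau, \tilde{y}) = \frac{\psi(\lambda)^{-T} \exp\left\{-\dfrac{\tau S}{2\sigma_0^2\psi^2(\lambda)}\right\} g(\lambda)}{\displaystyle\int \psi(\lambda)^{-T} \exp\left\{-\dfrac{\tau S}{2\sigma_0^2\psi^2(\lambda)}\right\} g(\lambda)\, d\lambda}, \qquad S = \sum_{t=1}^{T}(y_t - x_t'\beta_t)^2.$$
Proof. The joint density function of $(\tilde{y}, \{\beta_t\}_{t=0}^{T}, \tau, \lambda)$ is given by
$$f(\tilde{y}, \{\beta_t\}_{t=0}^{T}, \tau, \lambda) = f(\tilde{y}, \{\beta_t\}_{t=0}^{T}, \tau \mid \lambda)\, g(\lambda),$$
where the first factor is the joint distribution in expression (4). Hence, the conditional posterior density function of $\lambda$ is given as
$$f(\lambda \mid \{\beta_t\}_{t=0}^{T}, \tau, \tilde{y}) = \frac{f(\tilde{y}, \{\beta_t\}_{t=0}^{T}, \tau \mid \lambda)\, g(\lambda)}{\displaystyle\int f(\tilde{y}, \{\beta_t\}_{t=0}^{T}, \tau \mid \lambda)\, g(\lambda)\, d\lambda}.$$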
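For the multivariate t-distribution, [...] Further, the conditional posterior density of $\beta_t$ given $\{\beta^{(t)}, \tau, \lambda\}$ is $N_p(B_t(\lambda) b_t(\lambda), \tau^{-1}B_t(\lambda))$ and the conditional posterior density of $\tau$ is gamma. Further, [...] Hence, the conditional posterior density of $\lambda$ is given by [...]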
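To make the sampling scheme concrete, here is a minimal single-chain Gibbs sampler for the scalar random walk plus noise case ($p = 1$, regressor and transition both equal to 1). It is a sketch under stated assumptions rather than the paper's implementation: the multivariate t case is parametrized as $\psi^2(\lambda) = \mathrm{df}/\lambda$ with $\lambda \sim \chi^2_{\mathrm{df}}$, the Gamma prior on $\tau$ has shape `n0/2` and rate `d0/2`, one long chain replaces the parallel replications described in the text, and all function and argument names are hypothetical.

```python
import numpy as np

def gibbs_local_level_t(y, n_iter=2000, burn_in=500, df=1.0, mu0=1.0, Sigma0=1.0,
                        Sigma=1.0, sigma0=1.0, n0=2.0, d0=2.0, seed=0):
    """Gibbs sampler for y_t = beta_t + u_t, beta_t = beta_{t-1} + nu_t."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    T = y.size
    beta = np.concatenate(([y[0]], y))   # beta_0, ..., beta_T started at the data
    tau, lam = 1.0, df
    keep_beta, keep_tau, keep_lam = [], [], []
    for it in range(n_iter):
        w = lam / (df * sigma0**2)       # 1 / (sigma0^2 psi^2(lam))
        for t in range(T + 1):
            if t == 0:                   # prior term, no observation at t = 0
                prec, b = 1.0 / Sigma0, mu0 / Sigma0
            else:                        # observation y_t and link to beta_{t-1}
                prec = w + 1.0 / Sigma
                b = w * y[t - 1] + beta[t - 1] / Sigma
            if t < T:                    # link to beta_{t+1}
                prec += 1.0 / Sigma
                b += beta[t + 1] / Sigma
            beta[t] = rng.normal(b / prec, np.sqrt(1.0 / (tau * prec)))
        S = np.sum((y - beta[1:]) ** 2)
        Q = np.sum(np.diff(beta) ** 2) / Sigma + (beta[0] - mu0) ** 2 / Sigma0
        # numpy's gamma takes (shape, scale) with scale = 1 / rate
        tau = rng.gamma((2 * T + 1 + n0) / 2.0, 2.0 / (d0 + w * S + Q))
        lam = rng.gamma((T + df) / 2.0, 2.0 / (1.0 + tau * S / (df * sigma0**2)))
        if it >= burn_in:
            keep_beta.append(beta.copy())
            keep_tau.append(tau)
            keep_lam.append(lam)
    return {"beta": np.array(keep_beta), "tau": np.array(keep_tau),
            "lam": np.array(keep_lam)}
```

Fitted values can then be taken as the column means of `out["beta"][:, 1:]`; the default hyperparameter values mirror those reported in the empirical section but are otherwise arbitrary.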
Recall that the estimated posterior density of $\beta_t$ depends on the two values adjacent to $\beta_t$. Hence, an estimate of $\beta_{t|T}$ is the mean of the estimated density, obtained as $\hat{\beta}_{t|T} = (1/m)\sum_{i=1}^{m} B_t^{(i)} b_t^{(i)}$, where $m$ is the number of iterations during implementation of the Gibbs sampler and $B_t^{(i)}$ and $b_t^{(i)}$ are the values of $B_t(\lambda)$ and $b_t(\lambda)$ based on $(\{\beta_j^{(i)}\}_{i=1}^{m},\ j = t-1, t+1,\ \tau^{(i)})$. Then the fitted value of $y_t$ for our model is $\hat{y}_t = x_t'\hat{\beta}_{t|T}$. If one is interested in one step ahead prediction of the unknown value $y_{T+1}$, the Gibbs algorithm can be employed with the following modification. We consider $\{\beta_{T+1}, y_{T+1}\}$ as an additional parameter in the Gibbs sampler. The conditional density of $y_{T+1}$ is given by [...] The Gibbs sampler can then be run to find the following estimates of the marginal density of $\beta_{T+1}$ and the predictive density of $y_{T+1}$, respectively: [...] Under squared error loss, $y_{T+1}$ can be predicted by the mean of the estimated predictive density $\hat{f}(y_{T+1} \mid \tilde{y})$, which is given by [...] Alternatively, the one step ahead prediction via the Gibbs sample $\{y_{T+1,k}^{(i)}\}_{i=1}^{m}$ is given by $\hat{y}_{T+1} = (1/m)\sum_{i=1}^{m} y_{T+1,k}^{(i)}$. The above procedure can be adopted for predicting $y_{T+h}$, $h \geq 1$.
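The sample-average form of the one step ahead prediction can be sketched as follows for the scalar case. This is an illustration only: it assumes Gibbs draws of $\beta_T$, $\tau$, and $\lambda$ are already available as arrays, that the future disturbance shares the same mixing variable $\lambda$ with $\psi^2(\lambda) = \mathrm{df}/\lambda$, and the function name and arguments are hypothetical.

```python
import numpy as np

def predict_one_step(beta_T, tau, lam, x_next=1.0, F=1.0, Sigma=1.0,
                     sigma0=1.0, df=1.0, seed=0):
    """Draw (beta_{T+1}, y_{T+1}) for each Gibbs draw and average the y draws."""
    rng = np.random.default_rng(seed)
    beta_T = np.asarray(beta_T, dtype=float)
    tau = np.asarray(tau, dtype=float)
    lam = np.asarray(lam, dtype=float)
    # State step: beta_{T+1} | beta_T, tau ~ N(F beta_T, Sigma / tau)
    beta_next = rng.normal(F * beta_T, np.sqrt(Sigma / tau))
    # Observation step: y_{T+1} | beta_{T+1}, tau, lam ~ N(x beta_{T+1}, sigma0^2 psi^2(lam) / tau)
    obs_sd = sigma0 * np.sqrt(df / (lam * tau))
    y_next = rng.normal(x_next * beta_next, obs_sd)
    # Point prediction under squared error loss: mean of the sampled y_{T+1}
    return y_next.mean(), y_next
```

Predictions further ahead could be obtained by feeding the sampled `beta_next` values back in as the new `beta_T`, which is one reading of how generated predictions are subsequently treated as observations.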

Empirical Analysis
Stock price changes over a 3-month period are studied by analyzing records from the Yahoo Finance database for the stock of Amazon.com, an electronic commerce company. The data consist of daily opening prices (latest intraday delayed quotes), in US dollars, taken between 6 July 2015 and 2 October 2015. Early diagnostics suggest that the data follow the random walk plus noise model, which is an elementary form of the state space model involving just one state variable. Specification of $x_t'$ and $F_t$ in the model and knowledge of the variances and covariances of the disturbance terms are prerequisites for implementing state space models. Based on external facts and a preliminary examination of the data, we assumed that $p = 1$, $\mu_0 = 1$, $n_0 = 2$, $d_0 = 2$, $\sigma_0 = 1$ (a restrictive assumption), $\Sigma_0 = \Sigma = 1$ (restrictive assumption), $\lambda \sim \chi^2(1)$, $x_t' = 1$, and $F_t = 1$. Model fitting is based on the first 60 observations. The subsequent 5 observations are excluded from the fitting and are used in the predictive analysis. We then executed the Gibbs sampler for 40 iterations, using 1500 parallel replications per iteration. A few initial values in the chain, called the burn-in values, were discarded to allow the influence of the initial values to dissipate. The Kalman filter is fitted using $\hat{y}_t = x_t'\left((1/m)\sum_{i=1}^{m}\beta_{t,k}^{(i)}\right)$, that is, $x_t'$ times the average of the sampled values of $\beta_t$. The longitudinal profile of the original 60 data points, superimposed with the graph of the Kalman filter model (via the Gibbs sampler), is plotted in Figure 1.
For one step ahead prediction of the observations $y_{T+h}$, $h = 1, \ldots, 5$, we added $\{\beta_{T+h}, y_{T+h}\}$ to the Gibbs sampler as described in Section 5. Sample values were generated for $y_{T+h}$, $h = 1, \ldots, 5$, for several parallel chains. A total of 1500 replications were conducted in order to obtain smoother estimates. It is observed that the first predicted value is closest to the true value (Table 1). This may be attributed to the fact that the Kalman filter for prediction is based only on the last observed value. Subsequent chains are based on the generated predictions, which are then treated as observations. It appears that the Gibbs sampler provides a good one step ahead prediction only. This impediment may be overcome by using more powerful computing resources that allow longer iterative chains and more replications. Additional simulation would make it possible to assess how long the chain must run before it stabilizes for predicting future values and to identify whether more distant predictions could be improved by increasing the number of iterations and/or the length of the chains, since it is known that the predictive power of the Gibbs sampler is limited by the choices of the number of iterations, the number of replications (parallel chains), and the initial values of the chains [21]. However, the present empirical analysis demonstrates the ability of the Gibbs sampler to provide a good fit to the Kalman filter model for stock price data modeled using a state space model with a single state, although the Gibbs algorithm fails to provide satisfactory predictions of future observations for the volatile stock price data.

Conclusion
The present work focuses on the discrete time formulation of state space models, which are a particular form of stochastic coefficient models and have an inherent facility for adapting to model departures through the state equations. The objective of developing time series or econometric models is to evaluate the probability mechanism generating a time series or a set of time series. To understand global aggregates or macroquantities such as GNP, CPI, the level of employment, industrial production, the price or volume of a crop, the amount and types of organic pollutants, or the functioning of any other system, one needs to build a theoretical model that enables the practitioner to take action or to test the theories generating the observations. Development of the present theoretical framework from the Bayesian perspective is expected to provide new insight and an enhanced dimension in predictive analysis for applied researchers and practitioners.

Figure 1: Opening price data (represented by hollow circles) and data fitted through the Kalman filter model (represented by *). Original data points are joined by straight lines.

Table 1: Prediction of the opening stock prices up to 5 steps ahead of the recorded data.