Stochastic differential equations (SDEs) are an important mathematical tool for describing complex systems in which noise plays an important role. SDE models have been widely used to study the dynamic properties of various nonlinear systems in biology, engineering, finance, and economics, as well as the physical sciences. Since an SDE can generate an unlimited number of trajectories, it is difficult to estimate model parameters from experimental observations, which may represent only one trajectory of the stochastic model. Although substantial research efforts have been made to develop effective methods, it is still a challenge to infer unknown parameters in SDE models from observations that may have large variations. Using an interest rate model as a test problem, in this work we use Bayesian inference and the Markov chain Monte Carlo method to estimate unknown parameters in SDE models.
1. Introduction
A stochastic differential equation (SDE) can be defined as a deterministic differential equation perturbed by random disturbances that are not necessarily small. SDEs have gained popularity in recent years for their ability to model systems that are subject to fluctuations and have been widely used in a variety of disciplines including engineering, environmetrics, physical sciences, population dynamics, biology, and medicine. In particular, SDEs are important instruments in modern finance theory and have been used to model the behavior of many key financial variables such as the instantaneous short-term interest rate, asset prices, asset returns, and their volatility. Consequently, the estimation of the parameters of SDEs from discretely-sampled data has received substantial attention in the financial econometrics literature, particularly in the last ten years [1, 2].
However, parameter estimation in nonlinear SDEs driven by Wiener processes, when only discrete observations are available, is an inherently difficult problem because theoretically an unlimited number of solutions exist for an SDE. Although parameter estimation in deterministic systems is a relatively well-studied subject (see, e.g., Beck and Arnold [3]), the estimation of the parameters of stochastic systems remains a challenge [4]. One of the reasons for this is that obtaining the solution of a set of SDEs is computationally demanding, since no closed-form solution exists for most SDEs of practical importance. Numerical methods such as the Euler methods and the Taylor schemes combined with a Monte Carlo approach are required to calculate discrete-time trajectories of the state variables of SDEs [5]. These numerical methods require the generation of a large number of Wiener processes corresponding to different simulation trajectories, and hence accurate simulations are computationally expensive.
The methods that were developed for parameter estimation of SDEs can be classified into three different categories: maximum likelihood estimation (MLE)/simulated maximum likelihood (SML) [6–8], the methods of moments [9–11], and filtering (e.g., extended Kalman filter) [12]. Many of the methods have been developed in the context of financial modelling, where the systems of interest are characterised by long time horizons (of the order of months) and can often be sampled at regular but relatively infrequent intervals (e.g., on a daily basis). Among them, the maximum likelihood method is more reliable but has long been found to be difficult to apply to SDEs due to its computational cost. For many SDEs, thousands of simulation trajectories or even more must be generated to ensure a low variance of the values of state variables. As a result, a large number of competing estimation procedures have been proposed in recent years.
In recent years the Bayesian inference methods have been used to estimate unknown parameters in mathematical models [13–17]. Together with the Markov-chain Monte-Carlo (MCMC) and other methods, the Bayesian inference methods have also been used to infer stochastic models in financial mathematics [18–20]. The main advantage of these methods is the ability to infer the whole probability distribution of the parameters, rather than just a single estimate. In addition, the Bayesian methods can deal with noisy data and uncertain data. Another advantage of these methods is the capability to infer parameters in either deterministic models or stochastic models. However, the potential obstacle of these methods in application is that the samples are correlated and their performances heavily depend on prior hypotheses.
A number of methods have been used to estimate the parameters in the single-factor continuous time models, including the generalized moment method [21] and Gaussian estimation methods [22]. However, our recent research work suggested that the accuracy of the estimates generated from these two methods is low, in particular, when the stepsize of observation time points is not small [23]. Thus in this work we do not test these methods again. Instead, utilizing Bayesian inference and the MCMC method, we develop a numerical algorithm to estimate unknown parameters in stochastic interest rate models, concentrating on accurate simulations of the stochastic model, which should lead to more accurate estimates of the model parameters. The remaining part of this paper is organized as follows. Section 2 gives the stochastic models for the term structure of interest rates and numerical algorithms for simulating these stochastic models. Section 3 discusses Bayesian inference and the MCMC method. Section 4 reports the numerical results for estimating parameters in the stochastic models for the term structure of interest rates.
2. Stochastic Model and Direct Simulation Methods
We first introduce the general form of SDEs for interest rates, namely,
$$dX = f(t,X)\,dt + g(t,X)\,dW(t), \tag{1}$$
where f(t,X) is the drift term, g(t,X) is the diffusion term, and W(t) is the Wiener process whose increment follows the Gaussian distribution, namely,
$$\Delta W_n = W(t_{n+1}) - W(t_n) \sim N(0,\, t_{n+1} - t_n). \tag{2}$$
Now we proceed to consider numerical methods for simulating the SDE. The widely used method in computational finance is the Euler-Maruyama method whose strong convergence order is just 0.5, given by
$$X_{n+1} = X_n + h f(t_n, X_n) + g(t_n, X_n)\,\Delta W_n, \tag{3}$$
where $X_n$ is the numerical solution at time point $t_n$ and $h = t_{n+1} - t_n$. Although this method is easy to implement, its stability property is not good enough to simulate SDEs with a relatively large diffusion term. In order to obtain stable simulations, a very small stepsize is required, which may lead to a large computing time. To improve the stability property, the semi-implicit and fully implicit Euler methods can be used to reduce the computing time [24, 25]. For example, the semi-implicit Euler method is given by
$$X_{n+1} = X_n + h f(t_{n+1}, X_{n+1}) + g(t_n, X_n)\,\Delta W_n. \tag{4}$$
Another approach is to use high-order methods in order to achieve more accurate simulations. The Milstein scheme uses a higher-order Taylor expansion and thus has a strong convergence order one. In the explicit Milstein method both the drift term and diffusion term are explicit, namely,
$$X_{n+1} = X_n + h f(t_n, X_n) + g(t_n, X_n)\,\Delta W_n + \frac{1}{2}\, g(t_n, X_n)\, g'(t_n, X_n)\left((\Delta W_n)^2 - h\right). \tag{5}$$
Similarly, the semi-implicit and fully implicit Milstein methods have been designed to improve the stability property [24, 25].
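The explicit schemes (3) and (5) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work: the function names, the driver `simulate`, and the geometric Brownian motion coefficients in the usage note are assumptions made for the example, and `dg_dx` stands for the derivative of the diffusion term with respect to X.

```python
import numpy as np

def euler_maruyama_step(f, g, t, x, h, dw):
    """One explicit Euler-Maruyama step, as in scheme (3)."""
    return x + h * f(t, x) + g(t, x) * dw

def milstein_step(f, g, dg_dx, t, x, h, dw):
    """One explicit Milstein step, as in scheme (5); dg_dx is dg/dX."""
    return (x + h * f(t, x) + g(t, x) * dw
            + 0.5 * g(t, x) * dg_dx(t, x) * (dw ** 2 - h))

def simulate(step, x0, t0, t_end, n_steps, rng, **coeffs):
    """Integrate from t0 to t_end with n_steps equal steps of size h."""
    h = (t_end - t0) / n_steps
    path = np.empty(n_steps + 1)
    path[0] = x0
    for n in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(h))  # Wiener increment (2): N(0, h)
        path[n + 1] = step(t=t0 + n * h, x=path[n], h=h, dw=dw, **coeffs)
    return path
```

For example, with the hypothetical coefficients `f = lambda t, x: 0.5 * x` and `g = lambda t, x: 0.2 * x`, a path is obtained from `simulate(euler_maruyama_step, 1.0, 0.0, 1.0, 100, np.random.default_rng(0), f=f, g=g)`; the Milstein variant additionally needs `dg_dx=lambda t, x: 0.2`.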
In this paper we use an SDE model of the term structure of interest rates as the test system to examine the accuracy of inference methods [1]. The test model is the CIR model (Cox, Ingersoll, and Ross), which is a linear mean-reversion model and uses a diffusion process [26]. This stochastic model has been widely used to model the short interest rate [21, 27]. The CIR model states that the short rate follows a square root diffusion process, which has the following continuous-time representation:
$$dX = \alpha(\beta - X)\,dt + \sigma\sqrt{X}\,dW(t), \tag{6}$$
where α is the speed of adjustment (or mean reversion), β represents the long run mean of the short-term interest rate, and σ is a constant volatility. Under this model, both the drift and volatility change with the level of the short rate.
In this work we will use the Euler-Maruyama method (3) to generate samples of the interest rates. In fact, due to the linear feature of the drift term in the interest rate model, the semi-implicit method can be written in explicit form and can also be used in the Bayesian inference method. For the benchmark model, the Euler-Maruyama scheme is
$$X_{n+1} = X_n + \alpha(\beta - X_n)h + \sigma\sqrt{X_n}\,\Delta W_n, \tag{7}$$
and the semi-implicit Euler scheme is given by
$$X_{n+1} = \frac{1}{1+\alpha h}\left(X_n + \alpha\beta h + \sigma\sqrt{X_n}\,\Delta W_n\right). \tag{8}$$
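A possible Python sketch of the two schemes for the CIR model is given below. It assumes a fixed stepsize h and an initial value `x0` chosen by the user (the initial value in the usage line is hypothetical). The truncation `max(x, 0)` inside the square root is an added safeguard for the discretised path and is not part of the schemes as stated above.

```python
import numpy as np

def cir_euler_step(x, alpha, beta, sigma, h, dw):
    """Explicit Euler-Maruyama step (7) for the CIR model."""
    # max(x, 0) guards the square root; this truncation is an assumption.
    return x + alpha * (beta - x) * h + sigma * np.sqrt(max(x, 0.0)) * dw

def cir_semi_implicit_step(x, alpha, beta, sigma, h, dw):
    """Semi-implicit Euler step (8): the drift is taken at the new point."""
    noise = sigma * np.sqrt(max(x, 0.0)) * dw
    return (x + alpha * beta * h + noise) / (1.0 + alpha * h)

def simulate_cir(step, x0, alpha, beta, sigma, h, n_steps, seed=0):
    """Generate one discrete trajectory X_0, ..., X_{n_steps}."""
    rng = np.random.default_rng(seed)
    dws = rng.normal(0.0, np.sqrt(h), size=n_steps)  # Wiener increments
    path = np.empty(n_steps + 1)
    path[0] = x0
    for n, dw in enumerate(dws):
        path[n + 1] = step(path[n], alpha, beta, sigma, h, dw)
    return path
```

A trajectory with the parameter values used later in the text could be generated with, e.g., `simulate_cir(cir_euler_step, x0=0.08, alpha=0.2, beta=0.08, sigma=0.1, h=0.05, n_steps=200)`.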
3. Algorithm for Parameter Estimation
In this section, we establish a numerical algorithm for estimating parameters in stochastic models based on Bayesian inference and the MCMC method. In contrast to the classical approach, in which the unknown parameters of a model are fixed quantities, the Bayesian paradigm treats the unknown parameters of the underlying model as random variables with some prior beliefs. The heart of the Bayesian approach is Bayes' theorem, which allows us to compute the conditional probability density function of the model parameters θ, assumed to be continuous random variables, given the entire data set y:
$$p(\theta \mid y) = \frac{p(y \mid \theta)\,p(\theta)}{p(y)}. \tag{9}$$
Since the probability p(y) is independent of the model parameters, only the product p(y∣θ)p(θ) needs to be considered when maximizing the posterior density. Thus the posterior distribution p(θ∣y) can be interpreted as our prior beliefs about the parameters, p(θ), updated by the current information from the data. Because we have little prior knowledge of θ, we may simply use a “noninformative” or “flat” prior.
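As a small illustration of a flat prior, the sketch below returns a constant log-density inside a box and minus infinity outside it, so that the log-posterior differs from the log-likelihood only by a constant. The box bounds and the function names are hypothetical; the text does not specify the support of the prior.

```python
import math

def log_flat_prior(theta, lower, upper):
    """Log of an unnormalised flat prior on the box [lower, upper]."""
    inside = all(lo <= th <= hi for th, lo, hi in zip(theta, lower, upper))
    return 0.0 if inside else -math.inf

def log_posterior(theta, log_likelihood, lower, upper):
    """log p(theta | y) up to an additive constant, following (9)."""
    log_prior = log_flat_prior(theta, lower, upper)
    if log_prior == -math.inf:
        return -math.inf  # skip the likelihood outside the prior support
    return log_likelihood(theta) + log_prior
```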
In this work we use the Bayesian inference method derived by Joshi and Wilson [20] to infer the parameters in SDE models. It is assumed that the diffusion process $\{X_t\}$ is observed at time points $t_0, t_1, \ldots, t_n$ and that the observation vector is $Y = \{y_0, y_1, \ldots, y_n\}$. Since the closed form of the transition densities of the diffusion processes is usually not available, the transition densities can be approximated by the densities of a numerical scheme such as the widely used Euler-Maruyama method (3). However, the observation time stepsize $\Delta t = t_{i+1} - t_i$ is normally quite large. To obtain a more accurate approximation of the transition densities, a number of latent variables are introduced between every pair of consecutive observations:
$$t_i = \tau_{0,i} < \tau_{1,i} < \cdots < \tau_{M,i} = t_{i+1}. \tag{10}$$
The stepsize of the latent variables, $\delta\tau = \tau_{j+1,i} - \tau_{j,i}$, is small enough to ensure the accuracy and stability of the Euler-Maruyama method. Then the transition density of the Euler scheme is
$$P_{\mathrm{Euler}}\{X_{j+1,i} \mid X_{j,i}, \Theta\} = N(\mu_{\mathrm{Euler}}, \sigma_{\mathrm{Euler}}^2), \tag{11}$$
where $j = 0, 1, \ldots, M-1$ and
$$\mu_{\mathrm{Euler}} = X_{j,i} + f(\tau_{j,i}, X_{j,i}, \Theta)\,\delta\tau, \qquad \sigma_{\mathrm{Euler}} = g(\tau_{j,i}, X_{j,i}, \Theta)\sqrt{\delta\tau}. \tag{12}$$
Thus we have an inference problem with unknown parameter Θ using the latent variables $X = \{X_{j,i}\}$ for $i = 0, 1, \ldots, n-1$, $j = 1, \ldots, M-1$, and the observation data Y.
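The latent time grid (10) and the Euler transition density (11)-(12) can be sketched as follows. The sketch assumes equally spaced latent points, works with the logarithm of the density for numerical convenience, and uses the CIR coefficients of (6) with Θ = (α, β, σ) as an example; all function names are illustrative.

```python
import math

def latent_grid(t_i, t_next, M):
    """Equally spaced latent times tau_0, ..., tau_M as in (10)."""
    dtau = (t_next - t_i) / M
    return [t_i + j * dtau for j in range(M + 1)], dtau

def euler_log_density(x_next, x_curr, tau, theta, f, g, dtau):
    """log N(x_next; mu_Euler, sigma_Euler^2) from (11)-(12)."""
    mu = x_curr + f(tau, x_curr, theta) * dtau
    sd = g(tau, x_curr, theta) * math.sqrt(dtau)
    z = (x_next - mu) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def cir_f(tau, x, theta):
    """CIR drift alpha * (beta - x); theta = (alpha, beta, sigma)."""
    alpha, beta, _ = theta
    return alpha * (beta - x)

def cir_g(tau, x, theta):
    """CIR diffusion sigma * sqrt(x); requires x > 0."""
    return theta[2] * math.sqrt(x)
```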
In the proposed Gaussian Modified Bridge Approximation with importance sampling (GaMBA-I), the computing process is given as follows.
Algorithm 1.
Step 1. Generate a sample of the unknown parameters Θ using the MCMC method or other methods.
Step 2. Sample the solution at the latent points X.
Step 3. Evaluate the probability P(Y,X∣Θ).
Step 4. Evaluate the probability P(X∣Y,Θ).
Step 5. Calculate the probability
$$P(\Theta \mid Y) \propto \frac{P(Y, X \mid \Theta)\,P(\Theta)}{P(X \mid Y, \Theta)}. \tag{13}$$
When the importance sampling technique is used, the above probability is determined by a number of samples rather than a single sample as indicated above.
Step 6. Accept or reject the parameter sample Θ using the MCMC method or other methods.
The major step in this Bayesian inference method is the evaluation of the probabilities of the generated samples for the latent variables. The probability P(Y,X∣Θ) is
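$$P(Y, X \mid \Theta) \propto \prod_{i=1}^{n} P(y_i \mid X_{M-1,i-1}, \Theta) \times \prod_{i=1}^{n} P(X_{1,i-1} \mid y_{i-1}, \Theta) \times \prod_{i=1}^{n} \prod_{j=2}^{M-1} P(X_{j,i-1} \mid X_{j-1,i-1}, \Theta). \tag{14}$$
Here we assume that the probability for the initial observation $y_0$ is a constant. Each probability in the above expression can be approximated by the transition density of the Euler method, given by
$$P(Y, X \mid \Theta) \approx \prod_{i=1}^{n} P_{\mathrm{Euler}}(y_i \mid X_{M-1,i-1}, \Theta) \times \prod_{i=1}^{n} P_{\mathrm{Euler}}(X_{1,i-1} \mid y_{i-1}, \Theta) \times \prod_{i=1}^{n} \prod_{j=2}^{M-1} P_{\mathrm{Euler}}(X_{j,i-1} \mid X_{j-1,i-1}, \Theta), \tag{15}$$
where $P_{\mathrm{Euler}}$ is the Euler density in (11).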
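Because the product of Euler densities involves many small factors, it is convenient to work with its logarithm. The following sketch sums the Euler log-densities over every transition of the augmented path, one row per observation interval. It assumes a time-homogeneous drift and diffusion (as in the CIR model) and a common latent stepsize; the array layout and the function name are choices made for this illustration.

```python
import numpy as np

def euler_log_likelihood(paths, drift, diffusion, dtau):
    """Sum of Euler log-densities over all transitions of the augmented path.

    paths has shape (n, M + 1): row i holds the observation y_i, the
    M - 1 latent values, and y_{i+1}, so each row contributes M transitions.
    """
    paths = np.asarray(paths, dtype=float)
    x_curr, x_next = paths[:, :-1], paths[:, 1:]
    mu = x_curr + drift(x_curr) * dtau           # Euler mean
    sd = diffusion(x_curr) * np.sqrt(dtau)       # Euler standard deviation
    z = (x_next - mu) / sd
    log_terms = -0.5 * z**2 - np.log(sd) - 0.5 * np.log(2.0 * np.pi)
    return float(np.sum(log_terms))
```

For the CIR model one could pass `drift=lambda x: alpha * (beta - x)` and `diffusion=lambda x: sigma * np.sqrt(x)`, with positive path values.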
For the probability P(X∣Y,Θ), we need to factorise it into
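$$P(X \mid Y, \Theta) = \prod_{i=0}^{n-1} P(X^{(i)} \mid y_i, y_{i+1}, \Theta) = \prod_{i=0}^{n-1} P(X_{1,i}, X_{2,i}, \ldots, X_{M-1,i} \mid y_i, y_{i+1}, \Theta) = \prod_{i=0}^{n-1} \prod_{j=1}^{M-1} P(X_{j,i} \mid X_{j-1,i}, X_{M,i}, \Theta), \tag{16}$$
where $X_{0,i} = y_i$ and $X_{M,i} = y_{i+1}$. Using the Modified Brownian Bridge (MBB), the density $P(X_{j,i} \mid X_{j-1,i}, X_{M,i}, \Theta)$ can be approximated by
$$P_{\mathrm{MBB}}(X_{j,i} \mid X_{j-1,i}, X_{M,i}, \Theta) \approx N(\mu_{\mathrm{MBB}}, \sigma_{\mathrm{MBB}}^2), \tag{17}$$
where
$$\mu_{\mathrm{MBB}} = X_{j-1,i} + \left(\frac{X_{M,i} - X_{j-1,i}}{\tau_{M,i} - \tau_{j-1,i}}\right)\delta\tau, \qquad \sigma_{\mathrm{MBB}} = g(X_{j-1,i}, \Theta)\sqrt{\frac{M-j}{M-j+1}\,\delta\tau}. \tag{18}$$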
Thus, the solution at the latent points is sampled by using
$$X_{j,i} = \mu_{\mathrm{MBB}} + \sigma_{\mathrm{MBB}} N_j, \tag{19}$$
where $N_j$ is a sample of the standard Gaussian random variable N(0,1).
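The sampling of the latent points between two observations can be sketched as below. The sketch assumes equally spaced latent times, so that the remaining time to the next observation is (M - j + 1)δτ, and it accumulates the log of the MBB density of the draw, which is needed later as the importance sampling proposal density. The function name and signature are illustrative, and `diffusion` must return a positive value.

```python
import numpy as np

def sample_mbb_bridge(y_start, y_end, M, dtau, diffusion, rng):
    """Draw X_1, ..., X_{M-1} between two observations via (17)-(19).

    Returns the full segment (length M + 1) and log P_MBB of the draw.
    """
    x = np.empty(M + 1)
    x[0], x[M] = y_start, y_end
    log_q = 0.0
    for j in range(1, M):
        remaining = (M - j + 1) * dtau  # tau_M - tau_{j-1} on an equal grid
        mu = x[j - 1] + (y_end - x[j - 1]) / remaining * dtau
        sd = diffusion(x[j - 1]) * np.sqrt((M - j) / (M - j + 1) * dtau)
        x[j] = mu + sd * rng.standard_normal()  # sampling rule (19)
        z = (x[j] - mu) / sd
        log_q += -0.5 * z * z - np.log(sd) - 0.5 * np.log(2.0 * np.pi)
    return x, log_q
```

With a very small diffusion term the sampled segment is close to the straight line joining the two observations, which is a quick sanity check of the bridge mean.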
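When the importance sampling method is used, K samples of the latent variables, $X_k \sim P_{\mathrm{MBB}}(X_k \mid Y, \Theta)$ for $k = 1, \ldots, K$, are generated as described in Algorithm 1. Then we evaluate
$$P_{\mathrm{GaMBA}}(\Theta \mid Y) \propto \frac{1}{K} \sum_{k=1}^{K} \frac{P_{\mathrm{Euler}}(Y, X_k \mid \Theta)\,P(\Theta)}{P_{\mathrm{MBB}}(X_k \mid \Theta, Y)}. \tag{20}$$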
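In practice the ratios in the importance sampling average can underflow, so one possible approach is to evaluate the logarithm of the estimate with the log-sum-exp trick. The sketch below is generic: `log_euler_joint` and `sample_bridge` are hypothetical callables standing for the Euler complete-data log-likelihood and the MBB sampler (returning a draw and its log proposal density) at a fixed Θ, and `log_prior` is the log prior at that Θ.

```python
import numpy as np

def gamba_log_posterior(log_euler_joint, sample_bridge, log_prior, K, rng):
    """Log of the importance sampling estimate (20), up to a constant."""
    log_w = np.empty(K)
    for k in range(K):
        x_k, log_q = sample_bridge(rng)          # X_k drawn from P_MBB
        log_w[k] = log_euler_joint(x_k) - log_q  # log(P_Euler / P_MBB)
    m = np.max(log_w)                            # log-sum-exp for stability
    return log_prior + m + np.log(np.mean(np.exp(log_w - m)))
```

As a toy check, if the target is exp(-x²/2) and the proposal is the standard normal density, every weight equals √(2π), so the function returns `log_prior + 0.5 * log(2π)` for any K.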
To generate the samples of the unknown parameters Θ, a grid-sampling method was used in [20] by dividing the parameter region with a regular grid. Although this is an effective approach for inferring mathematical models with a small number of unknown parameters, it is difficult to apply to models with a larger number of unknown parameters. In this work we use the MCMC method to explore the posterior distribution of the model parameters. Since the closed-form posterior distribution for a complex model often cannot be obtained analytically, the MCMC method has been widely used to approximate the posterior distribution by simulation. There are a number of efficient MCMC algorithms, including the Metropolis algorithm, the Metropolis-Hastings (MH) algorithm, and the Gibbs sampler. In this work we use the MH algorithm to sample from the posterior distribution. The MH algorithm allows us to avoid direct simulation from p(θ∣y) by making use of a proposal distribution and computing the acceptance probability for a candidate sample.
There are a number of important issues related to the implementation of MCMC. For example, the selected initial estimate influences the generated sequence, in particular the initial part of the simulation. Thus an important technique is burn-in, which reduces the influence of the initial iterations on the generated Markov chain by discarding the first part of the sequence. Generally we discard the first half of the simulated sequence and keep the remaining half to obtain the target distribution. This technique is convenient but not the most efficient one, because about half of the computing effort is discarded. Although more specific methods have been designed to analyze the simulation output according to the dependence of the simulation on the starting values [28], we typically use the simple burn-in approach and accept the increased Monte Carlo error involved in discarding half of the simulations.
To decide when to stop the computation, we normally monitor the convergence of all the parameters and other quantities of interest separately. Our usual approach is, for each parameter, to calculate the variance of the simulations from each chain (after the first half of the chain has been discarded using the burn-in technique). Assume that we have J chains started from different initial estimates and that the length of each chain is G, and let $\theta_{gj}$ be the $g$th estimate in chain $j$ for parameter θ, with chain mean $\bar{\theta}_j$. The variance inside the chains is
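$$W = \frac{1}{J(G-1)} \sum_{j=1}^{J} \sum_{g=1}^{G} \left(\theta_{gj} - \bar{\theta}_j\right)^2 \tag{21}$$
and the variance between the different chains is
$$B = \frac{1}{(J-1)G} \sum_{j=1}^{J} \left( \sum_{g=1}^{G} \theta_{gj} - \frac{1}{J} \sum_{j'=1}^{J} \sum_{g=1}^{G} \theta_{gj'} \right)^2. \tag{22}$$
Based on these values, we can calculate the value of R as
$$R = \frac{1}{G}\left(G - 1 + \frac{B}{W}\right). \tag{23}$$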
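The three quantities W, B, and R can be computed directly from an array of chains. The sketch below follows (21)-(23) as written, for a single parameter; it assumes the chains have already had their burn-in portion removed and are stored as a J × G array, and the function name is illustrative.

```python
import numpy as np

def convergence_ratio(chains):
    """R from (21)-(23) for an array of shape (J, G), one parameter."""
    chains = np.asarray(chains, dtype=float)
    J, G = chains.shape
    chain_means = chains.mean(axis=1)
    # Within-chain variance W, equation (21).
    W = np.sum((chains - chain_means[:, None]) ** 2) / (J * (G - 1))
    # Between-chain variance B, equation (22), written with chain sums.
    chain_sums = chains.sum(axis=1)
    B = np.sum((chain_sums - chain_sums.mean()) ** 2) / ((J - 1) * G)
    return (G - 1 + B / W) / G
```

Two identical chains give B = 0 and hence R = (G - 1)/G, while chains centred at very different values give R far above the 1.2 threshold.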
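The value of R exceeds 1 when the between-chain variance B is larger than the within-chain variance W, which indicates that the chains have not yet mixed. When the variance inside the chains approaches the variance between the chains, the value of R approaches 1. We can accept the chain as convergent when R<1.2 [29]. Another important technique is thinning, in which only part of the chain is kept in order to reduce the autocorrelation of the samples. Finally, the accept/reject step of the MCMC method is as follows. If $\Theta_t$ is the current candidate of the model parameters and $\Theta^*$ is the newly generated one, let
$$\alpha = \min\left(1, \frac{P(\Theta^* \mid y)}{P(\Theta_t \mid y)}\right). \tag{24}$$
Here α denotes the acceptance probability, not the mean-reversion speed in (6). Generate a sample $r \sim U(0,1)$, and set $\Theta_{t+1} = \Theta^*$ if $r < \alpha$. Otherwise set $\Theta_{t+1} = \Theta_t$.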
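A compact sketch of this accept/reject loop, together with burn-in and thinning, is given below. It assumes a symmetric Gaussian random-walk proposal (so that the acceptance ratio reduces to the ratio of posterior values), works with log-posteriors, and takes a generic callable `log_post`; the step size, the thinning interval, and the function names are illustrative choices, not values from this work.

```python
import numpy as np

def metropolis_hastings(log_post, theta0, n_iter, step, rng):
    """Random-walk Metropolis sampler using acceptance rule (24)."""
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    lp = log_post(theta)
    chain = np.empty((n_iter, theta.size))
    for t in range(n_iter):
        proposal = theta + step * rng.standard_normal(theta.size)
        lp_new = log_post(proposal)
        # r < min(1, ratio) is equivalent to log(r) < lp_new - lp.
        if np.log(rng.uniform()) < lp_new - lp:
            theta, lp = proposal, lp_new
        chain[t] = theta
    return chain

def burn_in_and_thin(chain, thin=1):
    """Discard the first half of the chain, then keep every thin-th draw."""
    return chain[len(chain) // 2 :: thin]
```

For instance, with the toy target `log_post = lambda th: -0.5 * np.sum(th**2)` (a standard normal), the retained draws from `burn_in_and_thin(metropolis_hastings(log_post, [3.0], 20000, 1.0, np.random.default_rng(1)), thin=5)` should have mean near 0 and standard deviation near 1.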
4. Numerical Results
In this section, we use the numerical algorithm based on Bayesian inference and the MCMC method to estimate the parameters in the CIR model (6). Figure 1 gives 5 simulations of the CIR model with parameters α=0.2, β=0.08, and σ=0.1. When the volatility is not large, the values of the short interest rate remain positive, unlike the Vasicek model, which may lead to negative values of the interest rate. We used stepsize h=0.05 in the numerical simulation to ensure the accuracy and stability of the simulations.
Figure 1: Five simulations of the CIR model with α=0.2, β=0.08, and σ=0.1.
The estimated values of the parameters in (6) for the mean-reverting test system and their standard deviations are given in Table 1, and more detailed simulation results of the Bayesian inference method are presented in Figure 2. In this test the size of the importance sampling is K=50. For each parameter, we present the time series of the parameter values, the cumulative means, and the histogram distribution. Compared with the exact values (α,β,σ)=(0.2,0.08,0.05), the Bayesian inference method gives estimates with good accuracy. For parameters β and σ, the histogram distributions are consistent with the cumulative means of the estimates. However, the histogram of parameter α is not symmetric about the cumulative mean.
Table 1: Estimated parameters and their standard errors.
Parameters (exact value)   α (0.2)    β (0.08)   σ (0.05)
Estimated values           0.1803     0.0792     0.0442
Standard deviations        0.0109     0.0024     0.0013
Figure 2: Simulation output for parameters α (the first row), β (the second row), and σ (the bottom row). Left column: the time series of the parameter values; middle column: the cumulative means of each parameter; right column: histogram distribution.
In this study we tested the influence of the sample size in the importance sampling on the accuracy of the estimates. The sampling size was chosen as K=1,10,25,50,100,200,500. Numerical results in Figure 3 show that the sampling size matters for the accuracy of the estimates, but a larger sampling size does not necessarily lead to much better accuracy: increasing the sampling size improves the accuracy only slightly. Thus a moderate importance sampling size is sufficient to generate estimates with adequate accuracy. This may be the reason that the sampling size is not very large in previous studies [20].
Figure 3: Estimated model parameters using different values of the importance sampling sizes. (a) The estimated model parameters; (b) the standard deviation (std) of the estimates. The importance sampling size is K=1,10,25,50,100,200,500 for index = 1, …, 7.
We have also tested the influence of different samples of latent variables on the variation of estimates. In this test the simulated observations Y are kept unchanged. Figure 4 shows that the choice of latent-variable samples has some influence on the variations of the estimates, and the estimated model parameters vary between tests. However, the variations in both the averaged parameter values and the standard deviations are not large, which is consistent with the numerical results obtained using the particle swarm optimization method to estimate model parameters. Moreover, the variations of the estimates are smaller than those obtained by using the genetic algorithm [30, 31].
Figure 4: Variations of the estimated parameters when the sampling of the latent variables is different. (a) The estimated model parameters; (b) the standard deviation (std) of the estimates.
5. Conclusions
This work presents an effective algorithm for the estimation of parameters in SDE models. The proposed approach is based on the implementation of the Bayesian inference and the MCMC method. Compared with the grid method, the MCMC based method can be used to infer stochastic models with a large number of unknown parameters. This method has been applied to an important stochastic model of the term structure of interest rate, which is a fundamental issue in the research area of financial mathematics. In addition, the importance sampling technique was used to increase the robustness of estimates. We have also examined the influence of different samples of the latent variables on the variation of estimates. Numerical results suggested that the method used in this work is robust to such variation.
In this work, we introduced the Gaussian Modified Bridge Approximation into the MCMC method and examined the accuracy and robustness of this approach. It is worth noting that the performance of the MCMC method is related to a number of important factors, such as the convergence criteria, the burn-in technique, and thinning to reduce autocorrelation; thus, further efforts are needed to discuss these issues. In addition, estimating parameters in stiff stochastic models is still a challenging problem. In the method used in this work, the explicit Euler method was used to simulate the stochastic models. Thus, a large number of latent variables are needed because the stepsize of the numerical method must be very small. However, in this case, it remains difficult to properly evaluate the complete likelihood P(Y,X∣Θ) and the conditional distribution P(X∣Y,Θ) because each is the product of a large number of probabilities. Alternatively, we may consider implicit methods or high-order methods rather than the explicit Euler method. Thus, more effective calibration methods should be designed for estimating parameters in stiff SDEs.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research work is supported by the Australian Research Council (ARC) (DP120104460, FT100100748, and DP1094181) and the Chinese National Social Science Foundation (SSF) (10BJY104).
References
[1] Hurn A. S., Jeisman J. I., Lindsay K. A. Seeing the wood for the trees: a critical evaluation of methods to estimate the parameters of stochastic differential equations.
[2] de Rossi G. Maximum likelihood estimation of the Cox-Ingersoll-Ross model using particle filters.
[3] Beck J. V., Arnold K. J.
[4] Timmer J. Parameter estimation in nonlinear stochastic differential equations.
[5] Burrage K., Burrage P. M., Tian T. Numerical methods for strong solutions of stochastic differential equations: an overview.
[6] Durham G. B., Gallant A. R. Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes.
[7] Florens-Zmirou D. Approximate discrete-time schemes for statistics of diffusion processes.
[8] Hurn A. S., Lindsay K. A. Estimating the parameters of stochastic differential equations.
[9] Gallant A. R., Tauchen G. The relative efficiency of method of moments estimators.
[10] Pereira Ló B., Haslam A. J., Adjiman C. S. An algorithm for the estimation of parameters in models with stochastic differential equations.
[11] Scott D. W.
[12] Nielsen J. N., Madsen H. Applying the EKF to stochastic differential equations with level effects.
[13] Barndorff-Nielsen O. E., Shephard N. Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics.
[14] Boys R. J., Wilkinson D. J., Kirkwood T. B. L. Bayesian inference for a discretely observed stochastic kinetic model.
[15] Komorowski M., Finkenstädt B., Harper C. V., Rand D. A. Bayesian inference of biochemical kinetic parameters using the linear noise approximation.
[16] Rogers S., Khanin R., Girolami M. Bayesian model-based inference of transcription factor activity.
[17] Wilkinson D. J. Bayesian methods in bioinformatics and computational systems biology.
[18] Gray P. Bayesian estimation of short-rate models.
[19] Sanford A. D., Martin G. M. Simulation-based Bayesian estimation of an affine term structure model.
[20] Joshi C., Wilson S. Grid based Bayesian inference for stochastic differential equation models. Technical Paper, Trinity College Dublin, 2012, https://www.scss.tcd.ie/disciplines/statistics/tech-reports/11-07.pdf
[21] Chan K. C., Karolyi G. A., Longstaff F. A., Sanders A. B. An empirical comparison of alternative models of the short-term interest rate.
[22] Nowman K. B. Gaussian estimation of single-factor continuous time models of the term structure of interest rates.
[23] Tian T., Ge X. Calibration of stochastic differential equation models using implicit numerical methods and particle swarm optimization. Proceedings of the International Conference on Modelling, Identification and Control (ICMIC '12), 2012, IEEE Press, pp. 1049–1054.
[24] Kloeden P. E., Platen E.
[25] Tian T., Burrage K. Implicit Taylor methods for stiff stochastic differential equations.
[26] Cox J. C., Ingersoll J. E. Jr., Ross S. A. A theory of the term structure of interest rates.
[27] Jones C. S. Nonlinear mean reversion in the short-term interest rate.
[28] Liu C., Rubin D. B. Model-based analysis to improve the performance of iterative simulations.
[29] Gelman A., Carlin J. B., Stern H. S., Rubin D. B.
[30] Tian T. Estimation of kinetic rates of MAP kinase activation from experimental data. Proceedings of the International Joint Conference on Bioinformatics, Systems Biology and Intelligent Computing (IJCBS '09), August 2009, Shanghai, China, IEEE Press, pp. 457–462, doi:10.1109/IJCBS.2009.78.
[31] Tian T., Xu S., Gao J., Burrage K. Simulated maximum likelihood method for estimating kinetic rates in gene expression.