Martingale estimating functions provide a convenient and important framework for inference in nonlinear time series models. When information about the first four conditional moments
of the observed process is available, however, quadratic estimating functions are more informative. In this paper, a general framework is developed for the joint estimation of conditional mean and variance parameters in time series models using quadratic estimating functions. The superiority of the approach is demonstrated by comparing the information associated with the optimal quadratic estimating function to that of other estimating functions. The method is used to derive the optimal quadratic estimating functions for the parameters of autoregressive conditional duration (ACD) models, random coefficient autoregressive (RCA) models, doubly stochastic models, and regression models with ARCH errors. Closed-form expressions for the information gain are also discussed in some detail.
1. Introduction
Godambe [1] was the first to study inference for discrete-time stochastic processes using the estimating function method. Thavaneswaran and Abraham [2] studied nonlinear time series estimation problems using linear estimating functions. Naik-Nimbalkar and Rajarshi [3] and Thavaneswaran and Heyde [4] studied filtering and prediction problems using linear estimating functions in the Bayesian context. Chandra and Taniguchi [5], Merkouris [6], and Ghahramani and Thavaneswaran [7], among others, have studied estimation problems using estimating functions. In this paper, we study linear and quadratic martingale estimating functions and show that the quadratic estimating functions are more informative when the conditional mean and variance of the observed process depend on the same parameter of interest.
This paper is organized as follows. The rest of Section 1 presents the basics of estimating functions and information associated with estimating functions. Section 2 presents the general model for the multiparameter case and the form of the optimal quadratic estimating function. In Section 3, the theory is applied to four different models.
Suppose that {yt, t=1,…,n} is a realization of a discrete-time stochastic process whose distribution depends on a vector parameter θ belonging to an open subset Θ of the p-dimensional Euclidean space. Let (Ω,ℱ,Pθ) denote the underlying probability space, and let ℱty be the σ-field generated by {y1,…,yt}, t≥1. Let ht=ht(y1,…,yt,θ), 1≤t≤n, be specified q-dimensional vectors that are martingales. We consider the class ℳ of zero-mean, square-integrable p-dimensional martingale estimating functions of the form

ℳ = {gn(θ) : gn(θ) = ∑t=1n at-1 ht},
where at-1 are p×q matrices depending on y1,…,yt-1, 1≤t≤n. The estimating functions gn(θ) are further assumed to be almost surely differentiable with respect to the components of θ and such that E[(∂gn(θ)/∂θ)∣ℱn-1y] and E[gn(θ)gn(θ)′∣ℱn-1y] are nonsingular for all θ∈Θ and for each n≥1. The expectations are always taken with respect to Pθ. Estimators of θ can be obtained by solving the estimating equation gn(θ)=0. Furthermore, the p×p matrix E[gn(θ)gn(θ)′∣ℱn-1y] is assumed to be positive definite for all θ∈Θ. Then, in the class ℳ of all zero-mean, square-integrable martingale estimating functions, the optimal estimating function gn*(θ) which maximizes, in the partial order of nonnegative definite matrices, the information matrix

Ign(θ) = (E[∂gn(θ)/∂θ ∣ ℱn-1y])′ (E[gn(θ)gn(θ)′ ∣ ℱn-1y])−1 (E[∂gn(θ)/∂θ ∣ ℱn-1y])
is given by

gn*(θ) = ∑t=1n at-1* ht = ∑t=1n (E[∂ht/∂θ ∣ ℱt-1y])′ (E[htht′ ∣ ℱt-1y])−1 ht,
and the corresponding optimal information reduces to E[gn*(θ)gn*(θ)′∣ℱn-1y].
The function gn*(θ) is also called the “quasi-score” and has properties similar to those of a score function in the sense that E[gn*(θ)]=0 and E[gn*(θ)gn*(θ)′]=-E[∂gn*(θ)/∂θ′]. This is a more general result in the sense that for its validity, we do not need to assume that the true underlying distribution belongs to the exponential family of distributions. The maximum correlation between the optimal estimating function and the true unknown score justifies the terminology “quasi-score” for gn*(θ). Moreover, it follows from Lindsay [8, page 916] that if we solve an unbiased estimating equation gn(θ)=0 to get an estimator, then the asymptotic variance of the resulting estimator is the inverse of the information Ign. Hence, the estimator obtained from a more informative estimating equation is asymptotically more efficient.
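As a minimal illustration of solving an unbiased estimating equation gn(θ)=0, consider iid observations with known variance: the optimal linear estimating function is gn(θ)=∑t(yt−θ)/σ2, whose root is the sample mean. The sketch below is not from the paper; the bisection solver and all numerical values are illustrative.

```python
import numpy as np

# Minimal sketch (not from the paper): solve an unbiased estimating
# equation g_n(theta) = 0 by bisection.  For iid observations with known
# variance, the optimal linear estimating function is
#   g_n(theta) = sum_t (y_t - theta) / sigma^2,
# whose root is the sample mean.
def g(theta, y, sigma2=1.0):
    return np.sum(y - theta) / sigma2

def solve_ee(y, lo=-100.0, hi=100.0, tol=1e-10):
    # g is strictly decreasing in theta, so bisect on the sign change
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid, y) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.0, size=500)
theta_hat = solve_ee(y)
print(abs(theta_hat - y.mean()) < 1e-6)  # the root is the sample mean
```

In richer models the same root-finding idea applies, with gn(θ) built from the martingale differences discussed below.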
2. General Model and Method
Consider a discrete-time stochastic process {yt, t=1,2,…} with conditional moments

μt(θ) = E[yt ∣ ℱt-1y],
σt2(θ) = Var(yt ∣ ℱt-1y),
γt(θ) = (1/σt3(θ)) E[(yt−μt(θ))3 ∣ ℱt-1y],
κt(θ) = (1/σt4(θ)) E[(yt−μt(θ))4 ∣ ℱt-1y] − 3.   (2.1)
That is, we assume that the skewness and the excess kurtosis of the standardized variable yt do not contain any additional parameters. In order to estimate the parameter θ based on the observations y1,…,yn, we consider two classes of martingale differences {mt(θ)=yt−μt(θ), t=1,…,n} and {st(θ)=mt2(θ)−σt2(θ), t=1,…,n} such that

〈m〉t = E[mt2 ∣ ℱt-1y] = E[(yt−μt)2 ∣ ℱt-1y] = σt2,
〈s〉t = E[st2 ∣ ℱt-1y] = E[(yt−μt)4 + σt4 − 2σt2(yt−μt)2 ∣ ℱt-1y] = σt4(κt+2),
〈m,s〉t = E[mtst ∣ ℱt-1y] = E[(yt−μt)3 − σt2(yt−μt) ∣ ℱt-1y] = σt3γt.
The optimal estimating functions based on the martingale differences mt and st are gM*(θ) = −∑t=1n (∂μt/∂θ)(mt/〈m〉t) and gS*(θ) = −∑t=1n (∂σt2/∂θ)(st/〈s〉t), respectively. The informations associated with gM*(θ) and gS*(θ) are IgM*(θ) = ∑t=1n (∂μt/∂θ)(∂μt/∂θ′)(1/〈m〉t) and IgS*(θ) = ∑t=1n (∂σt2/∂θ)(∂σt2/∂θ′)(1/〈s〉t), respectively. Crowder [9] studied the optimal quadratic estimating function for independent observations. For the discrete-time stochastic process {yt}, the following theorem establishes the optimality of the quadratic estimating function in the multiparameter case.
Theorem 2.1.
For the general model in (2.1), in the class of all quadratic estimating functions of the form 𝒢Q = {gQ(θ) : gQ(θ) = ∑t=1n(at-1mt + bt-1st)},
(a) the optimal estimating function is given by gQ*(θ) = ∑t=1n(at-1*mt + bt-1*st), where

at-1* = (1 − 〈m,s〉t2/(〈m〉t〈s〉t))−1 (−(∂μt/∂θ)(1/〈m〉t) + (∂σt2/∂θ)(〈m,s〉t/(〈m〉t〈s〉t))),
bt-1* = (1 − 〈m,s〉t2/(〈m〉t〈s〉t))−1 ((∂μt/∂θ)(〈m,s〉t/(〈m〉t〈s〉t)) − (∂σt2/∂θ)(1/〈s〉t));

(b) the information IgQ*(θ) = IgM*(θ) + IgΨ*(θ), where gΨ*(θ) is the optimal estimating function based on the martingale differences ψt = st − σtγtmt.
Proof. We choose two orthogonal martingale differences mt and ψt = st − σtγtmt, where the conditional variance of ψt is given by 〈ψ〉t = (〈m〉t〈s〉t − 〈m,s〉t2)/〈m〉t = σt4(κt + 2 − γt2). That is, mt and ψt are uncorrelated with conditional variances 〈m〉t and 〈ψ〉t, respectively. Moreover, the optimal martingale estimating function and associated information based on the martingale differences ψt are
gΨ*(θ) = ∑t=1n ((∂μt/∂θ)(〈m,s〉t/〈m〉t) − ∂σt2/∂θ) ψt/〈ψ〉t
= ∑t=1n (1 − 〈m,s〉t2/(〈m〉t〈s〉t))−1 × ((−(∂μt/∂θ)(〈m,s〉t2/(〈m〉t2〈s〉t)) + (∂σt2/∂θ)(〈m,s〉t/(〈m〉t〈s〉t)))mt + ((∂μt/∂θ)(〈m,s〉t/(〈m〉t〈s〉t)) − (∂σt2/∂θ)(1/〈s〉t))st),

IgΨ*(θ) = ∑t=1n ((∂μt/∂θ)(〈m,s〉t/〈m〉t) − ∂σt2/∂θ)((∂μt/∂θ′)(〈m,s〉t/〈m〉t) − ∂σt2/∂θ′)(1/〈ψ〉t)
= ∑t=1n (1 − 〈m,s〉t2/(〈m〉t〈s〉t))−1 × ((∂μt/∂θ)(∂μt/∂θ′)(〈m,s〉t2/(〈m〉t2〈s〉t)) + (∂σt2/∂θ)(∂σt2/∂θ′)(1/〈s〉t) − ((∂μt/∂θ)(∂σt2/∂θ′) + (∂σt2/∂θ)(∂μt/∂θ′))(〈m,s〉t/(〈m〉t〈s〉t))).
Then, the quadratic estimating function based on mt and ψt becomes
gQ*(θ) = ∑t=1n (1 − 〈m,s〉t2/(〈m〉t〈s〉t))−1 × ((−(∂μt/∂θ)(1/〈m〉t) + (∂σt2/∂θ)(〈m,s〉t/(〈m〉t〈s〉t)))mt + ((∂μt/∂θ)(〈m,s〉t/(〈m〉t〈s〉t)) − (∂σt2/∂θ)(1/〈s〉t))st)
and satisfies the sufficient condition for optimality
E[∂gQ(θ)/∂θ ∣ ℱt-1y] = Cov(gQ(θ), gQ*(θ) ∣ ℱt-1y) K,  ∀gQ(θ) ∈ 𝒢Q,
where K is a constant matrix. Hence, gQ*(θ) is optimal in the class 𝒢Q, and part (a) follows. Since mt and ψt are orthogonal, the information IgQ*(θ) = IgM*(θ) + IgΨ*(θ), and part (b) follows. Hence, for each component θi, i=1,…,p, neither gM*(θi) nor gS*(θi) is fully informative; that is, IgQ*(θi) ≥ IgM*(θi) and IgQ*(θi) ≥ IgS*(θi).
Corollary 2.2.
When the conditional skewness γ and kurtosis κ are constants, the optimal quadratic estimating function and associated information, based on the martingale differences mt=yt-μt and st=mt2-σt2, are given by
gQ*(θ) = (1 − γ2/(κ+2))−1 ∑t=1n (1/σt3)((−σt(∂μt/∂θ) + (γ/(κ+2))(∂σt2/∂θ))mt + (1/(κ+2))(γ(∂μt/∂θ) − (1/σt)(∂σt2/∂θ))st),

IgQ*(θ) = (1 − γ2/(κ+2))−1 (IgM*(θ) + IgS*(θ) − (γ/(κ+2)) ∑t=1n (1/σt3)((∂μt/∂θ)(∂σt2/∂θ′) + (∂σt2/∂θ)(∂μt/∂θ′))).
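As a quick numerical check of Corollary 2.2 in the scalar one-observation case, the sketch below (all moment and derivative values are made up for illustration) evaluates the corollary's information formula and confirms that the quadratic estimating function dominates each component.

```python
# Numeric check of Corollary 2.2 (scalar parameter, one observation, with
# illustrative values -- the derivatives and moments below are made up):
# the optimal quadratic estimating function is at least as informative as
# the component estimating functions based on m_t and s_t alone.
dmu, dsig2 = 0.7, 1.2      # d(mu_t)/d(theta), d(sigma_t^2)/d(theta)
sigma, gamma, kappa = 1.5, 0.5, 1.0

I_M = dmu**2 / sigma**2                     # information of g_M*
I_S = dsig2**2 / (sigma**4 * (kappa + 2))   # information of g_S*
cross = (gamma / (kappa + 2)) * 2 * dmu * dsig2 / sigma**3
I_Q = (1 - gamma**2 / (kappa + 2))**-1 * (I_M + I_S - cross)

print(I_Q >= I_M and I_Q >= I_S)  # True: the quadratic EF dominates
```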
3. Applications
3.1. Autoregressive Conditional Duration Models
There is growing interest in the analysis of intraday financial data such as transaction and quote data. Such data have increasingly been made available by many stock exchanges. Unlike closing prices, which are measured daily, monthly, or yearly, intraday or high-frequency data tend to be irregularly spaced. Furthermore, the durations between events are themselves random variables. The autoregressive conditional duration (ACD) process of Engle and Russell [10] was proposed to model such durations and to study the dynamic structure of the adjusted durations xi, with xi = ti − ti-1, where ti is the time of the ith transaction. The crucial assumption underlying the ACD model is that the time dependence is described by a function ψi, where ψi is the conditional expectation of the adjusted duration between the (i−1)th and the ith trades. The basic ACD model is defined as xi = ψiɛi, ψi = E[xi ∣ ℱti-1x],
where the ɛi are iid nonnegative random variables with density function f(·) and unit mean, and ℱti-1x is the information available at the (i−1)th trade. We also assume that ɛi is independent of ℱti-1x. It is clear that the types of ACD models vary according to the distribution of ɛi and the specification of ψi. In this paper, we discuss a specific class known as the ACD(p, q) model, given by xt = ψtɛt, ψt = ω + ∑j=1p ajxt-j + ∑j=1q bjψt-j,
where ω>0, aj>0, bj>0, and ∑j=1max(p,q)(aj+bj)<1. We assume that the ɛt are iid nonnegative random variables with mean μɛ, variance σɛ2, skewness γɛ, and excess kurtosis κɛ. In order to estimate the parameter vector θ=(ω,a1,…,ap,b1,…,bq)′, we use the estimating function approach. For this model, the conditional moments are μt=μɛψt, σt2=σɛ2ψt2, γt=γɛ, and κt=κɛ. Let mt=xt−μt and st=mt2−σt2 be the sequences of martingale differences such that 〈m〉t=σɛ2ψt2, 〈s〉t=σɛ4(κɛ+2)ψt4, and 〈m,s〉t=σɛ3γɛψt3. The optimal estimating function and associated information based on mt are given by gM*(θ) = −(μɛ/σɛ2)∑t=1n(1/ψt2)(∂ψt/∂θ)mt and IgM*(θ) = (μɛ2/σɛ2)∑t=1n(1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′). The optimal estimating function and associated information based on st are given by gS*(θ) = −(2/(σɛ2(κɛ+2)))∑t=1n(1/ψt3)(∂ψt/∂θ)st and IgS*(θ) = (4/(κɛ+2))∑t=1n(1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′). Then, by Corollary 2.2, the optimal quadratic estimating function and associated information are given by

gQ*(θ) = (1/(σɛ2(κɛ+2−γɛ2))) ∑t=1n (((−μɛ(κɛ+2)+2σɛγɛ)/ψt2)(∂ψt/∂θ)mt + ((μɛγɛ−2σɛ)/(σɛψt3))(∂ψt/∂θ)st),

IgQ*(θ) = (1−γɛ2/(κɛ+2))−1 (IgM*(θ) + IgS*(θ) − (4μɛγɛ/(σɛ(κɛ+2))) ∑t=1n (1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′)) = ((4σɛ2+μɛ2(κɛ+2)−4μɛσɛγɛ)/(σɛ2(κɛ+2−γɛ2))) ∑t=1n (1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′),
the information gain in using gQ*(θ) over gM*(θ) is

((2σɛ−μɛγɛ)2/(σɛ2(κɛ+2−γɛ2))) ∑t=1n (1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′),
and the information gain in using gQ*(θ) over gS*(θ) is

((μɛ(κɛ+2)−2σɛγɛ)2/(σɛ2(κɛ+2−γɛ2)(κɛ+2))) ∑t=1n (1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′),
which are both nonnegative definite.
When ɛt follows an exponential distribution, μɛ=1/λ, σɛ2=1/λ2, γɛ=2, and κɛ=3. Then, IgM*(θ)=∑t=1n(1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′), IgS*(θ)=(4/5)∑t=1n(1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′), and IgQ*(θ)=∑t=1n(1/ψt2)(∂ψt/∂θ)(∂ψt/∂θ′), and hence IgQ*(θ)=IgM*(θ)>IgS*(θ).
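The exponential case can be checked numerically. The sketch below (the ACD(1,1) parameter values ω=0.1, a=0.2, b=0.7 are illustrative) computes the per-observation information coefficients and confirms IgQ* = IgM* > IgS*, and also shows how ψt and the durations are generated recursively.

```python
import numpy as np

# Sketch of the exponential ACD(1,1) case; (w, a, b) are illustrative.
# With exponential errors, mu = sigma = 1/lambda, gamma = 2, kappa = 3,
# and the per-observation information coefficients (each multiplying
# (1/psi_t^2)(dpsi/dtheta)(dpsi/dtheta')) satisfy c_Q = c_M > c_S.
lam = 1.0
mu_e, sig_e, gam_e, kap_e = 1/lam, 1/lam, 2.0, 3.0

c_M = mu_e**2 / sig_e**2
c_S = 4 / (kap_e + 2)
c_Q = (4*sig_e**2 + mu_e**2*(kap_e + 2) - 4*mu_e*sig_e*gam_e) / \
      (sig_e**2 * (kap_e + 2 - gam_e**2))
print(np.isclose(c_Q, c_M), c_S)  # the quadratic and linear EFs coincide

# generate durations recursively: x_t = psi_t * eps_t
rng = np.random.default_rng(1)
w, a, b = 0.1, 0.2, 0.7
psi = x = w / (1 - a - b)        # start from the unconditional mean
durations = []
for _ in range(5):
    psi = w + a * x + b * psi
    x = psi * rng.exponential(1.0)
    durations.append(x)
print(all(d > 0 for d in durations))
```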
3.2. Random Coefficient Autoregressive Models
In this section, we investigate the properties of quadratic estimating functions for the random coefficient autoregressive (RCA) time series models first introduced by Nicholls and Quinn [11].
Consider the RCA model

yt = (θ + bt)yt-1 + ɛt,   (3.6)
where {bt} and {ɛt} are uncorrelated zero-mean processes with unknown variance σb2 and variance σɛ2=σɛ2(θ) with unknown parameter θ, respectively. Further, the skewness and excess kurtosis of {bt} are denoted by γb and κb, which are known, and those of {ɛt} by γɛ(θ) and κɛ(θ), respectively. In the model (3.6), both the parameter θ and β=σb2 need to be estimated. Letting θ=(θ,β)′, we discuss the joint estimation of θ and β. In this model, the conditional mean is μt=θyt-1, and the conditional variance is σt2=yt-12β+σɛ2(θ). The parameter θ appears simultaneously in the mean and variance. Let mt=yt−μt and st=mt2−σt2, so that 〈m〉t=yt-12σb2+σɛ2, 〈s〉t=yt-14σb4(κb+2)+σɛ4(κɛ+2)+4yt-12σb2σɛ2, and 〈m,s〉t=yt-13σb3γb+σɛ3γɛ. Then the conditional skewness is γt=〈m,s〉t/σt3, and the conditional excess kurtosis is κt=〈s〉t/σt4−2.
Since ∂μt/∂θ=(yt-1,0)′ and ∂σt2/∂θ=(∂σɛ2/∂θ,yt-12)′, by applying Theorem 2.1, the optimal quadratic estimating function for θ and β based on the martingale differences mt and st is given by gQ*(θ)=∑t=1n(at-1*mt+bt-1*st), where

at-1* = (1−〈m,s〉t2/(〈m〉t〈s〉t))−1 (−yt-1/〈m〉t + (∂σɛ2/∂θ)(〈m,s〉t/(〈m〉t〈s〉t)), yt-12〈m,s〉t/(〈m〉t〈s〉t))′,
bt-1* = (1−〈m,s〉t2/(〈m〉t〈s〉t))−1 (yt-1〈m,s〉t/(〈m〉t〈s〉t) − (∂σɛ2/∂θ)(1/〈s〉t), −yt-12/〈s〉t)′.
Hence, the component quadratic estimating function for θ is

gQ*(θ) = ∑t=1n (1−〈m,s〉t2/(〈m〉t〈s〉t))−1 × ((−yt-1/〈m〉t + (∂σɛ2/∂θ)(〈m,s〉t/(〈m〉t〈s〉t)))mt + (yt-1〈m,s〉t/(〈m〉t〈s〉t) − (∂σɛ2/∂θ)(1/〈s〉t))st),
and the component quadratic estimating function for β is

gQ*(β) = ∑t=1n (1−〈m,s〉t2/(〈m〉t〈s〉t))−1 (yt-12〈m,s〉tmt/(〈m〉t〈s〉t) − yt-12st/〈s〉t).
Moreover, the information matrix of the optimal quadratic estimating function for θ and β is given by

IgQ*(θ) = ( Iθθ  Iθβ
            Iβθ  Iββ ),
where

Iθθ = ∑t=1n (1−〈m,s〉t2/(〈m〉t〈s〉t))−1 (yt-12/〈m〉t + (∂σɛ2/∂θ)2(1/〈s〉t) − 2(∂σɛ2/∂θ)yt-1〈m,s〉t/(〈m〉t〈s〉t)),
Iθβ = Iβθ = ∑t=1n (1−〈m,s〉t2/(〈m〉t〈s〉t))−1 ((∂σɛ2/∂θ)(1/〈s〉t) − yt-1〈m,s〉t/(〈m〉t〈s〉t))yt-12,
Iββ = ∑t=1n (1−〈m,s〉t2/(〈m〉t〈s〉t))−1 yt-14/〈s〉t.
Considering the parameter θ alone, the conditional least squares (CLS) estimating function and associated information are gCLS(θ) = ∑t=1n yt-1mt and ICLS(θ) = (∑t=1n yt-12)2/∑t=1n yt-12〈m〉t. The optimal martingale estimating function and associated information based on mt are gM*(θ) = −∑t=1n yt-1mt/〈m〉t and IgM*(θ) = ∑t=1n yt-12/〈m〉t. Moreover, the Cauchy-Schwarz inequality

(∑t=1n yt-12〈m〉t)(∑t=1n yt-12/〈m〉t) ≥ (∑t=1n yt-12)2
implies that ICLS(θ) ≤ IgM*(θ). Hence, the optimal estimating function is more informative than the conditional least squares one. The optimal quadratic estimating functions based on the martingale differences mt and st are given by (3.8) and (3.11), respectively. By Theorem 2.1, the information of gQ*(θ) is at least that of gM*(θ). Therefore, for the RCA model, ICLS(θ) ≤ IgM*(θ) ≤ IgQ*(θ), and hence the estimate obtained by solving the optimal quadratic estimating equation is more efficient than the CLS estimate and the estimate obtained by solving the optimal linear estimating equation.
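The comparison ICLS(θ) ≤ IgM*(θ) can be illustrated by simulation. The sketch below (the RCA(1) parameter values are illustrative) computes both informations from a simulated path using 〈m〉t = yt-12σb2 + σɛ2.

```python
import numpy as np

# Simulation sketch for the RCA(1) model y_t = (theta + b_t) y_{t-1} + eps_t
# (illustrative parameter values): verify the Cauchy-Schwarz comparison
# I_CLS(theta) <= I_gM*(theta), with <m>_t = y_{t-1}^2 sigma_b^2 + sigma_eps^2.
rng = np.random.default_rng(2)
theta, sig_b2, sig_e2 = 0.5, 0.3, 1.0
n = 1000
y = np.zeros(n + 1)
for t in range(1, n + 1):
    b = rng.normal(0, np.sqrt(sig_b2))
    y[t] = (theta + b) * y[t-1] + rng.normal(0, np.sqrt(sig_e2))

y_lag = y[:-1]
m_var = y_lag**2 * sig_b2 + sig_e2                   # <m>_t
I_cls = (y_lag**2).sum()**2 / (y_lag**2 * m_var).sum()
I_M = (y_lag**2 / m_var).sum()
print(I_cls <= I_M)  # True by the Cauchy-Schwarz inequality
```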
3.3. Doubly Stochastic Time Series Model
The random coefficient autoregressive models discussed in the previous section are special cases of what Tjøstheim [12] refers to as doubly stochastic time series models. In the nonlinear case, these models are given by

yt = θtf(t,ℱt-1y) + ɛt,
where {θ+bt} of (3.6) is replaced by a more general stochastic sequence {θt}, and yt-1 is replaced by a function of the past, ℱt-1y. Suppose that {θt} is a moving average sequence of the form

θt = θ + at + at-1,
where {at} consists of square-integrable independent random variables with mean zero and variance σa2. We further assume that {ɛt} and {at} are independent; then E[yt∣ℱt-1y] depends on the posterior mean ut = E[at∣ℱt-1y] and variance vt = E[(at−ut)2∣ℱt-1y] of at. Under the normality assumption on {ɛt} and {at}, and the initial condition y0=0, ut and vt satisfy the following Kalman-like recursive algorithms (see [13, page 439]):

ut(θ) = σa2f(t,ℱt-1y)(yt − (θ+ut-1(θ))f(t,ℱt-1y)) / (σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1)),
vt(θ) = σa2 − σa4f2(t,ℱt-1y) / (σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1)),
where u0=0 and v0=σa2. Hence, the conditional mean and variance of yt are given by

μt(θ) = (θ + ut-1(θ))f(t,ℱt-1y),
σt2(θ) = σe2(θ) + f2(t,ℱt-1y)(σa2 + vt-1(θ)),
which can be computed recursively.
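The recursions for ut and vt can be implemented directly. The sketch below assumes the illustrative choice f(t,ℱt-1y) = yt-1 (an RCA-type specification) with constant σe2; it returns the conditional means and variances of the observations.

```python
import numpy as np

# Sketch of the Kalman-like recursions for the posterior mean u_t and
# variance v_t, with the illustrative choice f(t, F_{t-1}) = y_{t-1} and
# constant sigma_e^2; initial conditions u_0 = 0, v_0 = sigma_a^2.
def posterior_moments(y, theta, sig_a2, sig_e2):
    u, v = 0.0, sig_a2
    mu, var = [], []
    for t in range(1, len(y)):
        f = y[t-1]                      # f(t, F_{t-1}^y)
        mu.append((theta + u) * f)      # conditional mean of y_t
        D = sig_e2 + f**2 * (sig_a2 + v)
        var.append(D)                   # conditional variance of y_t
        u = sig_a2 * f * (y[t] - (theta + u) * f) / D
        v = sig_a2 - sig_a2**2 * f**2 / D
    return np.array(mu), np.array(var)

rng = np.random.default_rng(3)
y = rng.normal(0, 1, 50)
mu, var = posterior_moments(y, theta=0.4, sig_a2=0.2, sig_e2=1.0)
print(np.all(var >= 1.0))  # conditional variance is at least sigma_e^2
```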
Let mt = yt − μt and st = mt2 − σt2; then {mt} and {st} are sequences of martingale differences. We can derive that 〈m,s〉t = 0, 〈m〉t = σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1(θ)), and 〈s〉t = 2σe4(θ) + 4f2(t,ℱt-1y)σe2(θ)(σa2+vt-1(θ)) + 2f4(t,ℱt-1y)(σa2+vt-1(θ))2. The optimal estimating function and associated information based on mt are given by

gM*(θ) = −∑t=1n f(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ) mt/〈m〉t,
IgM*(θ) = ∑t=1n f2(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ)2/〈m〉t.
Then, the Cauchy-Schwarz inequality

(∑t=1n f2(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ)2〈m〉t)(∑t=1n f2(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ)2/〈m〉t) ≥ (∑t=1n f2(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ)2)2
implies that

ICLS(θ) = (∑t=1n f2(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ)2)2 / ∑t=1n f2(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ)2〈m〉t ≤ IgM*(θ),
that is, the optimal linear estimating function gM*(θ) is more informative than the conditional least squares estimating function gCLS(θ).
The optimal estimating function and associated information based on st are given by

gS*(θ) = −∑t=1n (∂σe2(θ)/∂θ + f2(t,ℱt-1y)∂vt-1(θ)/∂θ) st/〈s〉t,
IgS*(θ) = ∑t=1n (∂σe2(θ)/∂θ + f2(t,ℱt-1y)∂vt-1(θ)/∂θ)2 (1/〈s〉t).
Hence, by Theorem 2.1, the optimal quadratic estimating function is given by

gQ*(θ) = −∑t=1n (1/(σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1(θ)))) × (f(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ)mt + ((∂σe2(θ)/∂θ + f2(t,ℱt-1y)∂vt-1(θ)/∂θ)/(2(σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1(θ)))))st).
The associated information, IgQ*(θ) = IgM*(θ) + IgS*(θ), is given by

IgQ*(θ) = ∑t=1n (1/(σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1(θ)))) × (f2(t,ℱt-1y)(1 + ∂ut-1(θ)/∂θ)2 + (∂σe2(θ)/∂θ + f2(t,ℱt-1y)∂vt-1(θ)/∂θ)2/(2(σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1(θ))))).
It follows that the information of gQ* exceeds that of gM* and gS*, and hence the estimate obtained by solving the optimal quadratic estimating equation is more efficient than the CLS estimate and the estimate obtained by solving the optimal linear estimating equation. Moreover, the relations

∂ut(θ)/∂θ = −f2(t,ℱt-1y)σa2(1 + ∂ut-1(θ)/∂θ) / (σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1(θ))) − σa2f(t,ℱt-1y)(yt − f(t,ℱt-1y)(θ+ut-1(θ)))(∂σe2(θ)/∂θ + f2(t,ℱt-1y)∂vt-1(θ)/∂θ) / (σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1(θ)))2,
∂vt(θ)/∂θ = σa4f2(t,ℱt-1y)(∂σe2(θ)/∂θ + f2(t,ℱt-1y)∂vt-1(θ)/∂θ) / (σe2(θ) + f2(t,ℱt-1y)(σa2+vt-1(θ)))2
can be applied to calculate the estimating functions and associated information recursively.
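The recursion for ∂vt(θ)/∂θ can be checked against a finite-difference derivative. The sketch below takes the illustrative special case f(t,·)=1 and σe2(θ)=θ, so that ∂σe2/∂θ = 1.

```python
# Sketch checking the recursion for dv_t/dtheta against a finite
# difference, for the illustrative case f(t, .) = 1 and
# sigma_e^2(theta) = theta (so d sigma_e^2 / d theta = 1).
def v_seq(theta, sig_a2=0.5, n=20):
    v = sig_a2
    for _ in range(n):
        v = sig_a2 - sig_a2**2 / (theta + sig_a2 + v)
    return v

def dv_seq(theta, sig_a2=0.5, n=20):
    v, dv = sig_a2, 0.0
    for _ in range(n):
        D = theta + sig_a2 + v                # uses v_{t-1}
        dv = sig_a2**2 * (1.0 + dv) / D**2    # the recursion in the text
        v = sig_a2 - sig_a2**2 / D
    return dv

theta, h = 1.0, 1e-6
fd = (v_seq(theta + h) - v_seq(theta - h)) / (2 * h)
print(abs(dv_seq(theta) - fd) < 1e-6)  # recursion matches finite difference
```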
3.4. Regression Model with ARCH Errors
Consider a regression model with ARCH(s) errors ɛt of the form

yt = xtβ + ɛt,
such that E[ɛt∣ℱt-1y]=0 and Var(ɛt∣ℱt-1y) = ht = α0 + α1ɛt-12 + ⋯ + αsɛt-s2. In this model, the conditional mean is μt=xtβ, the conditional variance is σt2=ht, and the conditional skewness and excess kurtosis are assumed to be constants γ and κ, respectively. It follows from Theorem 2.1 that the optimal component quadratic estimating functions for the parameter vector θ=(β1,…,βr,α0,…,αs)′=(β′,α′)′ are

gQ*(β) = (1/(κ+2))(1 − γ2/(κ+2))−1 ∑t=1n (1/ht2)((−ht(κ+2)xt′ + 2ht1/2γ∑j=1sαjxt′ɛt-j)mt + (ht1/2γxt′ − 2∑j=1sαjxt′ɛt-j)st),
gQ*(α) = (1/(κ+2))(1 − γ2/(κ+2))−1 ∑t=1n (1/ht2)(ht1/2γ(1,ɛt-12,…,ɛt-s2)′mt − (1,ɛt-12,…,ɛt-s2)′st).
Moreover, the information matrix for θ=(β′,α′)′ is given by

I = (1 − γ2/(κ+2))−1 ( Iββ  Iβα
                       Iαβ  Iαα ),
where

Iββ = ∑t=1n (xt′xt/ht + 4(∑j=1sαjxt′ɛt-j)(∑j=1sαjxtɛt-j)/(ht2(κ+2)) − 2γ(xt′(∑j=1sαjxtɛt-j) + (∑j=1sαjxt′ɛt-j)xt)/(ht3/2(κ+2))),
Iβα = −∑t=1n (ht1/2γxt′ − 2∑j=1sαjxt′ɛt-j)(1,ɛt-12,…,ɛt-s2)/(ht2(κ+2)),
Iαβ = Iβα′ = −∑t=1n (1,ɛt-12,…,ɛt-s2)′(ht1/2γxt − 2∑j=1sαjxtɛt-j)/(ht2(κ+2)),
Iαα = ∑t=1n (1,ɛt-12,…,ɛt-s2)′(1,ɛt-12,…,ɛt-s2)/(ht2(κ+2)).
It is of interest to note that when the {ɛt} are conditionally Gaussian, so that γ=0 and κ=0, and

E[(∑j=1sαjxt′ɛt-j)(1,ɛt-12,…,ɛt-s2)/(ht2(κ+2))] = 0,

the optimal quadratic estimating functions for β and α based on the martingale differences mt = yt − xtβ and st = mt2 − ht are, respectively, given by

gQ*(β) = −∑t=1n (1/ht2)(htxt′mt + (∑j=1sαjxt′ɛt-j)st),
gQ*(α) = −∑t=1n (1/ht2)(1,ɛt-12,…,ɛt-s2)′st.
Moreover, the information matrix for θ=(β′,α′)′ in (3.28) has Iβα = Iαβ = 0,

Iββ = ∑t=1n (htxt′xt + 2(∑j=1sαjxt′ɛt-j)(∑j=1sαjxtɛt-j))/ht2,
Iαα = ∑t=1n (1,ɛt-12,…,ɛt-s2)′(1,ɛt-12,…,ɛt-s2)/(2ht2).
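In the Gaussian case, E[st∣ℱt-1y]=0, so the α0-component of gQ*(α), evaluated at the true parameters, should average to roughly zero over a long simulated path. The sketch below checks this (the ARCH(1) values α0=0.2, α1=0.4 are illustrative, and the regression part is omitted by working with the errors directly).

```python
import numpy as np

# Sketch of the Gaussian ARCH(1) case (illustrative values; regression
# part omitted).  Build h_t = alpha0 + alpha1 * eps_{t-1}^2 recursively;
# since E[s_t | F_{t-1}] = 0 with s_t = eps_t^2 - h_t, the alpha0-component
# of g_Q*(alpha) at the true parameters averages to roughly zero.
rng = np.random.default_rng(4)
alpha0, alpha1 = 0.2, 0.4
n = 20000
eps = np.zeros(n)
h = np.zeros(n)
eps_prev = 0.0
for t in range(n):
    h[t] = alpha0 + alpha1 * eps_prev**2
    eps[t] = np.sqrt(h[t]) * rng.normal()
    eps_prev = eps[t]

s = eps**2 - h                    # s_t = m_t^2 - h_t
g_alpha0 = -(s / h**2).mean()     # alpha0-component of g_Q*(alpha), scaled by 1/n
print(abs(g_alpha0) < 0.5)        # near zero at the true parameter
```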
4. Conclusions
In this paper, we use appropriate martingale differences and derive the general form of the optimal quadratic estimating function for the multiparameter case with dependent observations. We also show that the optimal quadratic estimating function is more informative than the estimating function used in Thavaneswaran and Abraham [2]. Following Lindsay [8], we conclude that the resulting estimates are more efficient in general. Examples based on ACD models, RCA models, doubly stochastic models, and the regression model with ARCH errors are also discussed in some detail. For RCA models and doubly stochastic models, we have shown the superiority of the approach over the CLS method.
References
[1] Godambe, V. P., "The foundations of finite sample estimation in stochastic processes," Biometrika, vol. 72, no. 2, pp. 419-428, 1985. doi:10.1093/biomet/72.2.419
[2] Thavaneswaran, A. and Abraham, B., "Estimation for nonlinear time series models using estimating equations," Journal of Time Series Analysis, vol. 9, no. 1, pp. 99-108, 1988. doi:10.1111/j.1467-9892.1988.tb00457.x
[3] Naik-Nimbalkar, U. V. and Rajarshi, M. B., "Filtering and smoothing via estimating functions," Journal of the American Statistical Association, vol. 90, no. 429, pp. 301-306, 1995. doi:10.2307/2291154
[4] Thavaneswaran, A. and Heyde, C. C., "Prediction via estimating functions," Journal of Statistical Planning and Inference, vol. 77, no. 1, pp. 89-101, 1999. doi:10.1016/S0378-3758(98)00179-7
[5] Chandra, S. A. and Taniguchi, M., "Estimating functions for nonlinear time series models," Annals of the Institute of Statistical Mathematics, vol. 53, no. 1, pp. 125-141, 2001. doi:10.1023/A:1017924722711
[6] Merkouris, T., "Transform martingale estimating functions," The Annals of Statistics, vol. 35, no. 5, pp. 1975-2000, 2007. doi:10.1214/009053607000000299
[7] Ghahramani, M. and Thavaneswaran, A., "Combining estimating functions for volatility," Journal of Statistical Planning and Inference, vol. 139, no. 4, pp. 1449-1461, 2009. doi:10.1016/j.jspi.2008.07.014
[8] Lindsay, B. G., "Using empirical partially Bayes inference for increased efficiency," The Annals of Statistics, vol. 13, no. 3, pp. 914-931, 1985. doi:10.1214/aos/1176349646
[9] Crowder, M., "On linear and quadratic estimating functions," Biometrika, vol. 74, no. 3, pp. 591-597, 1987. doi:10.1093/biomet/74.3.591
[10] Engle, R. F. and Russell, J. R., "Autoregressive conditional duration: a new model for irregularly spaced transaction data," Econometrica, vol. 66, no. 5, pp. 1127-1162, 1998. doi:10.2307/2999632
[11] Nicholls, D. F. and Quinn, B. G., "The estimation of random coefficient autoregressive models. I," Journal of Time Series Analysis, vol. 1, no. 1, pp. 37-46, 1980. doi:10.1111/j.1467-9892.1980.tb00299.x
[12] Tjøstheim, D., "Some doubly stochastic time series models," Journal of Time Series Analysis, vol. 7, no. 1, pp. 51-72, 1986. doi:10.1111/j.1467-9892.1986.tb00485.x
[13] Shiryayev, A. N., Probability, vol. 95 of Graduate Texts in Mathematics, Springer, New York, NY, USA, 1984.