An application of the empirical likelihood method to non-Gaussian locally stationary processes is presented. Based on the central limit theorem for locally stationary processes, we derive the asymptotic distributions of the maximum empirical likelihood estimator and of the empirical likelihood ratio statistic. It is shown that the empirical likelihood method enables us to make inferences on various important indices in time series analysis. Furthermore, we give a numerical study and investigate finite-sample properties.
1. Introduction
Empirical likelihood is a nonparametric method for statistical inference proposed by Owen [1, 2]. It is used to construct confidence regions for a mean, for a class of M-estimates that includes quantiles, and for differentiable statistical functionals. The empirical likelihood method has been applied to various problems because it combines the generality of nonparametric methods with the effectiveness of the likelihood method. Examples include applications to general estimating equations [3], regression models [4–6], biased sampling models [7], and so forth. Applications have also been extended to dependent observations. Kitamura [8] developed the blockwise empirical likelihood for estimating equations and for smooth functions of means. Monti [9] applied the empirical likelihood method to linear processes, essentially under a circular Gaussian assumption, using a spectral method. For short- and long-range dependence, Nordman and Lahiri [10] gave the asymptotic properties of the frequency domain empirical likelihood. As listed above, some applications to time series analysis can be found, but they have mainly been for stationary processes. Although stationarity is the most fundamental assumption in time series analysis, it is also known that real time series data are generally nonstationary (e.g., in economic analysis). Therefore, we need nonstationary models in order to describe the real world. Recently, Dahlhaus [11–13] proposed an important class of nonstationary processes, called locally stationary processes. They have so-called time-varying spectral densities whose spectral structure changes smoothly in time.
In this paper we extend the empirical likelihood method to non-Gaussian locally stationary processes with time-varying spectra. First, we derive the asymptotic normality of the maximum empirical likelihood estimator based on the central limit theorem for locally stationary processes, which is stated in Dahlhaus [13, Theorem A.2]. Next, we show that the empirical likelihood ratio converges in distribution to a sum of independent Gamma random variables. In particular, in the stationary case, that is, when the time-varying spectral density does not depend on the time parameter, the asymptotic distribution becomes a chi-square distribution.
As an application of this method, we can estimate an extended autocorrelation for locally stationary processes. Moreover, we can treat the Whittle estimation described in Dahlhaus [13].
This paper is organized as follows. Section 2 briefly reviews stationary processes and introduces locally stationary processes. In Section 3, we propose the empirical likelihood method for non-Gaussian locally stationary processes and give its asymptotic properties. In Section 4 we give numerical studies on confidence intervals for the autocorrelation of locally stationary processes. Proofs of the theorems are given in Section 5.
2. Locally Stationary Processes
The stationary process is a fundamental setting in time series analysis. If the process $\{X_t\}_{t\in\mathbb{Z}}$ is stationary with mean zero, it is known to have the spectral representation
$$X_t = \int_{-\pi}^{\pi} \exp(i\lambda t)\, A(\lambda)\, d\xi(\lambda), \tag{2.1}$$
where $A(\lambda)$ is a $2\pi$-periodic complex-valued function with $A(-\lambda) = \overline{A(\lambda)}$, called the transfer function, and $\xi(\lambda)$ is a stochastic process on $[-\pi,\pi]$ with $\xi(-\lambda) = \overline{\xi(\lambda)}$ and
$$E[d\xi(\lambda)] = 0, \qquad \operatorname{Cov}\bigl(d\xi(\lambda_1), d\xi(\lambda_2)\bigr) = \eta(\lambda_1 - \lambda_2),$$
where $\eta(\lambda) = \sum_{j=-\infty}^{\infty} \delta(\lambda + 2\pi j)$ is the $2\pi$-periodic extension of the Dirac delta function. If the process is stationary, the covariance between $X_t$ and $X_{t+k}$ does not depend on the time $t$ and is a function of the lag $k$ only; we denote it by $\gamma(k) = \operatorname{Cov}(X_t, X_{t+k})$. The Fourier transform of the autocovariance function,
$$g(\lambda) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma(k) \exp(-ik\lambda),$$
is called the spectral density function. In the representation (2.1), the spectral density function is written as $g(\lambda) = |A(\lambda)|^2$. It is estimated by the periodogram, defined by $I_T(\lambda) = (2\pi)^{-1} \left|\sum_{t=1}^{T} X_t \exp(-i\lambda t)\right|^2$. If one wants to change the weight given to each observation, a function $h(x)$ defined on $[0,1]$ can be inserted into the periodogram: $I_T(\lambda) = \left(2\pi \sum_{t=1}^{T} h(t/T)^2\right)^{-1} \left|\sum_{t=1}^{T} h(t/T)\, X_t \exp(-i\lambda t)\right|^2$. The function $h(x)$ is called a data taper; a short computational sketch is given below. We now give a simple example of a stationary process.
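For concreteness, the tapered periodogram can be evaluated at the Fourier frequencies with an FFT. The following is a minimal Python sketch (the function name and the NumPy dependency are our own choices, not part of the paper):

```python
import numpy as np

def tapered_periodogram(x, taper=None):
    """Tapered periodogram I_T(lambda) at the Fourier frequencies
    lambda_k = 2*pi*k/T; taper=None means h == 1, i.e., the ordinary
    periodogram (2*pi)^{-1} |sum_t X_t exp(-i*lambda*t)|^2."""
    T = len(x)
    h = np.ones(T) if taper is None else taper(np.arange(1, T + 1) / T)
    d = np.fft.fft(h * x)                 # sum_t h(t/T) X_t e^{-i lambda_k t}
    lam = 2 * np.pi * np.arange(T) / T
    return lam, np.abs(d) ** 2 / (2 * np.pi * np.sum(h ** 2))
```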
Example 2.1.
Consider the following AR(p) process:
$$\sum_{j=0}^{p} a_j X_{t-j} = \varepsilon_t,$$
where εt are independent random variables with mean zero and variance 1. In the form of (2.1), this is obtained by letting
$$A(\lambda) = \frac{1}{\sqrt{2\pi}} \left( \sum_{j=0}^{p} a_j \exp(-i\lambda j) \right)^{-1}.$$
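As an illustration, the spectral density $g(\lambda) = |A(\lambda)|^2$ of this AR($p$) process can be evaluated directly from the coefficients. A minimal Python sketch under the same unit-innovation-variance setting (the helper name is ours):

```python
import numpy as np

def ar_spectral_density(a, lam):
    """g(lambda) = |A(lambda)|^2 for sum_{j=0}^p a_j X_{t-j} = eps_t with
    Var(eps_t) = 1, where A(lambda) = (2*pi)^{-1/2} / sum_j a_j e^{-i*lambda*j}."""
    a = np.asarray(a, dtype=float)
    j = np.arange(len(a))
    poly = np.exp(-1j * np.outer(lam, j)) @ a   # sum_j a_j e^{-i lambda j}
    return 1.0 / (2 * np.pi * np.abs(poly) ** 2)

# Example: AR(1) written as X_t - 0.5 X_{t-1} = eps_t, i.e., a = (1, -0.5)
lam = np.linspace(-np.pi, np.pi, 513)
g = ar_spectral_density([1.0, -0.5], lam)
```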
As an extension of stationary processes, Dahlhaus [13] introduced the concept of local stationarity. An example of a locally stationary process is the following time-varying AR($p$) process:
$$\sum_{j=0}^{p} a_j\!\left(\frac{t}{T}\right) X_{t-j,T} = \varepsilon_t, \tag{2.6}$$
where each $a_j(u)$ is a function defined on $[0,1]$ and the $\varepsilon_t$ are independent random variables with mean zero and variance 1. If all the $a_j(u)$ are constant, the process (2.6) reduces to a stationary one. To define a general class of locally stationary processes, it is natural to consider the time-varying spectral representation
$$X_{t,T} = \int_{-\pi}^{\pi} \exp(i\lambda t)\, A\!\left(\frac{t}{T}, \lambda\right) d\xi(\lambda). \tag{2.7}$$
However, it turns out that (2.6) does not have an exact solution of the form (2.7), but only an approximate one. Therefore, we only require that (2.7) holds approximately. The following is the definition of locally stationary processes given by Dahlhaus [13].
Definition 2.2.
A sequence of stochastic processes $X_{t,T}$ $(t = 1, \ldots, T)$ is called locally stationary with mean zero and transfer function $A^{\circ}_{t,T}$ if there exists a representation
$$X_{t,T} = \int_{-\pi}^{\pi} \exp(i\lambda t)\, A^{\circ}_{t,T}(\lambda)\, d\xi(\lambda), \tag{2.8}$$
where the following two conditions hold.
(i) $\xi(\lambda)$ is a stochastic process on $[-\pi,\pi]$ with $\xi(-\lambda) = \overline{\xi(\lambda)}$ and
$$\operatorname{cum}\{d\xi(\lambda_1), \ldots, d\xi(\lambda_k)\} = \eta\!\left(\sum_{j=1}^{k}\lambda_j\right) q_k(\lambda_1, \ldots, \lambda_{k-1})\, d\lambda_1 \cdots d\lambda_k,$$
where $\operatorname{cum}\{\cdots\}$ denotes the $k$th-order cumulant, $q_1 = 0$, $q_2(\lambda) = 1$, $|q_k(\lambda_1,\ldots,\lambda_{k-1})| \le \mathrm{const}_k$ for all $k$, and $\eta(\lambda) = \sum_{j=-\infty}^{\infty} \delta(\lambda + 2\pi j)$ is the $2\pi$-periodic extension of the Dirac delta function.
(ii) There exist a constant $K$ and a $2\pi$-periodic function $A : [0,1] \times \mathbb{R} \to \mathbb{C}$ with $A(u,-\lambda) = \overline{A(u,\lambda)}$ which satisfies
$$\sup_{t,\lambda} \left| A^{\circ}_{t,T}(\lambda) - A\!\left(\frac{t}{T}, \lambda\right) \right| \le K T^{-1}$$
for all $T$; $A(u,\lambda)$ is assumed to be continuous in $u$.
The time-varying spectral density is defined by g(u,λ):=|A(u,λ)|2. As an estimator of g(u,λ), we define the local periodogram IN(u,λ) (for even N) as follows:
$$d_N(u,\lambda) = \sum_{s=1}^{N} h\!\left(\frac{s}{N}\right) X_{[uT]-N/2+s,\,T}\, \exp(-i\lambda s), \qquad H_{k,N} = \sum_{s=1}^{N} h\!\left(\frac{s}{N}\right)^{k}, \qquad I_N(u,\lambda) = \frac{1}{2\pi H_{2,N}}\, |d_N(u,\lambda)|^2.$$
Here, $h : \mathbb{R} \to \mathbb{R}$ is a data taper with $h(x) = 0$ for $x \notin [0,1]$. Thus, $I_N(u,\lambda)$ is nothing but the periodogram over a segment of length $N$ with midpoint $[uT]$. The shift from segment to segment is denoted by $S$; that is, we calculate $I_N$ at the midpoints $t_j = S(j-1) + N/2$ $(j = 1, \ldots, M)$, where $T = S(M-1) + N$, or, in rescaled time, at the time points $u_j := t_j/T$. Hereafter we set $S = 1$ rather than $S = N$, which means that the segments overlap each other. A short computational sketch follows.
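A minimal Python sketch of the local periodogram (our own helper; the FFT restricts $\lambda$ to the grid $2\pi k/N$, and the FFT phase convention differs from $d_N$ by a factor of modulus one, which does not affect $|d_N|^2$):

```python
import numpy as np

def local_periodogram(x, N, u, taper):
    """I_N(u, lambda) over the length-N segment with midpoint [uT];
    requires N/2 <= [uT] <= T - N/2 (x holds X_{1,T}, ..., X_{T,T})."""
    T = len(x)
    h = taper(np.arange(1, N + 1) / N)       # h(s/N), s = 1, ..., N
    mid = int(round(u * T))                  # [uT]
    seg = x[mid - N // 2 : mid + N // 2]     # X_{[uT]-N/2+s,T}, s = 1, ..., N
    d = np.fft.fft(h * seg)                  # d_N(u, lambda_k) up to a phase
    lam = 2 * np.pi * np.arange(N) / N
    return lam, np.abs(d) ** 2 / (2 * np.pi * np.sum(h ** 2))
```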
3. Empirical Likelihood Approach for Non-Gaussian Locally Stationary Processes
Consider inference on a parameter $\theta \in \Theta \subset \mathbb{R}^q$ based on an observed stretch $X_{1,T}, \ldots, X_{T,T}$. We suppose that information about $\theta$ is available through a system of general estimating equations. For short- or long-memory processes, Nordman and Lahiri [10] supposed that $\theta_0$, the true value of $\theta$, is specified by the following spectral moment condition:
$$\int_{-\pi}^{\pi} \phi(\lambda, \theta_0)\, g(\lambda)\, d\lambda = 0,$$
where $\phi(\lambda,\theta)$ is an appropriate function depending on $\theta$. Following this approach, we naturally suppose that $\theta_0$ satisfies the following time-varying spectral moment condition in the locally stationary setting:
$$\int_{0}^{1} \int_{-\pi}^{\pi} \phi(u, \lambda, \theta_0)\, g(u,\lambda)\, d\lambda\, du = 0. \tag{3.2}$$
Here $\phi : [0,1]\times[-\pi,\pi]\times\mathbb{R}^q \to \mathbb{C}^q$ is a function depending on $\theta$ and satisfying Assumption 3.4(i). We give brief examples of $\phi$ and the corresponding $\theta_0$.
Example 3.1 (autocorrelation).
Let us set
$$\phi(u,\lambda,\theta) = \theta - \exp(i\lambda k).$$
Then (3.2) leads to
$$\theta_0 = \frac{\int_0^1 \int_{-\pi}^{\pi} \exp(i\lambda k)\, g(u,\lambda)\, d\lambda\, du}{\int_0^1 \int_{-\pi}^{\pi} g(u,\lambda)\, d\lambda\, du}. \tag{3.4}$$
In the stationary case, that is, when $g(u,\lambda)$ does not depend on the time parameter $u$, (3.4) becomes
$$\theta_0 = \frac{\int_{-\pi}^{\pi} \exp(i\lambda k)\, g(\lambda)\, d\lambda}{\int_{-\pi}^{\pi} g(\lambda)\, d\lambda} = \frac{\gamma(k)}{\gamma(0)} = \rho(k),$$
which is the autocorrelation at lag $k$. Hence, (3.4) can be interpreted as a kind of lag-$k$ autocorrelation for locally stationary processes; a plug-in estimate is sketched below.
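The empirical counterpart of (3.4) replaces $g(u,\lambda)$ by the local periodogram and the integrals by sums over the segments and the Fourier frequencies. A minimal sketch, reusing `local_periodogram` from Section 2 (the Riemann weight $2\pi/N$ cancels between numerator and denominator):

```python
import numpy as np

def lag_k_autocorrelation(x, N, k, taper):
    """Plug-in estimate of the extended lag-k autocorrelation (3.4),
    with shift S = 1, i.e., M = T - N + 1 overlapping segments."""
    T = len(x)
    M = T - N + 1
    num = den = 0.0
    for j in range(M):
        u = (j + N // 2) / T                 # midpoint t_j = j + N/2, rescaled
        lam, I = local_periodogram(x, N, u, taper)
        num += np.sum(np.cos(k * lam) * I)   # Re of int e^{i k lambda} I_N dlambda
        den += np.sum(I)
    return num / den
```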
Example 3.2 (Whittle estimation).
Consider the problem of fitting a parametric spectral model to the true spectral density by minimizing a disparity between them. For stationary processes, this problem was considered by Hosoya and Taniguchi [14] and Kakizawa [15]. For locally stationary processes, the disparity between the parametric model $g_\theta(u,\lambda)$ and the true spectral density $g(u,\lambda)$ is measured by
$$\mathcal{L}(\theta) = \frac{1}{4\pi} \int_0^1 \int_{-\pi}^{\pi} \left\{ \log g_\theta(u,\lambda) + \frac{g(u,\lambda)}{g_\theta(u,\lambda)} \right\} d\lambda\, du,$$
and we seek the minimizer
$$\theta_0 = \underset{\theta\in\Theta}{\arg\min}\ \mathcal{L}(\theta). \tag{3.7}$$
Under appropriate conditions, $\theta_0$ in (3.7) is obtained by solving the equation $\partial\mathcal{L}(\theta)/\partial\theta = 0$. Suppose that the fitted model has the form $g_\theta(u,\lambda) = \sigma^2(u) f_\theta(u,\lambda)$, which means that $\theta$ is free of the innovation part $\sigma^2(u)$. Then, by Kolmogorov's formula (Dahlhaus [11, Theorem 3.2]), we can see that $\int_{-\pi}^{\pi} \log g_\theta(u,\lambda)\, d\lambda$ is independent of $\theta$. So the first-order condition on $\theta_0$ becomes
$$\int_0^1 \int_{-\pi}^{\pi} \left. \frac{\partial}{\partial\theta}\, g_\theta(u,\lambda)^{-1} \right|_{\theta=\theta_0} g(u,\lambda)\, d\lambda\, du = 0.$$
This is the case when we set
$$\phi(u,\lambda,\theta) = \frac{\partial}{\partial\theta}\, g_\theta(u,\lambda)^{-1}.$$
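As a concrete illustration (our own worked example, not taken from the paper), take $\sigma^2(u) \equiv 1$ and fit a time-varying AR(1) model. Then
$$f_\theta(u,\lambda) = \frac{1}{2\pi} \left| 1 - \theta e^{-i\lambda} \right|^{-2}, \qquad \phi(u,\lambda,\theta) = \frac{\partial}{\partial\theta}\, f_\theta(u,\lambda)^{-1} = 2\pi\, \frac{\partial}{\partial\theta} \left( 1 + \theta^2 - 2\theta\cos\lambda \right) = 4\pi\, (\theta - \cos\lambda),$$
and the moment condition (3.2) reduces to $\int_0^1\int_{-\pi}^{\pi} (\theta_0 - \cos\lambda)\, g(u,\lambda)\, d\lambda\, du = 0$, that is,
$$\theta_0 = \frac{\int_0^1\int_{-\pi}^{\pi} \cos\lambda\; g(u,\lambda)\, d\lambda\, du}{\int_0^1\int_{-\pi}^{\pi} g(u,\lambda)\, d\lambda\, du},$$
which coincides with (3.4) for $k = 1$, since $g(u,\cdot)$ is even.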
Now, we set
$$m_j(\theta) = \int_{-\pi}^{\pi} \phi(u_j, \lambda, \theta)\, I_N(u_j, \lambda)\, d\lambda \qquad (j = 1, \ldots, M) \tag{3.10}$$
as an estimating function and use the empirical likelihood ratio function $\mathcal{R}(\theta)$ defined by
$$\mathcal{R}(\theta) = \max_{w} \left\{ \prod_{j=1}^{M} M w_j \;\middle|\; \sum_{j=1}^{M} w_j m_j(\theta) = 0,\ w_j \ge 0,\ \sum_{j=1}^{M} w_j = 1 \right\}. \tag{3.11}$$
Denote by $\tilde\theta$ the maximum empirical likelihood estimator, that is, the maximizer of the empirical likelihood ratio function $\mathcal{R}(\theta)$; a computational sketch of the inner maximization in (3.11) is given below.
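The inner maximization in (3.11) is a convex problem that is usually solved through its Lagrange dual, exactly as in the proof of Theorem 3.6 in Section 5.2: $w_j = 1/\{M(1+\alpha' m_j(\theta))\}$ with $\alpha$ solving (5.30). A minimal Newton-iteration sketch in Python for real-valued estimating functions (our own helper, without the step-halving safeguards a careful implementation would add; it assumes the zero vector lies inside the convex hull of the $m_j(\theta)$):

```python
import numpy as np

def log_el_ratio(m):
    """log R(theta) for an M x q array m of estimating-function values
    m_j(theta), using w_j = 1 / (M (1 + alpha' m_j)) with alpha found by
    Newton's method on (5.30); assumes 1 + alpha' m_j stays positive."""
    M, q = m.shape
    alpha = np.zeros(q)
    for _ in range(100):
        r = 1.0 + m @ alpha                     # 1 + alpha' m_j
        grad = (m / r[:, None]).sum(axis=0)     # left side of (5.30), times M
        jac = -(m / r[:, None] ** 2).T @ m      # Jacobian of grad w.r.t. alpha
        step = np.linalg.solve(jac, -grad)
        alpha += step
        if np.linalg.norm(step) < 1e-10:
            break
    return -np.sum(np.log1p(m @ alpha))         # log R = -sum_j log(1 + alpha' m_j)
```

The test statistic of Theorem 3.7 is then `-log_el_ratio(m) / np.pi` evaluated at the hypothesized $\theta_0$.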
Remark 3.3.
We can also use the following alternative estimating function:
$$m_j^{(T)}(\theta) = \frac{2\pi}{T} \sum_{t=1}^{T} \phi(u_j, \Delta_t, \theta)\, I_N(u_j, \Delta_t) \qquad \left( \Delta_t = \frac{2\pi t}{T} \right)$$
instead of mj(θ) in (3.10). The asymptotic equivalence of mj(θ) and mj(T)(θ) can be proven if
$$E\left| m_j^{(T)}(\theta) - m_j(\theta) \right| = o(1)$$
is satisfied for any j, and this is shown by straightforward calculation.
To show the asymptotic properties of $\tilde\theta$ and $\mathcal{R}(\theta_0)$, we impose the following assumption.
Assumption 3.4.
(i) The functions $A(u,\lambda)$ and $\phi(u,\lambda,\theta)$ are $2\pi$-periodic in $\lambda$, and the periodic extensions are differentiable in $u$ and $\lambda$ with uniformly bounded derivative $(\partial/\partial u)(\partial/\partial\lambda) A$ (resp. $\phi$).
(ii) The parameters $N$ and $T$ fulfill the relations $T^{1/4} \ll N \ll T^{1/2}/\log T$.
(iii) The data taper $h : \mathbb{R} \to \mathbb{R}$ with $h(x) = 0$ for all $x \notin (0,1)$ is continuous on $\mathbb{R}$ and twice differentiable at all $x \notin P$, where $P$ is a finite set, and $\sup_{x\notin P} |h''(x)| < \infty$.
(iv) For $k = 1, \ldots, 8$,
$$q_k(\lambda_1, \ldots, \lambda_{k-1}) = c_k \quad \text{(a constant)}.$$
Remark 3.5.
Assumption 3.4(ii) may seem restrictive. However, it is required in order to use the central limit theorem for locally stationary processes (cf. Assumption A.1 and Theorem A.2 of Dahlhaus [13]); most of the restrictions on $N$ result from the $\sqrt{T}$-unbiasedness in the central limit theorem. See also the A.3 Remarks of Dahlhaus [13] for details.
Now we give the following theorem.
Theorem 3.6.
Suppose that Assumption 3.4 holds and that $X_{1,T}, \ldots, X_{T,T}$ is a realization of the locally stationary process which has the representation (2.8). Then,
$$\sqrt{M}\,(\tilde\theta - \theta_0) \xrightarrow{d} N(0, \Sigma)$$
as T→∞, where
$$\Sigma = 4\pi \left( \Sigma_3' \Sigma_2^{-1} \Sigma_3 \right)^{-1} \Sigma_3' \Sigma_2^{-1} \Sigma_1 \Sigma_2^{-1} \Sigma_3 \left( \Sigma_3' \Sigma_2^{-1} \Sigma_3 \right)^{-1}.$$
Here $\Sigma_1$ and $\Sigma_2$ are the $q \times q$ matrices whose $(i,j)$ elements are
$$(\Sigma_1)_{ij} = \frac{1}{2\pi} \int_0^1 \left[ \int_{-\pi}^{\pi} \phi_i(u,\lambda,\theta_0) \left\{ \phi_j(u,\lambda,\theta_0) + \phi_j(u,-\lambda,\theta_0) \right\} g(u,\lambda)^2\, d\lambda + c_4 \int_{-\pi}^{\pi} \phi_i(u,\lambda,\theta_0)\, g(u,\lambda)\, d\lambda \int_{-\pi}^{\pi} \phi_j(u,\mu,\theta_0)\, g(u,\mu)\, d\mu \right] du,$$
$$(\Sigma_2)_{ij} = \frac{1}{2\pi} \int_0^1 \left[ \int_{-\pi}^{\pi} \phi_i(u,\lambda,\theta_0) \left\{ \phi_j(u,\lambda,\theta_0) + \phi_j(u,-\lambda,\theta_0) \right\} g(u,\lambda)^2\, d\lambda + \int_{-\pi}^{\pi} \phi_i(u,\lambda,\theta_0)\, g(u,\lambda)\, d\lambda \int_{-\pi}^{\pi} \phi_j(u,\mu,\theta_0)\, g(u,\mu)\, d\mu \right] du,$$
respectively, and $\Sigma_3$ is the $q \times q$ matrix defined as
$$\Sigma_3 = \int_0^1 \int_{-\pi}^{\pi} \frac{\partial \phi(u,\lambda,\theta)}{\partial \theta'}\, g(u,\lambda)\, d\lambda\, du.$$
In addition, we give the following theorem on the asymptotic property of the empirical likelihood ratio ℛ(θ0).
Theorem 3.7.
Suppose that Assumption 3.4 holds and that $X_{1,T}, \ldots, X_{T,T}$ is a realization of a locally stationary process which has the representation (2.8). Then,
$$-\frac{1}{\pi} \log \mathcal{R}(\theta_0) \xrightarrow{d} (F\mathcal{N})'(F\mathcal{N})$$
as $T\to\infty$, where $\mathcal{N}$ is a $q$-dimensional normal random vector with zero mean vector and covariance matrix $I_q$ (the identity matrix), and $F = \Sigma_2^{-1/2}\Sigma_1^{1/2}$. Here $\Sigma_1$ and $\Sigma_2$ are the same matrices as in Theorem 3.6.
Remark 3.8.
Denote the eigenvalues of $F'F$ by $a_1, \ldots, a_q$; then we can write
$$(F\mathcal{N})'(F\mathcal{N}) = \sum_{i=1}^{q} Z_i,$$
where the $Z_i$ are independent and $Z_i$ is distributed as $\mathrm{Gamma}(1/2,\, 1/(2a_i))$.
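The $\alpha$-percentile $z_\alpha$ of this limit law, which is needed for the confidence intervals in Section 4, has no closed form in general, but since $\mathrm{Gamma}(1/2, 1/(2a_i))$ is the law of $a_i\chi^2_1$ it is straightforward to simulate. A minimal Monte Carlo sketch (our own helper; in practice $\Sigma_1$ and $\Sigma_2$ would be replaced by estimates):

```python
import numpy as np

def mixture_percentile(sigma1, sigma2, alpha=0.90, reps=200_000, seed=0):
    """alpha-percentile of sum_i a_i * chi2_1 with a_1, ..., a_q the
    eigenvalues of F'F, which coincide with those of Sigma_2^{-1} Sigma_1."""
    a = np.linalg.eigvals(np.linalg.solve(sigma2, sigma1)).real
    rng = np.random.default_rng(seed)
    draws = rng.chisquare(1, size=(reps, len(a))) @ a   # independent chi2_1's
    return np.quantile(draws, alpha)
```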
Remark 3.9.
If the process is stationary, that is, if the time-varying spectral density does not depend on the time parameter $u$, we can easily see that $\Sigma_1 = \Sigma_2$, and the asymptotic distribution becomes the chi-square distribution with $q$ degrees of freedom.
Remark 3.10.
In our setting, the number of estimating equations and the number of parameters are equal. In that case, the empirical likelihood ratio at the maximum empirical likelihood estimator, $\mathcal{R}(\tilde\theta)$, equals one (cf. [3, page 305]). That means the test statistic in Theorem 3.7 becomes zero when evaluated at the maximum empirical likelihood estimator.
4. Numerical Example
In this section, we present simulation results for the estimation of the autocorrelation of locally stationary processes described in Example 3.1. Consider the following time-varying AR(1) process:
$$X_{t,T} - a\!\left(\frac{t}{T}\right) X_{t-1,T} = \varepsilon_t \qquad \text{for } t \in \mathbb{Z}, \tag{4.1}$$
where $\varepsilon_t \sim \text{i.i.d. } \mathrm{Gamma}\bigl(3/\pi, (3/\pi)^{1/2}\bigr) - (3/\pi)^{1/2}$ and $a(u) = (u-b)^2$ with $b = 0.1, 0.5, 0.9$. The observations $X_{1,T}, \ldots, X_{T,T}$ are generated from the process (4.1), and we construct confidence intervals for the autocorrelation at lag $k = 1$, which is expressed as
$$\theta_0 = \frac{\int_0^1 \int_{-\pi}^{\pi} e^{i\lambda}\, g(u,\lambda)\, d\lambda\, du}{\int_0^1 \int_{-\pi}^{\pi} g(u,\lambda)\, d\lambda\, du},$$
based on the result of Theorem 3.7. Several combinations of the sample size $T$ and the window length $N$ are chosen, $(T,N) = (100,10)$, $(500,10)$, $(500,50)$, $(1000,10)$, $(1000,100)$, and the data taper is set as $h(x) = \frac{1}{2}\{1 - \cos(2\pi x)\}$. We then calculate the values of the test statistic $-\pi^{-1}\log\mathcal{R}(\theta)$ at numerous points $\theta$ and obtain confidence intervals by collecting the points $\theta$ which satisfy $-\pi^{-1}\log\mathcal{R}(\theta) < z_\alpha$, where $z_\alpha$ is the $\alpha$-percentile of the asymptotic distribution in Theorem 3.7. We admit that Assumption 3.4(ii) can hardly hold in a finite-sample experiment, but this Monte Carlo simulation is purely illustrative and is intended to investigate how the sample size and the window length affect the resulting confidence intervals. A sketch of the data-generating step is given below.
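A minimal sketch of the data-generating step (our own code; the burn-in, with $a(u)$ frozen at $u = 0$ before time 1, is one reasonable way to approximate the two-sided model (4.1)):

```python
import numpy as np

def simulate_tvar1(T, b, burn=200, seed=0):
    """X_{1,T}, ..., X_{T,T} from the tvAR(1) model (4.1) with
    a(u) = (u - b)^2 and eps_t ~ Gamma(shape 3/pi, rate sqrt(3/pi))
    shifted by -sqrt(3/pi), so that E eps_t = 0 and Var eps_t = 1."""
    rng = np.random.default_rng(seed)
    eps = rng.gamma(3 / np.pi, np.sqrt(np.pi / 3), T + burn) - np.sqrt(3 / np.pi)
    x = np.zeros(T + burn)
    for t in range(1, T + burn):
        u = max(t - burn + 1, 0) / T     # rescaled time; 0 during burn-in
        x[t] = (u - b) ** 2 * x[t - 1] + eps[t]
    return x[burn:]                      # entry i corresponds to X_{i+1,T}
```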
We set the confidence level to $\alpha = 0.90$ and repeat the above procedure 1000 times for each case. Table 1 shows the averages of the lower and upper bounds, the lengths of the intervals, and the coverage rates. Looking at the results, we find that a larger sample size gives a shorter interval, as expected. Furthermore, the results indicate that a larger window length leads to a worse coverage rate. We conjecture that the best ratio $N/T$ lies around 0.02, because the combination $(T,N) = (500,10)$ seems to give the best results overall.
Table 1. 90% confidence intervals of the autocorrelation at lag $k = 1$.

| $(T,N)$ | Lower bound | Upper bound | Interval length | Coverage rate |
|---|---|---|---|---|
| $b = 0.1$, $\theta_0 = 0.308$ | | | | |
| (100, 10) | 0.057 | 0.439 | 0.382 | 0.854 |
| (500, 10) | 0.172 | 0.382 | 0.210 | 0.866 |
| (500, 50) | 0.203 | 0.332 | 0.129 | 0.578 |
| (1000, 10) | 0.203 | 0.356 | 0.154 | 0.826 |
| (1000, 100) | 0.225 | 0.308 | 0.084 | 0.444 |
| $b = 0.5$, $\theta_0 = 0.085$ | | | | |
| (100, 10) | −0.087 | 0.225 | 0.312 | 0.890 |
| (500, 10) | 0.001 | 0.169 | 0.168 | 0.910 |
| (500, 50) | 0.028 | 0.104 | 0.076 | 0.515 |
| (1000, 10) | 0.023 | 0.139 | 0.116 | 0.922 |
| (1000, 100) | 0.047 | 0.087 | 0.040 | 0.384 |
| $b = 0.9$, $\theta_0 = 0.308$ | | | | |
| (100, 10) | 0.060 | 0.449 | 0.388 | 0.841 |
| (500, 10) | 0.176 | 0.393 | 0.216 | 0.871 |
| (500, 50) | 0.201 | 0.332 | 0.131 | 0.586 |
| (1000, 10) | 0.203 | 0.359 | 0.156 | 0.827 |
| (1000, 100) | 0.226 | 0.310 | 0.083 | 0.467 |
5. Proofs
5.1. Some Lemmas
In this subsection we give three lemmas used to prove Theorems 3.6 and 3.7. First of all, we introduce the function $L_N : \mathbb{R} \to \mathbb{R}$, defined as the $2\pi$-periodic extension of
$$L_N(\alpha) := \begin{cases} N, & |\alpha| \le \dfrac{1}{N}, \\[1mm] \dfrac{1}{|\alpha|}, & \dfrac{1}{N} \le |\alpha| \le \pi. \end{cases}$$
The properties of the function LN are described in Lemma A.4 of Dahlhaus [13].
Lemma 5.1.
Suppose (3.2) and Assumption 3.4 hold. Then, for $1 \le k \le 8$,
$$\begin{aligned} \operatorname{cum}\{ d_N(u_1,\lambda_1), \ldots, d_N(u_k,\lambda_k) \} ={}& (2\pi)^{k-1} c_k \left\{ \prod_{j=1}^{k} A(u_j, \lambda_j) \right\} \exp\left\{ -i \sum_{j=1}^{k} \lambda_j \left([u_k T] - [u_j T]\right) \right\} \\ & \times \sum_{s=1}^{N} \left\{ \prod_{j=1}^{k} h\!\left( \frac{s + [u_k T] - [u_j T]}{N} \right) \right\} \exp\left\{ -i \left( \sum_{j=1}^{k} \lambda_j \right) s \right\} + O\!\left( \frac{N^2}{T} \right) + O\!\left( (\log N)^{k-1} \right) \\ ={}& O\!\left( L_N\!\left( \sum_{j=1}^{k} \lambda_j \right) \right) + O\!\left( \frac{N^2}{T} \right) + O\!\left( (\log N)^{k-1} \right). \end{aligned}$$
Proof.
Let $\Pi = (-\pi,\pi]$ and let $\omega = (\omega_1, \ldots, \omega_k)$. Since
$$\operatorname{cum}(X_{t_1,T}, \ldots, X_{t_k,T}) = c_k \int_{\Pi^k} \exp\left( i \sum_{j=1}^{k} \omega_j t_j \right) \left( \prod_{j=1}^{k} A^{\circ}_{t_j,T}(\omega_j) \right) \eta\!\left( \sum_{j=1}^{k} \omega_j \right) d\omega,$$
the $k$th cumulant of $d_N$ is equal to
$$c_k \int_{\Pi^k} \exp\left\{ i \sum_{j=1}^{k} \omega_j \left( [u_j T] - \frac{N}{2} \right) \right\} \eta\!\left( \sum_{j=1}^{k} \omega_j \right) \prod_{j=1}^{k} \sum_{s=1}^{N} h\!\left(\frac{s}{N}\right) A^{\circ}_{[u_j T]-N/2+s,\,T}(\omega_j) \exp\{-i(\lambda_j - \omega_j)s\}\, d\omega.$$
As in the proof of Theorem 2.2 of Dahlhaus [12], we replace $A^{\circ}_{[u_1T]-N/2+s_1,\,T}(\omega_1)$ by $A\bigl(u_1 + (-N/2+s_1)/T,\, \lambda_1\bigr)$, and we obtain
$$\left| \sum_{s=1}^{N} h\!\left(\frac{s}{N}\right) \left\{ A^{\circ}_{[u_1T]-N/2+s,\,T}(\omega_1) - A\!\left(u_1 + \frac{-N/2+s}{T}, \lambda_1\right) \right\} \exp\{-i(\lambda_1-\omega_1)s\} \right| \le K$$
with some constant $K$, while
$$\left| \sum_{s=1}^{N} h\!\left(\frac{s}{N}\right) A^{\circ}_{[u_jT]-N/2+s,\,T}(\omega_j) \exp\{-i(\lambda_j-\omega_j)s\} \right| \le K L_N(\lambda_j - \omega_j)$$
for $j = 2, \ldots, k$. The replacement error is therefore smaller than
$$K \int_{\Pi^k} \prod_{j=2}^{k} L_N(\lambda_j - \omega_j)\, d\omega \le K (\log N)^{k-1}.$$
In the same way we replace $A^{\circ}_{[u_jT]-N/2+s_j,\,T}(\omega_j)$ by $A\bigl(u_j + (-N/2+s_j)/T,\, \lambda_j\bigr)$ for $j = 2, \ldots, k$, and then we obtain
$$c_k \sum_{s_1,\ldots,s_k=1}^{N} \left\{ \prod_{j=1}^{k} h\!\left(\frac{s_j}{N}\right) A\!\left(u_j + \frac{-N/2+s_j}{T}, \lambda_j\right) \right\} \exp\left( -i \sum_{j=1}^{k} \lambda_j s_j \right) \int_{\Pi^k} \eta\!\left(\sum_{j=1}^{k}\omega_j\right) \exp\left\{ i \sum_{j=1}^{k} \omega_j \left([u_jT] - \frac{N}{2} + s_j\right) \right\} d\omega + O\!\left((\log N)^{k-1}\right).$$
The integral part is equal to
$$\prod_{j=1}^{k-1} \int_{\Pi} \exp\left\{ i\omega_j \left([u_jT] - [u_kT] + s_j - s_k\right) \right\} d\omega_j.$$
So we get
$$(2\pi)^{k-1} c_k \sum_{s=1}^{N} \left\{ \prod_{j=1}^{k} h\!\left( \frac{s + [u_kT] - [u_jT]}{N} \right) A\!\left( u_j + \frac{-N/2 + s + [u_kT] - [u_jT]}{T}, \lambda_j \right) \right\} \exp\left\{ -i \sum_{j=1}^{k} \lambda_j \left(s + [u_kT] - [u_jT]\right) \right\} + O\!\left((\log N)^{k-1}\right).$$
Since $h(x) = 0$ for $x \notin (0,1)$, we only have to consider the range of $s$ which satisfies $1 \le s + [u_kT] - [u_jT] \le N-1$. Therefore we can regard $(-N/2 + s + [u_kT] - [u_jT])/T$ as $O(N/T)$, and a Taylor expansion of $A$ around $u_j$ gives the first equation of the desired result. Moreover, in the same manner as in the proof of Lemma A.5 of Dahlhaus [13], we can see that
$$\sum_{s=1}^{N} \left\{ \prod_{j=1}^{k} h\!\left( \frac{s + [u_kT] - [u_jT]}{N} \right) \right\} \exp\left\{ -i \left( \sum_{j=1}^{k} \lambda_j \right) s \right\} = O\!\left( L_N\!\left( \sum_{j=1}^{k} \lambda_j \right) \right),$$
which leads to the second equation.
Lemma 5.2.
Suppose (3.2) and Assumption 3.4 hold. Then,
$$P_M := \frac{1}{2\pi\sqrt{M}} \sum_{j=1}^{M} m_j(\theta_0) \xrightarrow{d} N(0, \Sigma_1).$$
Proof.
We set
$$J_T(\phi) := \frac{1}{M} \sum_{j=1}^{M} \int_{-\pi}^{\pi} \phi(u_j,\lambda,\theta_0)\, I_N(u_j,\lambda)\, d\lambda, \qquad J(\phi) := \int_0^1 \int_{-\pi}^{\pi} \phi(u,\lambda,\theta_0)\, g(u,\lambda)\, d\lambda\, du.$$
Henceforth we write $\phi(u,\lambda)$ for $\phi(u,\lambda,\theta_0)$ for simplicity. The lemma is proved by establishing the convergence of the cumulants of all orders. Due to Lemma A.8 of Dahlhaus [13], the expectation of $P_M$ is equal to
$$\frac{\sqrt{M}}{2\pi} \left\{ J(\phi) + o(T^{-1/2}) \right\}.$$
By (3.2) and O(M)=O(T), this converges to zero.
Next, we calculate the covariance of $P_M$. From the relation $T = M + N - 1$ we can rewrite
$$P_M = \sqrt{\frac{M}{T}}\, \frac{\sqrt{T}}{2\pi}\, J_T(\phi) = \sqrt{1 - \frac{N-1}{T}}\, \frac{\sqrt{T}}{2\pi}\, J_T(\phi).$$
Then the $(\alpha,\beta)$-element of the covariance matrix of $P_M$ is equal to
$$\frac{1}{(2\pi)^2} \left( 1 - \frac{N-1}{T} \right) T \operatorname{cov}\{ J_T(\phi_\alpha), J_T(\phi_\beta) \}.$$
Due to Lemma A.9 of Dahlhaus [13], this converges to
$$\frac{1}{2\pi} \int_0^1 \left[ \int_{-\pi}^{\pi} \phi_\alpha(u,\lambda) \left\{ \phi_\beta(u,\lambda) + \phi_\beta(u,-\lambda) \right\} g(u,\lambda)^2\, d\lambda + \iint_{-\pi}^{\pi} \phi_\alpha(u,\lambda)\, \phi_\beta(u,\mu)\, g(u,\lambda)\, g(u,\mu)\, q_4(\lambda,-\lambda,\mu)\, d\lambda\, d\mu \right] du.$$
By Assumption 3.4(iv) the covariance tends to Σ1.
The kth cumulant for k≥3 tends to zero due to Lemma A.10 of Dahlhaus [13]. Then we obtain the desired result.
Lemma 5.3.
Suppose (3.2) and Assumption 3.4 hold. Then,
$$S_M := \frac{1}{2\pi M} \sum_{j=1}^{M} m_j(\theta_0)\, m_j(\theta_0)' \xrightarrow{p} \Sigma_2.$$
Proof.
First we calculate the mean of the $(\alpha,\beta)$-element of $S_M$:
$$\begin{aligned} E\left[ \frac{1}{2\pi M} \sum_{j=1}^{M} m_j(\theta_0) m_j(\theta_0)' \right]_{\alpha\beta} &= \frac{1}{2\pi M} \sum_{j=1}^{M} \iint_{-\pi}^{\pi} \phi_\alpha(u_j,\lambda)\, \phi_\beta(u_j,\mu)\, E[ I_N(u_j,\lambda) I_N(u_j,\mu) ]\, d\lambda\, d\mu \\ &= \frac{1}{2\pi M} \sum_{j=1}^{M} \iint_{-\pi}^{\pi} \phi_\alpha(u_j,\lambda)\, \phi_\beta(u_j,\mu) \left[ \operatorname{cov}\{ I_N(u_j,\lambda), I_N(u_j,\mu) \} + E I_N(u_j,\lambda)\, E I_N(u_j,\mu) \right] d\lambda\, d\mu. \end{aligned} \tag{5.19}$$
Due to Dahlhaus [12, Theorem 2.2(i)], the second term of (5.19) becomes
$$\begin{aligned} & \frac{1}{2\pi M} \sum_{j=1}^{M} \int_{-\pi}^{\pi} \phi_\alpha(u_j,\lambda) \left\{ g(u_j,\lambda) + O\!\left(\frac{N^2}{T^2}\right) + O\!\left(\frac{\log N}{N}\right) \right\} d\lambda \int_{-\pi}^{\pi} \phi_\beta(u_j,\mu) \left\{ g(u_j,\mu) + O\!\left(\frac{N^2}{T^2}\right) + O\!\left(\frac{\log N}{N}\right) \right\} d\mu \\ &\quad = \frac{1}{2\pi} \int_0^1 \left\{ \int_{-\pi}^{\pi} \phi_\alpha(u,\lambda)\, g(u,\lambda)\, d\lambda \int_{-\pi}^{\pi} \phi_\beta(u,\mu)\, g(u,\mu)\, d\mu \right\} du + O\!\left(\frac{1}{M}\right) + O\!\left(\frac{N^2}{T^2}\right) + O\!\left(\frac{\log N}{N}\right). \end{aligned}$$
Next we consider
$$\begin{aligned} \operatorname{cov}\{ I_N(u_j,\lambda), I_N(u_j,\mu) \} = \frac{1}{(2\pi H_{2,N})^2} \big[ & \operatorname{cum}\{ d_N(u_j,\lambda), d_N(u_j,\mu) \} \operatorname{cum}\{ d_N(u_j,-\lambda), d_N(u_j,-\mu) \} \\ & + \operatorname{cum}\{ d_N(u_j,\lambda), d_N(u_j,-\mu) \} \operatorname{cum}\{ d_N(u_j,-\lambda), d_N(u_j,\mu) \} \\ & + \operatorname{cum}\{ d_N(u_j,\lambda), d_N(u_j,-\lambda), d_N(u_j,\mu), d_N(u_j,-\mu) \} \big]. \end{aligned} \tag{5.21}$$
We calculate the three terms separately. From Lemma 5.1, the first term of (5.21) is equal to
$$\frac{1}{(2\pi H_{2,N})^2} \left\{ 2\pi A(u_j,\lambda) A(u_j,\mu) \sum_{s=1}^{N} h\!\left(\frac{s}{N}\right)^{2} \exp\{-i(\lambda+\mu)s\} + O\!\left(\frac{N^2}{T}\right) + O(\log N) \right\} \left\{ 2\pi A(u_j,-\lambda) A(u_j,-\mu) \sum_{s=1}^{N} h\!\left(\frac{s}{N}\right)^{2} \exp\{i(\lambda+\mu)s\} + O\!\left(\frac{N^2}{T}\right) + O(\log N) \right\}.$$
It converges to zero when $\lambda \ne -\mu$ and is equal to
$$g(u_j,\lambda)^2 + O\!\left(\frac{N}{T}\right) + O\!\left(\frac{\log N}{N}\right) \tag{5.23}$$
when $\lambda = -\mu$. Similarly, the second term of (5.21) converges to zero when $\lambda \ne \mu$ and is equal to (5.23) when $\lambda = \mu$. We can also apply Lemma 5.1 to the third term of (5.21), and an analogous calculation shows that it converges to zero. After all, we can see that (5.19) converges to $(\Sigma_2)_{\alpha\beta}$, the $(\alpha,\beta)$-element of $\Sigma_2$.
Next we calculate the second-order cumulant
$$\operatorname{cum}\left\{ \left[ \frac{1}{2\pi M} \sum_{j=1}^{M} m_j(\theta_0) m_j(\theta_0)' \right]_{\alpha_1\beta_1}, \left[ \frac{1}{2\pi M} \sum_{j=1}^{M} m_j(\theta_0) m_j(\theta_0)' \right]_{\alpha_2\beta_2} \right\}.$$
This is equal to
$$\begin{aligned} (2\pi M)^{-2} (2\pi H_{2,N})^{-4} \sum_{j_1=1}^{M} \sum_{j_2=1}^{M} & \iiiint_{-\pi}^{\pi} \phi_{\alpha_1}(u_{j_1},\lambda_1)\, \phi_{\beta_1}(u_{j_1},\mu_1)\, \phi_{\alpha_2}(u_{j_2},\lambda_2)\, \phi_{\beta_2}(u_{j_2},\mu_2) \\ & \times \operatorname{cum}\{ d_N(u_{j_1},\lambda_1) d_N(u_{j_1},-\lambda_1) d_N(u_{j_1},\mu_1) d_N(u_{j_1},-\mu_1),\ d_N(u_{j_2},\lambda_2) d_N(u_{j_2},-\lambda_2) d_N(u_{j_2},\mu_2) d_N(u_{j_2},-\mu_2) \}\, d\lambda_1\, d\mu_1\, d\lambda_2\, d\mu_2. \end{aligned} \tag{5.25}$$
Using the product theorem for cumulants (cf. [16, Theorem 2.3.2]), we have to sum over all indecomposable partitions $\{P_1, \ldots, P_m\}$ with $|P_i| \ge 2$ of the scheme
$$\begin{matrix} d_N(u_{j_1},\lambda_1) & d_N(u_{j_1},-\lambda_1) & d_N(u_{j_1},\mu_1) & d_N(u_{j_1},-\mu_1) \\ d_N(u_{j_2},\lambda_2) & d_N(u_{j_2},-\lambda_2) & d_N(u_{j_2},\mu_2) & d_N(u_{j_2},-\mu_2). \end{matrix}$$
We can apply Lemma 5.1 to all the cumulants appearing in this expansion, and the dominant term of the cumulants is $o(N^4)$, so (5.25) tends to zero. Then we obtain the desired result.
5.2. Proof of Theorem 3.6
Using the lemmas in Section 5.1, we prove Theorem 3.6. To find the maximizing weights $w_j$ in (3.11), we proceed by the method of Lagrange multipliers. Write
$$G = \sum_{j=1}^{M} \log(M w_j) - M\alpha' \sum_{j=1}^{M} w_j m_j(\theta) + \gamma\left( \sum_{j=1}^{M} w_j - 1 \right),$$
where $\alpha \in \mathbb{R}^q$ and $\gamma \in \mathbb{R}$ are Lagrange multipliers. Setting $\partial G/\partial w_j = 0$ gives
$$\frac{\partial G}{\partial w_j} = \frac{1}{w_j} - M\alpha' m_j(\theta) + \gamma = 0.$$
So the equation $\sum_{j=1}^{M} w_j\, (\partial G/\partial w_j) = 0$ gives $\gamma = -M$. Then we may write
$$w_j = \frac{1}{M}\, \frac{1}{1 + \alpha' m_j(\theta)},$$
where the vector $\alpha = \alpha(\theta)$ satisfies the $q$ equations
$$\frac{1}{M} \sum_{j=1}^{M} \frac{m_j(\theta)}{1 + \alpha' m_j(\theta)} = 0. \tag{5.30}$$
Therefore, $\tilde\theta$ is a minimizer of the following (minus) empirical log-likelihood ratio function
$$l(\theta) := \sum_{j=1}^{M} \log\{ 1 + \alpha' m_j(\theta) \}$$
and satisfies
$$0 = \left. \frac{\partial l(\theta)}{\partial \theta} \right|_{\theta=\tilde\theta} = \left. \sum_{j=1}^{M} \frac{ (\partial \alpha'(\theta)/\partial\theta)\, m_j(\theta) + (\partial m_j'(\theta)/\partial\theta)\, \alpha(\theta) }{ 1 + \alpha'(\theta)\, m_j(\theta) } \right|_{\theta=\tilde\theta} = \left. \sum_{j=1}^{M} \frac{ (\partial m_j'(\theta)/\partial\theta)\, \alpha(\theta) }{ 1 + \alpha'(\theta)\, m_j(\theta) } \right|_{\theta=\tilde\theta}. \tag{5.32}$$
Denote
$$Q_{1M}(\theta,\alpha) := \frac{1}{M} \sum_{j=1}^{M} \frac{m_j(\theta)}{1 + \alpha'(\theta) m_j(\theta)}, \qquad Q_{2M}(\theta,\alpha) := \frac{1}{M} \sum_{j=1}^{M} \frac{1}{1 + \alpha'(\theta) m_j(\theta)}\, \frac{\partial m_j'(\theta)}{\partial\theta}\, \alpha(\theta).$$
Then, from (5.30) and (5.32), we have
$$0 = Q_{1M}(\tilde\theta, \tilde\alpha) = Q_{1M}(\theta_0, 0) + \frac{\partial Q_{1M}(\theta_0,0)}{\partial\theta'}\, (\tilde\theta - \theta_0) + \frac{\partial Q_{1M}(\theta_0,0)}{\partial\alpha'}\, (\tilde\alpha - 0) + o_p(\delta_M), \tag{5.34}$$
$$0 = Q_{2M}(\tilde\theta, \tilde\alpha) = Q_{2M}(\theta_0, 0) + \frac{\partial Q_{2M}(\theta_0,0)}{\partial\theta'}\, (\tilde\theta - \theta_0) + \frac{\partial Q_{2M}(\theta_0,0)}{\partial\alpha'}\, (\tilde\alpha - 0) + o_p(\delta_M), \tag{5.35}$$
where $\tilde\alpha = \alpha(\tilde\theta)$ and $\delta_M = \|\tilde\theta - \theta_0\| + \|\tilde\alpha\|$. Let us examine the asymptotic behavior of the above four derivatives. First,
$$\frac{\partial Q_{1M}(\theta_0,0)}{\partial\theta'} = \frac{1}{M} \sum_{j=1}^{M} \frac{\partial m_j(\theta_0)}{\partial\theta'} = \frac{1}{M} \sum_{j=1}^{M} \int_{-\pi}^{\pi} \frac{\partial \phi(u_j,\lambda,\theta)}{\partial\theta'}\, I_N(u_j,\lambda)\, d\lambda.$$
From Lemmas A.8 and A.9 of Dahlhaus [13], we have
$$E\left[ \frac{\partial Q_{1M}(\theta_0,0)}{\partial\theta'} \right] = \int_0^1 \int_{-\pi}^{\pi} \frac{\partial\phi(u,\lambda,\theta)}{\partial\theta'}\, g(u,\lambda)\, d\lambda\, du + o(M^{-1/2}), \qquad \operatorname{cov}\left[ \left[ \frac{\partial Q_{1M}(\theta_0,0)}{\partial\theta'} \right]_{\alpha_1\beta_1}, \left[ \frac{\partial Q_{1M}(\theta_0,0)}{\partial\theta'} \right]_{\alpha_2\beta_2} \right] = O(M^{-1}),$$
which leads to
$$\frac{\partial Q_{1M}(\theta_0,0)}{\partial\theta'} \xrightarrow{p} \int_0^1 \int_{-\pi}^{\pi} \frac{\partial\phi(u,\lambda,\theta)}{\partial\theta'}\, g(u,\lambda)\, d\lambda\, du = \Sigma_3. \tag{5.38}$$
Similarly, we have
$$\frac{\partial Q_{2M}(\theta_0,0)}{\partial\alpha'} = \frac{1}{M} \sum_{j=1}^{M} \frac{\partial m_j(\theta_0)'}{\partial\theta} \xrightarrow{p} \Sigma_3'. \tag{5.39}$$
Next, from Lemma 5.3, we obtain
$$\frac{\partial Q_{1M}(\theta_0,0)}{\partial\alpha'} = -\frac{1}{M} \sum_{j=1}^{M} m_j(\theta_0)\, m_j(\theta_0)' \xrightarrow{p} -2\pi\Sigma_2. \tag{5.40}$$
Finally, we have
$$\frac{\partial Q_{2M}(\theta_0,0)}{\partial\theta'} = 0. \tag{5.41}$$
Now, (5.34), (5.35), and (5.38)–(5.41) give
$$\begin{pmatrix} \tilde\alpha \\ \tilde\theta - \theta_0 \end{pmatrix} = \begin{pmatrix} \dfrac{\partial Q_{1M}}{\partial\alpha'} & \dfrac{\partial Q_{1M}}{\partial\theta'} \\[2mm] \dfrac{\partial Q_{2M}}{\partial\alpha'} & \dfrac{\partial Q_{2M}}{\partial\theta'} \end{pmatrix}^{-1}_{(\theta_0,0)} \begin{pmatrix} -Q_{1M}(\theta_0,0) + o_p(\delta_M) \\ o_p(\delta_M) \end{pmatrix}, \tag{5.42}$$
where
$$\begin{pmatrix} \dfrac{\partial Q_{1M}}{\partial\alpha'} & \dfrac{\partial Q_{1M}}{\partial\theta'} \\[2mm] \dfrac{\partial Q_{2M}}{\partial\alpha'} & \dfrac{\partial Q_{2M}}{\partial\theta'} \end{pmatrix}_{(\theta_0,0)} \xrightarrow{p} \begin{pmatrix} -2\pi\Sigma_2 & \Sigma_3 \\ \Sigma_3' & 0 \end{pmatrix}. \tag{5.43}$$
Because of Lemma 5.2, we have
$$Q_{1M}(\theta_0,0) = \frac{1}{M} \sum_{j=1}^{M} m_j(\theta_0) = O_p(M^{-1/2}).$$
From this and (5.42), (5.43), we can see that $\delta_M = O_p(M^{-1/2})$. Again, from (5.42), (5.43), and Lemma 5.2, a direct calculation gives
$$\sqrt{M}\,(\tilde\theta - \theta_0) \xrightarrow{d} N(0, \Sigma).$$
5.3. Proof of Theorem 3.7
Using the lemmas in Section 5.1, we prove Theorem 3.7. The proof is the same as that of Theorem 3.6 up to (5.30). Let $\alpha = \|\alpha\| e$ with $\|e\| = 1$, and introduce
$$Y_j := \alpha' m_j(\theta_0), \qquad Z_M^* := \max_{1\le j\le M} \| m_j(\theta_0) \|.$$
Note that $1/(1+Y_j) = 1 - Y_j/(1+Y_j)$, so from (5.30) we find that
$$e'\left\{ \frac{1}{M} \sum_{j=1}^{M} \left( 1 - \frac{Y_j}{1+Y_j} \right) m_j(\theta_0) \right\} = 0, \qquad e'\left( \frac{1}{M} \sum_{j=1}^{M} \frac{\alpha' m_j(\theta_0)}{1+Y_j}\, m_j(\theta_0) \right) = e'\left( \frac{1}{M} \sum_{j=1}^{M} m_j(\theta_0) \right),$$
$$\|\alpha\|\, e'\left( \frac{1}{M} \sum_{j=1}^{M} \frac{m_j(\theta_0)\, m_j(\theta_0)'}{1+Y_j} \right) e = e'\left( \frac{1}{M} \sum_{j=1}^{M} m_j(\theta_0) \right). \tag{5.47}$$
Every $w_j > 0$, so $1 + Y_j > 0$, and therefore by (5.47) we get
$$\|\alpha\|\, e' S_M e \le \|\alpha\|\, e'\left( \frac{1}{2\pi M} \sum_{j=1}^{M} \frac{m_j(\theta_0)\, m_j(\theta_0)'}{1+Y_j} \right) e \cdot \left( 1 + \max_j Y_j \right) \le \|\alpha\|\, e'\left( \frac{1}{2\pi M} \sum_{j=1}^{M} \frac{m_j(\theta_0)\, m_j(\theta_0)'}{1+Y_j} \right) e \cdot \left( 1 + \|\alpha\| Z_M^* \right) = e'\, M^{-1/2} P_M \left( 1 + \|\alpha\| Z_M^* \right), \tag{5.48}$$
where $P_M$ and $S_M$ are defined in Lemmas 5.2 and 5.3, respectively. Then by (5.48) we get
$$\|\alpha\| \left\{ e' S_M e - Z_M^*\, e'\left(M^{-1/2} P_M\right) \right\} \le e'\left(M^{-1/2} P_M\right). \tag{5.49}$$
From Lemmas 5.2 and 5.3 we can see that
$$M^{-1/2} P_M = O_p(M^{-1/2}), \qquad S_M = O_p(1). \tag{5.50}$$
We evaluate the order of $Z_M^*$. We can write
$$Z_M^* \le \max_{1\le j\le M} \int_{-\pi}^{\pi} \| \phi(u_j,\lambda,\theta_0) \|\, I_N(u_j,\lambda)\, d\lambda =: \max_{1\le j\le M} m_j^*(\theta_0), \quad \text{say}.$$
Then, for any $\varepsilon > 0$,
$$\begin{aligned} P\left( \max_{1\le j\le M} m_j^*(\theta_0) > \varepsilon\sqrt{M} \right) &\le \sum_{j=1}^{M} P\left( m_j^*(\theta_0) > \varepsilon\sqrt{M} \right) = \sum_{j=1}^{M} P\left( m_j^*(\theta_0)^3 > (\varepsilon\sqrt{M})^3 \right) \le \sum_{j=1}^{M} \frac{1}{\varepsilon^3 M^{3/2}}\, E\left| m_j^*(\theta_0) \right|^3 \\ &= \frac{1}{\varepsilon^3 M^{3/2}} \sum_{j=1}^{M} \iiint_{-\pi}^{\pi} \left\| \phi(u_j,\lambda_1)\, \phi(u_j,\lambda_2)\, \phi(u_j,\lambda_3) \right\|\, E\left[ I_N(u_j,\lambda_1)\, I_N(u_j,\lambda_2)\, I_N(u_j,\lambda_3) \right] d\lambda_1\, d\lambda_2\, d\lambda_3. \end{aligned} \tag{5.52}$$
The above expectation is written as
$$E\left[ I_N(u_j,\lambda_1)\, I_N(u_j,\lambda_2)\, I_N(u_j,\lambda_3) \right] = \frac{1}{(2\pi H_{2,N})^3}\, E\left[ d_N(u_j,\lambda_1) d_N(u_j,-\lambda_1)\, d_N(u_j,\lambda_2) d_N(u_j,-\lambda_2)\, d_N(u_j,\lambda_3) d_N(u_j,-\lambda_3) \right].$$
From Lemma 5.1, this is of order $O_p(1)$, so we can see that (5.52) tends to zero, which leads to
$$Z_M^* = o_p(M^{1/2}). \tag{5.54}$$
From (5.49), (5.50), and (5.54), it is seen that
$$\|\alpha\| \left[ O_p(1) - o_p(M^{1/2})\, O_p(M^{-1/2}) \right] \le O_p(M^{-1/2}).$$
Therefore,
$$\|\alpha\| = O_p(M^{-1/2}).$$
Now, from (5.54) we have
$$\max_{1\le j\le M} |Y_j| = O_p(M^{-1/2})\, o_p(M^{1/2}) = o_p(1), \tag{5.57}$$
and from (5.30) that
$$0 = \frac{1}{M} \sum_{j=1}^{M} m_j(\theta_0)\, \frac{1}{1+Y_j} = \frac{1}{M} \sum_{j=1}^{M} m_j(\theta_0) \left( 1 - Y_j + \frac{Y_j^2}{1+Y_j} \right) = 2\pi M^{-1/2} P_M - 2\pi S_M \alpha + \frac{1}{M} \sum_{j=1}^{M} m_j(\theta_0)\, \frac{Y_j^2}{1+Y_j}. \tag{5.58}$$
Noting that
$$\frac{1}{M} \sum_{j=1}^{M} \| m_j(\theta_0) \|^3 \le \frac{1}{M} \sum_{j=1}^{M} Z_M^*\, \| m_j(\theta_0) \|^2 = o_p(M^{1/2}),$$
we can see that the final term in (5.58) has norm bounded by
$$\frac{1}{M} \sum_{j=1}^{M} \| m_j(\theta_0) \|^3\, \|\alpha\|^2\, |1 + Y_j|^{-1} = o_p(M^{1/2})\, O_p(M^{-1})\, O_p(1) = o_p(M^{-1/2}).$$
Hence, we can write
$$\alpha = M^{-1/2} S_M^{-1} P_M + \epsilon,$$
where $\epsilon = o_p(M^{-1/2})$. By (5.57), we may write
$$\log(1+Y_j) = Y_j - \frac{1}{2} Y_j^2 + \eta_j,$$
where, for some finite $K$,
$$\Pr\left( |\eta_j| \le K|Y_j|^3,\ 1\le j\le M \right) \longrightarrow 1 \qquad (T \to \infty).$$
We may write
$$\begin{aligned} -\frac{1}{\pi}\log\mathcal{R}(\theta_0) &= -\frac{1}{\pi} \sum_{j=1}^{M} \log(M w_j) = \frac{1}{\pi} \sum_{j=1}^{M} \log(1+Y_j) = \frac{1}{\pi}\sum_{j=1}^{M} Y_j - \frac{1}{2\pi}\sum_{j=1}^{M} Y_j^2 + \frac{1}{\pi}\sum_{j=1}^{M} \eta_j \\ &= P_M' S_M^{-1} P_M - M \epsilon' S_M \epsilon + \frac{1}{\pi}\sum_{j=1}^{M}\eta_j =: (A) - (B) + (C), \quad \text{say}. \end{aligned}$$
Here it is seen that
$$(B) = M\, o_p(M^{-1/2})\, O_p(1)\, o_p(M^{-1/2}) = o_p(1), \qquad (C) \le K \|\alpha\|^3 \sum_{j=1}^{M} \| m_j(\theta_0) \|^3 = O_p(M^{-3/2})\, o_p(M^{3/2}) = o_p(1).$$
Finally, from Lemmas 5.2 and 5.3, we can show that
$$(A) = \left( \Sigma_2^{-1/2}\Sigma_1^{1/2}\, \Sigma_1^{-1/2} P_M \right)' \left( \Sigma_2^{-1/2}\Sigma_1^{1/2}\, \Sigma_1^{-1/2} P_M \right) + o_p(1) \xrightarrow{d} (F\mathcal{N})'(F\mathcal{N}).$$
Then we can obtain the desired result.
Acknowledgments
The author is grateful to Professor M. Taniguchi, J. Hirukawa, and H. Shiraishi for their instructive advice and helpful comments. Thanks are also extended to the two referees, whose comments were useful. This work was supported by a Grant-in-Aid for Young Scientists (B) (22700291).
References
[1] Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75(2), 237–249. doi:10.1093/biomet/75.2.237
[2] Owen, A. B. (1990). Empirical likelihood ratio confidence regions. The Annals of Statistics 18(1), 90–120. doi:10.1214/aos/1176347494
[3] Qin, J. and Lawless, J. (1994). Empirical likelihood and general estimating equations. The Annals of Statistics 22(1), 300–325. doi:10.1214/aos/1176325370
[4] Owen, A. B. (1991). Empirical likelihood for linear models. The Annals of Statistics 19(4), 1725–1747. doi:10.1214/aos/1176348368
[5] Chen, S. X. (1993). On the accuracy of empirical likelihood confidence regions for linear regression model. Annals of the Institute of Statistical Mathematics 45(4), 621–637. doi:10.1007/BF00774777
[6] Chen, S. X. (1994). Empirical likelihood confidence intervals for linear regression coefficients. Journal of Multivariate Analysis 49(1), 24–40. doi:10.1006/jmva.1994.1011
[7] Qin, J. (1993). Empirical likelihood in biased sample problems. The Annals of Statistics 21(3), 1182–1196. doi:10.1214/aos/1176349257
[8] Kitamura, Y. (1997). Empirical likelihood methods with weakly dependent processes. The Annals of Statistics 25(5), 2084–2102. doi:10.1214/aos/1069362388
[9] Monti, A. C. (1997). Empirical likelihood confidence regions in time series models. Biometrika 84(2), 395–405. doi:10.1093/biomet/84.2.395
[10] Nordman, D. J. and Lahiri, S. N. (2006). A frequency domain empirical likelihood for short- and long-range dependence. The Annals of Statistics 34(6), 3019–3050. doi:10.1214/009053606000000902
[11] Dahlhaus, R. (1996). On the Kullback-Leibler information divergence of locally stationary processes. Stochastic Processes and their Applications 62(1), 139–168. doi:10.1016/0304-4149(95)00090-9
[12] Dahlhaus, R. (1996). Asymptotic statistical inference for nonstationary processes with evolutionary spectra. In Proceedings of the Athens Conference on Applied Probability and Time Series Analysis, Lecture Notes in Statistics 115, pp. 145–159. Springer. doi:10.1007/978-1-4612-2412-9_11
[13] Dahlhaus, R. (1997). Fitting time series models to nonstationary processes. The Annals of Statistics 25(1), 1–37. doi:10.1214/aos/1034276620
[14] Hosoya, Y. and Taniguchi, M. (1982). A central limit theorem for stationary processes and the parameter estimation of linear processes. The Annals of Statistics 10(1), 132–153. Correction: vol. 21, pp. 1115–1117, 1993. doi:10.1214/aos/1176345696
[15] Kakizawa, Y. (1997). Parameter estimation and hypothesis testing in stationary vector time series. Statistics & Probability Letters 33(3), 225–234. doi:10.1016/S0167-7152(96)00131-9
[16] Brillinger, D. R. (2001). Time Series: Data Analysis and Theory. Holden-Day, San Francisco, Calif, USA.