Estimation for Non-Gaussian Locally Stationary Processes with Empirical Likelihood Method

An application of the empirical likelihood method to non-Gaussian locally stationary processes is presented. Based on the central limit theorem for locally stationary processes, we give the asymptotic distributions of the maximum empirical likelihood estimator and the empirical likelihood ratio statistic. It is shown that the empirical likelihood method enables us to make inferences on various important indices in time series analysis. Furthermore, we give a numerical study and investigate finite-sample properties.


Introduction
The empirical likelihood is a nonparametric method of statistical inference proposed by Owen [1, 2]. It is used for constructing confidence regions for a mean, for a class of M-estimates that includes quantiles, and for differentiable statistical functionals. The empirical likelihood method has been applied to various problems because of its good properties: the generality of a nonparametric method and the effectiveness of a likelihood method. For example, it has been applied to general estimating equations [3], regression models [4-6], biased sample models [7], and so forth. Applications have also been extended to dependent observations. Kitamura [8] developed the blockwise empirical likelihood for estimating equations and for smooth functions of means. Monti [9] applied the empirical likelihood method to linear processes, essentially under the circular Gaussian assumption, using a spectral method. For short- and long-range dependence, Nordman and Lahiri [10] gave the asymptotic properties of the frequency domain empirical likelihood. As these examples show, some applications to time series analysis can be found, but they appear to be mainly for stationary processes. Although stationarity is the most fundamental assumption in time series analysis, it is also known that real time series data are generally nonstationary (e.g., in economic analysis). Therefore we need nonstationary models in order to describe the real world. Recently, Dahlhaus [11-13] proposed an important class of nonstationary processes, called locally stationary processes. They have so-called time-varying spectral densities whose spectral structures change smoothly in time.
In this paper we extend the empirical likelihood method to non-Gaussian locally stationary processes with time-varying spectra. First, we derive the asymptotic normality of the maximum empirical likelihood estimator based on the central limit theorem for locally stationary processes, which is stated in Dahlhaus [13, Theorem A.2]. Next, we show that the empirical likelihood ratio converges to a sum of Gamma-distributed random variables. In particular, in the stationary case, that is, when the time-varying spectral density is independent of the time parameter, the asymptotic distribution becomes the chi-square distribution.
As an application of this method, we can estimate an extended autocorrelation for locally stationary processes. We can also consider the Whittle estimation stated in Dahlhaus [13]. This paper is organized as follows. Section 2 briefly reviews stationary processes and explains locally stationary processes. In Section 3, we propose the empirical likelihood method for non-Gaussian locally stationary processes and give its asymptotic properties. In Section 4, we give numerical studies on confidence intervals of the autocorrelation for locally stationary processes. Proofs of the theorems are given in Section 5.

Locally Stationary Processes
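The stationary process is a fundamental setting in time series analysis. If the process $\{X_t\}_{t \in \mathbb{Z}}$ is stationary with mean zero, it is known to have the spectral representation

$$X_t = \int_{-\pi}^{\pi} \exp(i\lambda t)\, A(\lambda)\, d\xi(\lambda), \tag{2.1}$$

where $A(\lambda)$ is a $2\pi$-periodic complex-valued function with $A(-\lambda) = \overline{A(\lambda)}$, called the transfer function, and $\xi(\lambda)$ is a stochastic process on $[-\pi, \pi]$ with $\xi(-\lambda) = \overline{\xi(\lambda)}$ whose increments are normalized through $\eta(\lambda) = \sum_{j=-\infty}^{\infty} \delta(\lambda + 2\pi j)$, the $2\pi$-periodic extension of the Dirac delta function. If the process is stationary, the covariance between $X_t$ and $X_{t+k}$ is independent of time $t$ and is a function of the time lag $k$ only. We denote it by $\gamma(k) = \mathrm{Cov}(X_t, X_{t+k})$. The Fourier transform of the autocovariance function is called the spectral density function. In the expression (2.1), the spectral density function is written as $g(\lambda) = |A(\lambda)|^2$. It is estimated by the periodogram, defined by

$$I_T(\lambda) = \frac{1}{2\pi T} \left| \sum_{t=1}^{T} X_t \exp(-i\lambda t) \right|^2 .$$

If one wants to change the weight of each observation, one can insert a function $h(x)$ defined on $[0, 1]$ into the periodogram:

$$I_T(\lambda) = \Bigl( 2\pi \sum_{t=1}^{T} h(t/T)^2 \Bigr)^{-1} \left| \sum_{t=1}^{T} h(t/T)\, X_t \exp(-i\lambda t) \right|^2 .$$

The function $h(x)$ is called a data taper. Now, we give a simple example of a stationary process.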
Example 2.1. Consider the AR($p$) process whose innovations $\varepsilon_t$ are independent random variables with mean zero and variance 1. This process is obtained in the form (2.1) by a suitable choice of the transfer function $A(\lambda)$, given in (2.5).
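The periodogram and its tapered variant described above can be sketched in code as follows. This is a minimal illustration, assuming the usual normalization by $2\pi \sum_t h(t/T)^2$ (which reduces to $2\pi T$ without a taper); the function names `periodogram` and `cosine_taper` are hypothetical, and the cosine taper is the one used later in the numerical study.

```python
import cmath
import math


def periodogram(x, lam, taper=None):
    """Periodogram of x (observations at times t = 1..T) at frequency lam.

    Assumed normalization:
        |sum_t h(t/T) x_t exp(-i lam t)|^2 / (2 pi sum_t h(t/T)^2),
    which is the untapered 1/(2 pi T) form when h is identically 1.
    """
    T = len(x)
    h = [1.0 if taper is None else taper(t / T) for t in range(1, T + 1)]
    d = sum(h[t - 1] * x[t - 1] * cmath.exp(-1j * lam * t)
            for t in range(1, T + 1))
    return abs(d) ** 2 / (2.0 * math.pi * sum(v * v for v in h))


def cosine_taper(x):
    """h(x) = (1/2){1 - cos(2 pi x)} on [0, 1], zero elsewhere."""
    if 0.0 <= x <= 1.0:
        return 0.5 * (1.0 - math.cos(2.0 * math.pi * x))
    return 0.0
```

Calling `periodogram(x, lam)` gives the untapered estimate, while `periodogram(x, lam, taper=cosine_taper)` downweights observations near both ends of the sample.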
As an extension of the stationary process, Dahlhaus [13] introduced the concept of local stationarity. An example of a locally stationary process is the time-varying AR($p$) process (2.6), in which each coefficient $a_j(u)$ is a function defined on $[0, 1]$ and $\varepsilon_t$ are independent random variables with mean zero and variance 1. If all $a_j(u)$ are constant, the process (2.6) reduces to a stationary process.
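where $\mathrm{cum}\{\cdots\}$ denotes the cumulant of $k$th order, $q_1 = 0$, and $q_2(\lambda) = 1$.
(ii) There exist a constant $K$ and a $2\pi$-periodic function $A(u, \lambda)$ such that the required bound holds for all $T$; $A(u, \lambda)$ is assumed to be continuous in $u$.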
The time-varying spectral density is defined by $g(u, \lambda) := |A(u, \lambda)|^2$. As an estimator of $g(u, \lambda)$, we define the local periodogram $I_N(u, \lambda)$ for even $N$ as follows:

2.11
Here, $h : \mathbb{R} \to \mathbb{R}$ is a data taper with $h(x) = 0$ for $x \notin [0, 1]$. Thus, $I_N(u, \lambda)$ is nothing but the periodogram over a segment of length $N$ with midpoint $uT$. The shift from segment to segment is denoted by $S$, which means that we calculate $I_N$ with midpoints $t_j = S(j - 1) + N/2$ $(j = 1, \ldots, M)$, where $T = S(M - 1) + N$, or, written in rescaled time, at time points $u_j := t_j/T$. Hereafter we set $S = 1$ rather than $S = N$, which means that the segments overlap each other.
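The segmentation just described can be sketched as follows. This is only an illustration under stated assumptions: the exact time indices covered by each segment and the taper argument $s/N$ are choices made here for concreteness, and the names `segment_midpoints` and `local_periodogram` are hypothetical.

```python
import cmath
import math


def segment_midpoints(T, N, S=1):
    """Midpoints t_j = S(j-1) + N/2, j = 1..M, where T = S(M-1) + N."""
    if N % 2 != 0 or (T - N) % S != 0:
        raise ValueError("need even N and T = S(M-1) + N for an integer M")
    M = (T - N) // S + 1
    return [S * (j - 1) + N // 2 for j in range(1, M + 1)]


def local_periodogram(x, t_mid, N, lam, taper):
    """Tapered periodogram of the length-N segment of x centred at t_mid.

    Assumption: the segment covers times t_mid - N/2 + 1, ..., t_mid + N/2
    (1-based), and its s-th point is weighted by taper(s / N), s = 1..N.
    """
    start = t_mid - N // 2  # 0-based index of the first point in the segment
    seg = x[start:start + N]
    h = [taper(s / N) for s in range(1, N + 1)]
    d = sum(h[s] * seg[s] * cmath.exp(-1j * lam * (s + 1)) for s in range(N))
    return abs(d) ** 2 / (2.0 * math.pi * sum(v * v for v in h))
```

With $S = 1$ consecutive segments share $N - 1$ observations, so `segment_midpoints(T, N)` returns $M = T - N + 1$ midpoints; dividing each by $T$ gives the rescaled time points $u_j$.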

Empirical Likelihood Approach for Non-Gaussian Locally Stationary Processes
Consider an inference on a parameter $\theta \in \Theta \subset \mathbb{R}^q$ based on an observed stretch $X_{1,T}, \ldots, X_{T,T}$. We suppose that information about $\theta$ is available through a system of general estimating equations.
For short- or long-memory processes, Nordman and Lahiri [10] supposed that $\theta_0$, the true value of $\theta$, is specified by a spectral moment condition involving an appropriate function $\phi(\lambda, \theta)$ depending on $\theta$. In the same manner, we suppose that $\theta_0$ satisfies the following time-varying spectral moment condition in a locally stationary setting:

$$\int_0^1 \int_{-\pi}^{\pi} \phi(u, \lambda, \theta_0)\, g(u, \lambda)\, d\lambda\, du = 0. \tag{3.2}$$

Here $\phi : [0, 1] \times [-\pi, \pi] \times \mathbb{R}^q \to \mathbb{C}^q$ is a function depending on $\theta$ and satisfies Assumption 3.4.

3.4
In the stationary case, that is, when $g(u, \lambda)$ is independent of the time parameter $u$, (3.4) reduces to a quantity that corresponds to the autocorrelation with lag $k$. So (3.4) can be interpreted as a kind of autocorrelation with lag $k$ for locally stationary processes.
Example 3.2 (Whittle estimation). Consider the problem of fitting a parametric spectral model to the true spectral density by minimizing the disparity between them. For stationary processes, this problem is considered in Hosoya and Taniguchi [14] and Kakizawa [15]. For locally stationary processes, the disparity between the parametric model $g_\theta(u, \lambda)$ and the true spectral density $g(u, \lambda)$ is measured by a functional $L(\theta)$, and we seek the minimizer

$$\theta_0 = \arg\min_{\theta \in \Theta} L(\theta). \tag{3.7}$$

Under appropriate conditions, $\theta_0$ in (3.7) is obtained by solving the equation $\partial L(\theta)/\partial\theta = 0$.
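Suppose that the fitting model is described as $g_\theta(u, \lambda) = \sigma^2(u) f_\theta(u, \lambda)$, which means that $\theta$ is free from the innovation part $\sigma^2(u)$. Then, by Kolmogorov's formula (Dahlhaus [11, Theorem 3.2]), we can see that $\int_{-\pi}^{\pi} \log g_\theta(u, \lambda)\, d\lambda$ is independent of $\theta$, so the differential condition on $\theta_0$ takes the form of the time-varying spectral moment condition above for a particular choice of $\phi$.

Now, we set

$$m_j(\theta) = \int_{-\pi}^{\pi} \phi(u_j, \lambda, \theta)\, I_N(u_j, \lambda)\, d\lambda \quad (j = 1, \ldots, M) \tag{3.10}$$

as an estimating function and use the empirical likelihood ratio function $R(\theta)$ defined by

$$R(\theta) = \max_{w} \Bigl\{ \prod_{j=1}^{M} M w_j \;\Big|\; \sum_{j=1}^{M} w_j m_j(\theta) = 0,\ w_j \ge 0,\ \sum_{j=1}^{M} w_j = 1 \Bigr\}. \tag{3.11}$$

Denote the maximum empirical likelihood estimator by $\hat\theta$, which maximizes the empirical likelihood ratio function $R(\theta)$.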
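For a scalar parameter ($q = 1$), the empirical likelihood ratio can be evaluated numerically once the values $m_1(\theta), \ldots, m_M(\theta)$ are available. The sketch below assumes the usual form of the problem (maximize $\prod_j M w_j$ subject to $\sum_j w_j m_j = 0$, $w_j \ge 0$, $\sum_j w_j = 1$) and solves the one-dimensional dual equation by bisection; the function name `log_el_ratio` and the choice of bisection are illustrative assumptions, not the procedure used in the numerical study.

```python
import math


def log_el_ratio(m, n_iter=200):
    """log R for scalar estimating-function values m_1..m_M.

    The maximizing weights are w_j = 1 / (M (1 + a m_j)), where the
    multiplier a solves sum_j m_j / (1 + a m_j) = 0. The left-hand side
    is strictly decreasing in a, so bisection applies.
    Returns (log R, weights), or (-inf, None) when 0 lies outside the
    open interval (min m, max m), where the constraint cannot be met
    with strictly positive weights.
    """
    M = len(m)
    lo_m, hi_m = min(m), max(m)
    if not (lo_m < 0.0 < hi_m):
        return float("-inf"), None

    def score(a):
        return sum(v / (1.0 + a * v) for v in m)

    # 1 + a m_j > 0 for every j  <=>  -1/max(m) < a < -1/min(m)
    lo, hi = -1.0 / hi_m, -1.0 / lo_m
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    weights = [1.0 / (M * (1.0 + a * v)) for v in m]
    return -sum(math.log(1.0 + a * v) for v in m), weights
```

For vector-valued $m_j(\theta)$ the same dual problem has to be solved in $\mathbb{R}^q$, for example by Newton's method; the scalar case is shown because it is the one relevant to a single autocorrelation.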
Remark 3.3. We can also use an alternative estimating function instead of $m_j(\theta)$ in (3.10). The asymptotic equivalence of $m_j(\theta)$ and the alternative holds for any $j$, as is shown by straightforward calculation.
To show the asymptotic properties of $\hat\theta$ and $R(\theta_0)$, we impose the following assumption.
Assumption 3.4. (i) The functions $A(u, \lambda)$ and $\phi(u, \lambda, \theta)$ are $2\pi$-periodic in $\lambda$, and the periodic extensions are differentiable in $u$ and $\lambda$ with uniformly bounded derivative $(\partial/\partial u)(\partial/\partial\lambda) A$ ($\phi$, resp.).
(ii) The parameters $N$ and $T$ fulfill suitable growth relations (cf. Assumption A.1 of Dahlhaus [13]).
(iii) The data taper $h : \mathbb{R} \to \mathbb{R}$ with $h(x) = 0$ for all $x \notin [0, 1]$ is continuous on $\mathbb{R}$ and twice differentiable at all $x \notin p$, where $p$ is a finite set and $\sup_{x \notin p} |h''(x)| < \infty$.
(iv) For $k = 1, \ldots, 8$,
Remark 3.5. Assumption 3.4(ii) seems to be restrictive. However, it is required in order to use the central limit theorem for locally stationary processes (cf. Assumption A.1 and Theorem A.2 of Dahlhaus [13]). Most of the restrictions on $N$ result from the $\sqrt{T}$-unbiasedness in the central limit theorem. See also Remarks A.3 of Dahlhaus [13] for details.
Now we give the following theorem.
Theorem 3.6. Suppose that Assumption 3.4 holds and $X_{1,T}, \ldots, X_{T,T}$ is a realization of the locally stationary process which has the representation (2.8). Then,

3.16
Here $\Sigma_1$ and $\Sigma_2$ are the $q \times q$ matrices whose $(i, j)$ elements are given by (3.17) and (3.18), respectively, and $\Sigma_3$ is the $q \times q$ matrix defined in (3.19).
In addition, we give the following theorem on the asymptotic property of the empirical likelihood ratio $R(\theta_0)$.
Theorem 3.7. Suppose that Assumption 3.4 holds and $X_{1,T}, \ldots, X_{T,T}$ is a realization of a locally stationary process which has the representation (2.8). Then, as $T \to \infty$,

$$-\frac{1}{\pi} \log R(\theta_0) \xrightarrow{d} (FN)'(FN),$$

where $N$ is a $q$-dimensional normal random vector with zero mean vector and covariance matrix $I_q$ (the identity matrix), and $F$ is a $q \times q$ matrix determined by $\Sigma_1$ and $\Sigma_2$, the same matrices as in Theorem 3.6.
Remark 3.8. Denote the eigenvalues of $F'F$ by $a_1, \ldots, a_q$. Then the limiting distribution in Theorem 3.7 can be written as that of $\sum_{i=1}^{q} Z_i$, where the $Z_i$ are independently distributed as $\mathrm{Gamma}(1/2, 1/(2a_i))$.
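Percentiles of this limiting distribution are not tabulated in general, but they can be approximated by simulation. The sketch below reads $\mathrm{Gamma}(1/2, 1/(2a_i))$ as shape $1/2$ and rate $1/(2a_i)$, which is the distribution of $a_i Z^2$ for a standard normal $Z$; this reading, the function name `weighted_chi2_quantile`, and the Monte Carlo approach itself are assumptions made for illustration.

```python
import random


def weighted_chi2_quantile(a, p, n_draws=100_000, seed=0):
    """Monte Carlo p-quantile of sum_i a_i * Z_i^2 with Z_i i.i.d. N(0, 1)."""
    rng = random.Random(seed)
    draws = sorted(
        sum(a_i * rng.gauss(0.0, 1.0) ** 2 for a_i in a)
        for _ in range(n_draws)
    )
    return draws[min(n_draws - 1, int(p * n_draws))]
```

When all $a_i$ equal one, the result approximates the corresponding chi-square percentile with $q$ degrees of freedom.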
Remark 3.9. If the process is stationary, that is, if the time-varying spectral density is independent of the time parameter $u$, we can easily see that $\Sigma_1 = \Sigma_2$, and the asymptotic distribution becomes the chi-square distribution with $q$ degrees of freedom.
Remark 3.10. In our setting, the number of estimating equations equals the number of parameters. In that case, the empirical likelihood ratio at the maximum empirical likelihood estimator, $R(\hat\theta)$, becomes one (cf. [3, page 305]). This means that the test statistic in Theorem 3.7 becomes zero when we evaluate it at the maximum empirical likelihood estimator.

Numerical Example
In this section, we present simulation results for the estimation of the autocorrelation in locally stationary processes stated in Example 3.1. Consider the time-varying AR(1) process (4.1) with innovations $\varepsilon_t \sim \mathrm{Gamma}(3/\pi, (3/\pi)^{1/2}) - (3/\pi)^{1/2}$ and coefficient function $a(u) = (u - b)^2$, $b = 0.1, 0.5, 0.9$. The observations $X_{1,T}, \ldots, X_{T,T}$ are generated from the process (4.1), and we construct confidence intervals for the autocorrelation with lag $k = 1$ by collecting the points $\theta$ which satisfy $-\pi^{-1} \log R(\theta) < z_\alpha$, where $z_\alpha$ is the $\alpha$-percentile of the asymptotic distribution in Theorem 3.7. We admit that Assumption 3.4(ii) is hard to satisfy in a finite-sample experiment, but this Monte Carlo simulation is purely illustrative and is intended only to investigate how the sample size and the window length affect the resulting confidence intervals.
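A data-generating step of this kind can be sketched as follows. Two assumptions are made and should be read as such: the Gamma distribution is parametrized by shape $3/\pi$ and rate $(3/\pi)^{1/2}$, which gives the centred innovations mean zero and variance 1, and the recursion is taken as $X_{t,T} = a(t/T) X_{t-1,T} + \varepsilon_t$ with $X_{0,T} = 0$, whose sign convention is chosen here for illustration. The function names are hypothetical.

```python
import math
import random


def centred_gamma(rng):
    """Gamma(shape 3/pi, rate sqrt(3/pi)) minus its mean: mean 0, variance 1."""
    shape = 3.0 / math.pi
    rate = math.sqrt(3.0 / math.pi)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    return rng.gammavariate(shape, 1.0 / rate) - shape / rate


def simulate_tvar1(T, b, seed=0):
    """Simulate X_{1,T}, ..., X_{T,T} with coefficient a(u) = (u - b)^2.

    Assumed recursion (sign convention chosen for illustration):
        X_{t,T} = a(t/T) X_{t-1,T} + eps_t,  X_{0,T} = 0.
    """
    rng = random.Random(seed)
    x, prev = [], 0.0
    for t in range(1, T + 1):
        a = (t / T - b) ** 2
        prev = a * prev + centred_gamma(rng)
        x.append(prev)
    return x
```

For $b \in \{0.1, 0.5, 0.9\}$ the coefficient satisfies $0 \le a(u) \le 0.81$ on $[0, 1]$, so the simulated paths stay stable under this recursion.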
We set the confidence level as $\alpha = 0.90$ and carry out the above procedure 1000 times for each case. Table 1 shows the averages of the lower and upper bounds, the lengths of the intervals, and the success rates. The results show that a larger sample size gives a shorter interval, as expected. Furthermore, the results indicate that a larger window length leads to a worse success rate. They suggest that the best ratio $N/T$ lies around 0.02, because the combination $(T, N) = (500, 10)$ seems to give the best result among all.

Some Lemmas
In this subsection we give three lemmas used to prove Theorems 3.6 and 3.7. First of all, we introduce the function $L_N : \mathbb{R} \to \mathbb{R}$, defined as the $2\pi$-periodic extension of

$$L_N(\alpha) := \begin{cases} N, & |\alpha| \le 1/N, \\ 1/|\alpha|, & 1/N \le |\alpha| \le \pi. \end{cases} \tag{5.1}$$

The properties of the function $L_N$ are described in Lemma A.4 of Dahlhaus [13].

Advances in Decision Sciences
Lemma 5.1. Suppose that (3.2) and Assumption 3.4 hold. Then for

5.4
As in the proof of Theorem 2.2 of Dahlhaus [12], we replace $A$ with some constant $K$ while for $j = 2, \ldots, k$. The replacement error is smaller than

5.7
In the same way we replace $A^{\circ}_{u_j T - N/2 + s_j,\,T}(\omega_j)$ by $A(u_j + (-N/2 + s_j)/T, \lambda_j)$ for $j = 2, \ldots, k$, and then we obtain

5.8
The integral part is equal to (5.9). So we get

5.10
Since $h(x) = 0$ for $x \notin [0, 1]$, we only have to consider the range of $s$ which satisfies $1 \le s + u_k T - u_j T \le N - 1$. Therefore we can regard $(-N/2 + s + u_k T - u_j T)/T$ as $O(N/T)$, and a Taylor expansion of $A$ around $u_j$ gives the first equation of the desired result. Moreover, in the same manner as in the proof of Lemma A.5 of Dahlhaus [13], we obtain a bound which leads to the second equation.

5.12
Proof. We set

5.13
Henceforth we denote $\phi(u, \lambda, \theta_0)$ by $\phi(u, \lambda)$ for simplicity. This lemma is proved by showing the convergence of the cumulants of all orders. Due to Lemma A.8 of Dahlhaus [13], the expectation of $P_M$ can be evaluated explicitly. Then the $(\alpha, \beta)$-element of the covariance matrix of $P_M$ is equal to

5.17
By Assumption 3.4(iv), the covariance tends to $\Sigma_1$.
The $k$th cumulant for $k \ge 3$ tends to zero due to Lemma A.10 of Dahlhaus [13]. Then we obtain the desired result.
5.18
Proof. First we calculate the mean of the $(\alpha, \beta)$-element of $S_M$: $\cdots\, E I_N(u_j, \lambda)\, E I_N(u_j, \mu)\, d\lambda\, d\mu$.

5.19
Due to Dahlhaus [12, Theorem 2.2(i)], the second term of (5.19) becomes

5.20
Next we consider $\mathrm{cov}\{I_N(u_j, \lambda), I_N(u_j, \mu)\}$.

We calculate the three terms separately. From Lemma 5.1, the first term of (5.21) is equal to

5.22
It converges to zero when $\lambda \ne -\mu$ and has a separate limit when $\lambda = -\mu$. Similarly, the second term of (5.21) converges to zero when $\lambda \ne \mu$ and is equal to (5.23) when $\lambda = \mu$. We can also apply Lemma 5.1 to the third term of (5.21), and an analogous calculation shows that it converges to zero. Altogether, we see that (5.19) converges to $(\Sigma_2)_{\alpha\beta}$, the $(\alpha, \beta)$-element of $\Sigma_2$.
Next we calculate the second-order cumulant, which is equal to

5.25
Using the product theorem for cumulants (cf. [16, Theorem 2.3.2]), we have to sum over all indecomposable partitions $\{P_1, \ldots, P_m\}$ with $|P_i| \ge 2$ of the scheme

5.26
We can apply Lemma 5.1 to all cumulants appearing in (5.25), and the dominant term of the cumulants is $o(N^4)$, so (5.25) tends to zero. Then we obtain the desired result.

Proof of Theorem 3.6
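Using the lemmas in Section 5.1, we prove Theorem 3.6. To find the maximizing weights $w_j$ of (3.11), we proceed by the Lagrange multiplier method. Write

$$G = \sum_{j=1}^{M} \log(M w_j) - M \alpha' \sum_{j=1}^{M} w_j m_j(\theta) + \gamma \Bigl( \sum_{j=1}^{M} w_j - 1 \Bigr),$$

where $\alpha \in \mathbb{R}^q$ and $\gamma \in \mathbb{R}$ are Lagrange multipliers. Setting $\partial G/\partial w_j = 0$ gives

$$\frac{\partial G}{\partial w_j} = \frac{1}{w_j} - M \alpha' m_j(\theta) + \gamma = 0. \tag{5.28}$$

So the equation $\sum_{j=1}^{M} w_j\, \partial G/\partial w_j = 0$, together with the constraints on the weights, gives $\gamma = -M$. Then, we may write

$$w_j = \frac{1}{M} \cdot \frac{1}{1 + \alpha' m_j(\theta)}.$$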

Denote

5.33
Then, from (5.30) and (5.32), we have

5.36
From Lemmas A.8 and A.9 of Dahlhaus [13], we have

5.38
Similarly, we have

5.39
Next, from Lemma 5.3, we obtain

5.43
Because of Lemma 5.2, we have

5.44
From this and the relations (5.42) and (5.43), we can see that $\delta_M = O_p(M^{-1/2})$. Again, from (5.42), (5.43), and Lemma 5.2, direct calculation gives that

5.52
The above expectation is written in terms of $E\{I_N(u_j, \lambda_1)\, I_N(u_j, \lambda_2)\, I_N(u_j, \lambda_3)\}$.

5.66
Then we can obtain the desired result.
result of Theorem 3.7. Several combinations of the sample size $T$ and the window length $N$ are chosen: $(T, N) = (100, 10), (500, 10), (500, 50), (1000, 10), (1000, 100)$, and the data taper is set as $h(x) = (1/2)\{1 - \cos(2\pi x)\}$. Then we calculate the values of the test statistic $-\pi^{-1} \log R(\theta)$ at numerous points $\theta$ and obtain confidence intervals by

By (3.2) and $O(M) = O(T)$, this converges to zero. Next, we calculate the covariance of $P_M$. From the relation $T = M + N - 1$, we can rewrite

Table 1: 90% confidence intervals of the autocorrelation with lag $k = 1$.