Let F^n be an estimator obtained by integrating a kernel-type density estimator based
on a random sample of size n. A central limit theorem is established for the target
statistic F^n(ξ^n), where the underlying random vectors form an asymptotically stationary
absolutely regular stochastic process, and
ξ^n
is an estimator of a multivariate parameter
ξ
built from a vector of U-statistics. The results obtained extend or generalize previous
results from the stationary univariate case to the asymptotically
stationary multivariate case. An example of an asymptotically
stationary, absolutely regular multivariate ARMA process and an example of a useful
estimation of F(ξ) are given in the applications.
1. Introduction
The purpose of this paper is to estimate the value of a multivariate distribution function, called the target distribution function, at a given point, when observing a nonstationary process. Clearly, there must be a connection between the process and the target distribution. We will assume
that as time goes, the marginal distribution of the process gets closer and closer to the target in a suitable sense. The point at which we want to estimate the target distribution is not any bona fide vector, for we will assume that it can be estimated by a vector of U-statistics. Such a problem is clearly out of reach with that generality, and we will assume that, though nonstationary, the process exhibits an asymptotic form of stationarity and has a suitable mixing property. Those will be defined formally after this general introduction.
Let (Xi)i≥1 be a stochastic process indexed by the positive integers, taking values in a finite-dimensional Euclidean space ℍ. Identifying ℍ with a product
of a finite number of copies of the real line, we write Fi for the distribution function of Xi. We will assume that the process has some form of asymptotic stationarity, implying that the sequence Fi converges, in a sense to be made precise, to a limiting distribution function F.
For i≤j, let 𝒜ij denote the σ-algebra of events generated by Xi,…,Xj.
We will say that the nonstationary stochastic process
is absolutely regular if supn∈ℕ*max1≤j≤n−mE{supA∈𝒜j+m∞∣P(A∣𝒜1j)−P(A)∣}=β(m)↓0 as m→∞, where ℕ*={1, 2, …}.
Assume that for some positive δ less than 2/5, β(n)=𝒪(n−(2+δ)/δ).
We consider a parameter ξ in ℍ whose
components can be naturally estimated by U-statistics. To be more formal and precise, we assume that ξ is defined as follows. Let m be a positive integer, the degree of the U-statistics. Let Φ be a function from ℍm into ℍ, invariant under permutation of its arguments. We are
interested in parameters of the form ξ=∫ℍmΦdF⊗m=∫ℍmΦ(x1,…,xm)∏l=1mdF(xl), and the function Φ is called the kernel of the parameter ξ.
Example 1.1.
Take ℍ to be ℝ. The mean corresponds to taking m=1, with Φ the
identity.
Example 1.2.
Take ℍ to be ℝ2. Consider ξ to be the 2-dimensional vector whose components are the marginal variances. We take m=2 and Φ is going to be a function defined on (ℝ2)2. It has two arguments, each being in ℝ2, and it is defined by Φ((u,v),(u′,v′))=(u2+u′22−uu′,v2+v′22−vv′).
Such a parameter can be estimated naturally by U-statistics, essentially replacing F⊗m in (1.3) by an empirical counterpart. By using the invariance of Φ, the estimator of ξ is then of the form ξ^n=(nm)−1∑1≤i1<⋯<im≤nΦ(Xi1,…,Xim).
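As a sketch of how the U-statistic (1.5) can be computed in practice, here is a minimal Python illustration using the variance kernel of Example 1.2; the i.i.d. sample, the function names, and all numerical choices are ours, not from the paper.

```python
import itertools

import numpy as np

def u_statistic(sample, phi, m):
    """U-statistic (1.5): average of the symmetric kernel phi
    over all m-element subsets of the sample."""
    combs = list(itertools.combinations(sample, m))
    return sum(phi(*c) for c in combs) / len(combs)

def phi_var(p, q):
    """Kernel of Example 1.2 on H = R^2: estimates the two marginal variances."""
    (u, v), (u2, v2) = p, q
    return np.array([(u**2 + u2**2) / 2 - u * u2,
                     (v**2 + v2**2) / 2 - v * v2])

rng = np.random.default_rng(0)
X = [tuple(row) for row in rng.normal(size=(40, 2))]
xi_hat = u_statistic(X, phi_var, m=2)
# For this kernel, each pair contributes (u - u')^2 / 2 componentwise,
# so xi_hat coincides with the unbiased sample variances.
```

For this particular kernel the U-statistic reduces to the classical unbiased sample variance of each coordinate, which makes the construction easy to check numerically.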
Having described the parameter ξ and its estimator, we return to our problem: we need to define an estimator of the distribution function F.
A natural one would be the empirical distribution
function calculated on the observed values of the process. Even though the empirical distribution function is optimal with respect to the rate of convergence of the mean square error, it is not appropriate here, since it takes no account of the fact that F is smooth, and
in particular of the existence of a density f.
It is, therefore, natural to seek an estimator of the
target distribution which is smooth. A good candidate is to smooth the empirical distribution function with a kernel. Another way to introduce this estimator is to say that we integrate a standard kernel estimator of the density.
Such an estimator estimates the mean distribution
function F¯n=n−1∑1≤i≤nFi. But since the sequence Fi has a limit F, it estimates the limit F as well. To be explicit, we consider a sequence Kn, n≥1, of distribution functions converging, in the usual sense of convergence in distribution, to that of the point mass at the origin. We write Fn for the empirical distribution function pertaining to the measure having mass 1/n at each sample point Xi, 1≤i≤n. Our nonparametric estimator of F is
F^n=Fn⋆Kn,
where ⋆ denotes the
convolution operator. Finally, our estimator of F(ξ) is F^n(ξ^n).
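To fix ideas, here is a minimal numerical sketch of F^n=Fn⋆Kn and of the target statistic F^n(ξ^n) in Python; the logistic smoothing distribution, the bandwidth rate, and the i.i.d. sample are our illustrative choices, not prescribed by the paper.

```python
import numpy as np

def smoothed_ecdf(x, sample, a_n):
    """F_hat_n(x) = (F_n * K_n)(x): convolving the empirical d.f. with K_n
    amounts to averaging K((x - X_l)/a_n) over the sample, K a smooth d.f."""
    t = (x - np.asarray(sample)) / a_n
    return float(np.mean(1.0 / (1.0 + np.exp(-t))))  # logistic d.f. as K

rng = np.random.default_rng(1)
sample = rng.normal(size=500)            # i.i.d. stand-in for the process
a_n = len(sample) ** (-1 / 5)            # window-width a_n -> 0 as n grows
xi_hat = float(np.mean(sample))          # degree-1 U-statistic (Example 1.1)
F_hat_at_xi = smoothed_ecdf(xi_hat, sample, a_n)
# F_hat_at_xi estimates F(xi); here F is standard normal and xi is its
# mean, so the target value F(xi) is 1/2.
```

Unlike the raw empirical distribution function, this estimator is smooth in x, which is what allows the differential DF^n used later in the linearization arguments.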
Our method adapts some of the ideas of Puri and Ralescu [1], who proved a central limit theorem for F^n(ξ^n) in the i.i.d. case; this was generalized by Sun [2] to the stationary absolutely regular case, and Sun [3] then proved the asymptotic normality of F^n and of the perturbed sample quantiles under a nonstationary strong mixing condition. We also mention Harel and Puri [4, 5], who proved central limit theorems for U-statistics of nonstationary (not necessarily bounded) strong mixing double arrays of random variables; Ducharme and Mint El Mouvid [6], who proved limit theorems for the conditional cumulative distribution function by using the convergence of the ratio of two U-statistics; and Oodaira and Yoshihara [7], who obtained the law of the iterated logarithm for sums of random variables satisfying absolute regularity. Harel and Puri [8] proved the law of the iterated logarithm for the perturbed empirical distribution function when the random variables are nonstationary absolutely regular; this result was later generalized to the strong mixing condition by Sun and Chiang [9]. In addition, some of the ideas of Billingsley [10] and Yoshihara [11] have been used to study our problem. For limit theorems dealing with U-statistics for processes which are uniformly mixing in both directions of time, the reader is also referred to Denker and Keller [12].
2. Preliminaries
To specify our assumption on the process, it is
convenient to introduce copies of ℍ. Hence we write ℍi, i≥1, for an infinite sequence of copies of ℍ. The basic idea is to think of the process at time i as taking values
in ℍi and we think of
each ℍi as the ith component
of ℍ∞. We then agree
on the following definition.
Definition 2.1.
A canonical p-subspace of ℍ∞ is any subspace of the form ℍi1⊕⋯⊕ℍipwith 1≤i1<⋯<ip. We write 𝕊p for a generic canonical p-subspace.
Remark 2.2.
For (i1,…,ip)≠(j1,…,jp), if we write 𝕊p=ℍi1⊕⋯⊕ℍip and 𝕊p′=ℍj1⊕⋯⊕ℍjp, then 𝕊p≠𝕊p′, with 𝕊p⊂ℍ∞ and 𝕊p′⊂ℍ∞.
The origin of this terminology is that when ℍ is the real
line, then a canonical p-subspace is a
subspace spanned by exactly p distinct vectors of the canonical basis of ℍ∞. We write ∑𝕊p⊂ℍn for a sum over all canonical p-subspaces
included in ℍn.
To such a canonical subspace 𝕊p=ℍi1⊕⋯⊕ℍip, we can associate the distribution function F𝕊p of (Xi1, …, Xip), as well as the
distribution function with the same marginals F⊗𝕊p=⊗1≤j≤pFij=⊗ℍi⊂𝕊pFi. Clearly, the marginals of F⊗𝕊p are independent, while those of F𝕊p are not.
Consider two nested canonical subspaces 𝕊p and 𝕊m−p, where 𝕊m−p⊂ℍn⊖𝕊p. For a function ϕ symmetric in its arguments and defined on 𝕊p⊕𝕊m−p, we can define its projection onto the functions defined on 𝕊p by x∈𝕊p↦ϕ(x,𝕊m−p)=∫𝕊m−pϕ(x,y)dF⊗𝕊m−p(y).
Identifying 𝕊p⊕𝕊m−p with ℍm and 𝕊p with ℍp allows us to project functions defined on ℍm onto functions defined on ℍp. However, with this identification, the projection depends on the particular choice of 𝕊m−p in ℍn. To remove the dependence on 𝕊m−p, we average over all choices of 𝕊m−p in ℍn⊖𝕊p: ϕ𝕊p(x)=(n−pm−p)−1∑𝕊m−p⊂ℍn⊖𝕊pϕ(x,𝕊m−p).
Given U-statistics of degree m, Un=(nm)−1∑1≤i1<⋯<im≤nϕ(Xi1,…,Xim), we can then define an analogue of Hoeffding decomposition (e.g., Hoeffding
[13]) when the random variables come from a nonstationary process. For this purpose, consider
first the expectation Un would have if the process had no dependence, namely, Un, 0=(nm)−1∑𝕊m⊂ℍn∫𝕊mϕdF⊗𝕊m.
Then for any p=1,…,m, we define Un,p=(np)−1∑𝕊p⊂ℍn∫𝕊pϕ𝕊pd⊗ℍi⊂𝕊p(δXi−Fi), where δx denotes the Dirac measure (point mass) at x.
Finally, for p>m, we set Un,p=0. The analogue of the Hoeffding decomposition is the equality Un=∑0≤p≤m(mp)Un,p.
When we have a vector of U-statistics defined by a function Φ as in (1.3), we
can write the decomposition componentwise. This is a little cumbersome to write
explicitly. Identifying ℍ with ℝd, say, we write ξ^n=(ξ^n,j)1≤j≤d and each U-statistic ξ^n,j
has a Hoeffding type decomposition ξ^n,j=∑0≤p≤mj(mjp)ξ^n,j,p, where ξ^n,j,p=∑𝕊p⊂ℍn∫𝕊pϕj,𝕊pd⊗ℍi⊂𝕊p(δXi−Fi)
and ϕj,𝕊p
is defined by (2.3) for the component ϕj of Φ.
We can construct the vector ξ^n,⋅,p=((mjp)ξ^n,j,p)1≤j≤d. Now, writing m for the largest
of the mj's, we can write a vector version of the Hoeffding decomposition ξ^n=∑0≤p≤mξ^n,⋅,p. Note that this decomposition makes an explicit use of convention (2.7), and this
is why this convention was introduced.
We now need to specify exactly what we mean by asymptotic stationarity of a process. For this, recall the following notion of distance
between probability measures.
Definition 2.3.
The distance in total variation
between two probability measures P and Q defined on the same σ-algebra 𝒜
is |P−Q|𝒜=supA∈𝒜|P(A)−Q(A)|. If 𝕊p is a canonical subspace of ℍ∞, we write σ𝕊p for the σ-algebra generated by the Xi's with ℍi⊂𝕊p. We write P for the probability measure pertaining to the process (Xi)i≥1, which is a probability measure on ℍ∞.
Definition 2.4.
The process (Xi)i≥1
with probability measure P on ℍ∞ is geometrically asymptotically (pairwise) stationary if there exists a strictly stationary process with distribution Q on ℍ∞, and a positive τ less than 1, such that for 1≤i<j, |P−Q|σℍi⊕ℍj≤τi.Since Q is strictly
stationary in this definition, its restriction to σℍi⊕ℍj depends in fact
only on j−i. Hence, this definition asserts that the process Xi,Xi+1,… is very close
to being stationary when i is large. It
also implies that |P−Q|σℍi≤τi.
This asserts that the marginal distribution of the process converges
geometrically fast to a fixed distribution.
We suppose that there exists a strictly stationary
process (Xi*)i≥1 with
probability measure Q on ℍ∞, which is absolutely regular with the same rate as
the process (Xi)i≥1. We denote by F the
distribution function of Xi*.
We define the function ϕ* on ℍ1 by x∈ℍ1↦ϕ*(x,ℍm⊖ℍ1)=∫ℍm⊖ℍ1ϕ(x,y)dF⊗(m−1).
Next, we denote ξl, 1*=∫ℍlϕ*d(δXl*−F). Identifying ℍ with ℝd, say, the
vector of U-statistics being defined by a vector function Φ, we can write ξl, 1*=(ξl,j, 1*)1≤j≤d. We can construct the vector ξl,⋅, 1*=(mjξl,j, 1*)1≤j≤d.
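For concreteness, here is an illustrative computation (ours, not part of the original argument): with d=1 and the one-dimensional variance kernel ϕ(x,y)=(x²+y²)/2−xy of Example 1.2, writing μ and σ² for the mean and variance of F, the projection ϕ* and the term ξl, 1* become

```latex
\phi^*(x) = \int \Bigl(\tfrac{x^2+y^2}{2} - xy\Bigr)\,dF(y)
          = \tfrac{x^2}{2} - \mu x + \tfrac{\mu^2+\sigma^2}{2},
\qquad
\xi^*_{l,1} = \phi^*(X_l^*) - \int \phi^*\,dF
            = \tfrac{1}{2}\bigl\{(X_l^* - \mu)^2 - \sigma^2\bigr\},
```

so that E(ξl, 1*)=0, consistent with the centering E(Al)=0 below.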
Let Al=s(ξ−Xl*)−F(ξ)+DF(ξ)(ξl,⋅, 1*),
where s(x)=1 for x≥0, s(x)=0 otherwise, and D is the
differential operator.
We have E(Al)=0, 1≤l≤n. We also define σ2=E(A12)+2∑l=1∞E(A1Al+1).
3. Weak Convergence of the Smoothed Empirical Distribution Function
In this section, we identify ℍ with ℝd; we then have a vector of U-statistics defined by a vector function Φ=(ϕj)1≤j≤d, where the degree of ϕj is mj.
Let k be a probability density function on ℍ and let (an)n≥1 be a sequence of positive window-widths tending to zero as n→∞. Denote kn(t)=an−1k(tan−1) and Kn(x)=∫t≤xkn(t)dt, and consider the perturbed (smoothed) empirical distribution function F^n defined by (1.7) corresponding to the sequence (Kn)n≥1. Equivalently, F^n integrates the kernel density estimator f^n, where f^n(x)=(nan)−1∑l=1nk((x−Xl)/an), so that F^n(x)=∫t≤xf^n(t)dt, x∈ℍ.
For a better understanding of the use of the integral
type estimators F^n, it is of interest to study the asymptotic behavior
of the distribution of F^n (defined by
(3.1)) evaluated at a random point ξ^n (defined by
(1.5)). Such a statistic is useful in estimating a functional F(ξ) if F is unknown.
Supposing that the conditions introduced in Section 2
are satisfied, our main result establishes that F^n(ξ^n) is an estimator
which converges to F(ξ), and the asymptotic normality will allow us to obtain
confidence intervals for F(ξ). Finally, by using the notations introduced in
Section 2, we can write the following result.
Theorem 3.1.
We suppose that
(i) there exists a finite positive constant M0 such that max1≤j≤dsup𝕊mj⊂ℍ∞∫𝕊mj|ϕj|2+δdPσ𝕊mj<M0 and max1≤j≤dsup𝕊mj⊂ℍ∞∫𝕊mj|ϕj|2+δdQσ𝕊mj<M0, where δ is the number introduced in (1.2);
(ii) the mixing rate of absolute regularity satisfies Condition (1.2);
(iii) Condition (2.14) is satisfied;
(iv) ∫ℍ∥t∥k(t)dt<∞, where ∥t∥=max1≤j≤d|tj| and k is a probability density function;
(v) the sequence Fi and F are twice differentiable on ℍ with uniformly bounded first and second partial derivatives.
Then n1/2{F^n(ξ^n)−F(ξ)} converges in
law to a normal distribution 𝒩(0,σ2) as n→∞, where σ2 is defined in (2.22).
We are then faced with a difficulty as the variance σ2 defined in (2.22)
is unknown. In order to overcome this difficulty, we can proceed to an
estimation of the variance σ2 by truncating the expansion of σ2, keeping only the I most informative terms, and estimating σ2 by its empirical counterpart σ^n2 defined by
σ^n2=n−1∑l=1n(A^l−A^¯n)2+2n−1∑i=1I∑l=1n−I(A^l−A^¯n)(A^l+i−A^¯n),
where A^¯n=n−1∑l=1nA^l, A^l=s(ξ^n−Xl)−F^n(ξ^n)+DF^n(ξ^n)(ζ^n,l,⋅, 1), ζ^n,l,⋅, 1=(mjζ^n,l,j, 1)1≤j≤d, ζ^n,l,j, 1=∫ℍlϕ^j,ℍld(δXl−Fn), ϕ^j,ℍl(x)=(nm−1)−1∑1≤i1<⋯<im−1≤nϕ(x,Xi1,…,Xim−1), and DF^n(x)=n−1∑l=1nDKn(x−Xl).
From condition (1.2), we have |E(A^lA^l+i)|≤{β(i)}δ/(1+δ)M=𝒪(i−(2+δ)/(1+δ)),
where
M is some finite
positive constant.
To obtain a suitable value for I, a simple criterion consists of computing the
smallest integer I for which (∑i=1I i−(2+δ)/(1+δ))/(∑i=1∞ i−(2+δ)/(1+δ))≥1−α,
where α is the required level of precision.
From the empirical construction of the estimator σ^n2, we easily deduce the convergence in distribution of σ^n2 to σI2=E(A12)+2∑l=1IE(A1Al+1)≃σ2.
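The truncation rule and the estimator σ^n2 can be sketched as follows in Python; the A_l sequence is simulated i.i.d. noise purely to exercise the formulas, and the numerical cutoff for the infinite sum is our choice.

```python
import numpy as np

def choose_I(delta, alpha, n_terms=100_000):
    """Smallest I with (sum_{i<=I} i^-r) / (sum_i i^-r) >= 1 - alpha,
    r = (2+delta)/(1+delta); the infinite sum is truncated at n_terms."""
    r = (2 + delta) / (1 + delta)
    w = np.arange(1, n_terms + 1, dtype=float) ** (-r)
    ratio = np.cumsum(w) / w.sum()
    return int(np.searchsorted(ratio, 1 - alpha) + 1)

def sigma2_hat(A, I):
    """sigma_hat_n^2: centered second moment plus twice the empirical
    covariances at lags i = 1..I, each lag summed over l = 1..n-I."""
    A = np.asarray(A, dtype=float)
    n = len(A)
    Ac = A - A.mean()
    s2 = float(np.mean(Ac**2))
    for i in range(1, I + 1):
        s2 += 2.0 * float(np.sum(Ac[: n - I] * Ac[i : n - I + i])) / n
    return s2

rng = np.random.default_rng(2)
A = rng.normal(size=2000)                # stand-in for the A_hat_l sequence
I = choose_I(delta=0.3, alpha=0.05)
s2 = sigma2_hat(A, I)
```

With I=0 the estimator reduces to the plain empirical variance of the A_l, which provides a simple sanity check on the implementation.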
4. Applications
4.1. Application to an ARMA Process
First, we give an example of a stochastic process (Xi)i≥1 satisfying our
general conditions, that is, a multivariate asymptotically stationary, absolutely regular stochastic process.
Example 4.1.
ARMA process.
Consider a d-variate
ARMA(1,1) process (Xi)i≥1 defined by Xi=AXi−1+Bϵi−1+ϵi;i≥1,
where the initial vector X1 has a measure
which is not necessarily the invariant measure and admits a strictly positive
density, A and B are square
matrices and (ϵi)i≥1 is a d-variate white
noise with strictly positive density and geometric absolute regularity. If
the eigenvalues of the square matrix A have modulus
strictly less than 1, then the process (Xi)i≥1 satisfies
Condition (2.14) (for a proof, see (5.4) in [8]), and the process is asymptotically stationary and
geometrically absolutely regular.
Consider the strictly stationary process (Xi∗)i≥1 satisfying
(4.1), associated with the process (Xi)i≥1, where X1* has the invariant measure as its distribution. Some parameters of the model (4.1) can be
estimated by estimators of the form (1.5), and we can apply Theorem 3.1.
For example, take d=2 and denote by μ the mean and by Γ(⋅) the covariance
function of the process (Xi∗)i≥1.
Take ξ to be the 2-dimensional mean vector μ. If we want to estimate μ, one possibility is to use the estimator ξ^n, whose associated kernel Φ is
the identity.
For estimating the two parameters ξ1 and ξ2, where ξ1 is the first
column and ξ2 is the second
column of Γ(⋅), we could also use the estimator ξ^n, where the associated kernel Φ1 of ξ1 is defined by Φ1((u,v),(u′,v′))=(u2+u′22−uu′,12(u−u′)(v−v′)),
and the associated kernel Φ2 of ξ2 is defined by Φ2((u,v),(u′,v′))=(12(u−u′)(v−v′),v2+v′22−vv′).
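A small simulation sketch of model (4.1) in Python; the particular matrices and the Gaussian noise are our illustrative choices (Gaussian noise has a strictly positive density, though we do not verify absolute regularity numerically), and the deterministic starting point is a simplification of the model's assumption of a positive initial density.

```python
import numpy as np

def simulate_arma11(A, B, x1, n, rng):
    """d-variate ARMA(1,1) of (4.1): X_i = A X_{i-1} + B eps_{i-1} + eps_i,
    started at a fixed (hence non-invariant) X_1 = x1."""
    d = len(x1)
    X = np.empty((n, d))
    X[0] = x1
    eps_prev = rng.normal(size=d)
    for i in range(1, n):
        eps = rng.normal(size=d)
        X[i] = A @ X[i - 1] + B @ eps_prev + eps
        eps_prev = eps
    return X

A = np.array([[0.5, 0.1],
              [0.0, 0.4]])               # eigenvalues 0.5 and 0.4: modulus < 1
B = np.array([[0.2, 0.0],
              [0.1, 0.3]])
rng = np.random.default_rng(3)
X = simulate_arma11(A, B, x1=np.array([10.0, -10.0]), n=5000, rng=rng)
# The start at (10, -10) washes out geometrically: late marginals hover
# around the invariant mean 0, illustrating asymptotic stationarity.
```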
4.2. Application to Estimation of the Median
We give a very simple example for which it is useful to estimate F(ξ). For simplicity,
we suppose d=1.
Let X1, …, Xn be a random
sample for which the sequence of distribution functions Fi of Xi converges to
the limiting distribution function F with median ξ. A well-known
estimator of ξ is the
Hodges-Lehmann estimator ξ^n defined by ξ^n=median{12(Xi+Xj):1≤i<j≤n}. The estimator ξ^n is a
weighted U-statistic with kernel ϕ(x1,x2)=(1/2)(x1+x2).
The convergence theorems for U-statistics remain
true for weighted U-statistics. We can then conclude that ξ^n converges in
law to ξ, and also that F^n(ξ^n) converges in
law to 1/2. Comparing these two results allows us to assess the validity of the estimation of the parameter ξ by ξ^n.
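This check can be sketched numerically in Python; the logistic-smoothed version of F^n and the i.i.d. Gaussian sample are our illustrative stand-ins.

```python
import itertools

import numpy as np

def hodges_lehmann(sample):
    """Hodges-Lehmann estimator (4.4): median of the pairwise means."""
    pairs = [(x + y) / 2.0 for x, y in itertools.combinations(sample, 2)]
    return float(np.median(pairs))

def smoothed_ecdf(x, sample, a_n):
    """Smoothed empirical d.f. with a logistic smoothing distribution."""
    t = (x - np.asarray(sample)) / a_n
    return float(np.mean(1.0 / (1.0 + np.exp(-t))))

rng = np.random.default_rng(4)
sample = rng.normal(loc=2.0, size=400)   # symmetric F with median xi = 2
xi_hat = hodges_lehmann(sample)
p = smoothed_ecdf(xi_hat, sample, a_n=len(sample) ** (-1 / 5))
# xi_hat should be near 2 and p = F_hat_n(xi_hat) near 1/2; a large gap
# between p and 1/2 would cast doubt on the estimation of xi by xi_hat.
```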
5. Proof of Theorem 3.1
We are going to use the following lemma proved by
Harel and Puri [4, Lemma 2.2], which is a generalization of a lemma of Yoshihara [11, Lemma 2].
Lemma 5.1.
Suppose that (3.2), (3.3), and Condition (ii) of Theorem 3.1 are satisfied. Then E(ξ^n,j,p)2=𝒪(n−1−λ), 2≤p≤mj, 1≤j≤d, where λ=min((2−5δ)/(6δ), 1).
Writing F^n(ξ^n)−F(ξ) as
F^n(ξ^n)−F^n(ξ)+F^n(ξ)−F¯n⋆Kn(ξ)+F¯n⋆Kn(ξ)−F(ξ) will allow us to determine the contributions of the stochastic behavior of ξ^n and that of F^n to the limiting distribution.
First, we use smoothness of our nonparametric
estimator to linearize the term F^n(ξ^n)−F^n(ξ), approximating it by the differential DF^n(ξ)(ξ^n−ξ) minus a
centralization term DF¯n(ξ)(ξ^n−ξ). The second term F^n(ξ)−F¯n⋆Kn(ξ), plus the centralization term defined above, is
analyzed using an empirical process technique for dependent random variables.
Finally, the last term satisfies F¯n⋆Kn(ξ)−F(ξ)=o(n−1/2), using the exponential asymptotic stationarity.
Setting Hn, 1(ξ)=n1/2{F^n(ξ)−F¯n⋆Kn(ξ)+DF¯n(ξ)(ξ^n−ξ)}, Hn, 2(ξ)=n1/2{F^n(ξ^n)−F^n(ξ)−DF¯n(ξ)(ξ^n−ξ)}, we can rewrite the first and second terms as n1/2{F^n(ξ^n)−F^n(ξ)+F^n(ξ)−F¯n⋆Kn(ξ)}=Hn, 1(ξ)+Hn, 2(ξ).
Lemma 5.2.
Under the conditions
of Theorem 3.1, Hn, 1(ξ)
converges in
law to the normal distribution 𝒩(0,σ2) as n→∞, where σ2 is defined in (2.22).
Proof.
From the decomposition DF¯n(ξ)(ξ^n−ξ)=DF¯n(ξ)(ξ^n,⋅,1)+∑2≤p≤mDF¯n(ξ)(ξ^n,⋅,p), we can write Hn, 1(ξ)=n−1/2∑l=1nTn,l+n−1/2∑l=1nE(Kn(ξ−Xl))−n1/2(F¯n⋆Kn(ξ))+n−1/2∑l=1n∑2≤p≤mDF¯n(ξ)(ξ^l,⋅,p),
where Tn,l=Kn(ξ−Xl)−E(Kn(ξ−Xl))+DF¯n(ξ)(ζ^l,⋅, 1),ζ^l,⋅, 1=(mjζ^l,j, 1)1≤j≤d,ζ^l,j, 1=∫ℍlϕj,ℍld(δXl−Fl);1≤l≤n, and ϕj, ℍl is defined by
(2.3) for the component ϕj of Φ.
From the exponential asymptotic stationarity, n−1/2∑l=1nE(Kn(ξ−Xl))−n1/2(F¯n⋆Kn(ξ)) converges to zero, and from Lemma 5.1 and Markov inequality, we deduce that n−1/2∑l=1n∑2≤p≤mDF¯n(ξ)(ξ^l,⋅,p) converges to zero in probability.
It remains to show that n−1/2∑l=1nTn, l converges to a
normal distribution, noting that (Tn, l)1≤l≤n is a
nonstationary absolutely regular unbounded sequence of random variables which
verifies the mixing rate (1.2). To prove the asymptotic normality of n−1/2∑l=1nTn, l, we use the following lemma, obtained by Harel and
Puri [4, Lemma
2.3].
Lemma 5.3.
Let (Yn,l)1≤l≤n
be a
nonstationary absolutely regular unbounded sequence of random variables, which
verifies the mixing rate (1.2). Suppose that, for any positive 𝒦, there exists a sequence (Yn,l𝒦)1≤l≤n of random variables satisfying (1.2) such that:
supn∈ℕ*max1≤l≤n|Yn,l𝒦|≤B𝒦<∞, 𝒦>0, where B𝒦 is a positive constant (5.10);
supn∈ℕ*max1≤l≤nE|Yn,l−Yn,l𝒦|2+δ→0 as 𝒦→∞ (5.11);
(1/n)E(∑l=1nYn,l)2→c2 as n→∞, where c2 is a positive constant (5.12);
(1/n)E(∑l=1n(Yn,l𝒦−E(Yn,l𝒦)))2→c𝒦2 as n→∞, where c𝒦2 is a positive constant (5.13);
c𝒦2→c2 as 𝒦→∞ (5.14).
Then n−1/2∑l=1nYn, l converges in law to a normal distribution with mean zero and variance c2.
First, we prove (5.10) and (5.11) for the sequence (Tn, l)1≤l≤n.
Put Tn,l𝒦=Kn(ξ−Xl)−E(Kn(ξ−Xl))+DF¯n(ξ)(ζ^l,⋅, 1𝒦), where ζ^l,⋅, 1𝒦=(mjζ^l,j, 1𝒦)1≤j≤d,ζ^l,j, 1𝒦=∫ℍlϕj,ℍl𝒦d(δXl)−∫ℍlϕj,ℍldFl;1≤l≤n,ϕj,ℍl𝒦={ϕj,ℍlif|ϕj,ℍl|≤𝒦,0if|ϕj,ℍl|>𝒦.
From condition (v), we have supn∈ℕ∗max1≤l≤n|Tn,l𝒦|≤1+∥m∥supn∈ℕ∗max1≤l≤nDF¯n(ξ)(𝒦+|ξ|)<∞, where ∥m∥=max1≤j≤d|mj|, and (5.10) is proved.
Now, using the inequality |∑l=1dal|h≤2h(d−1)∑l=1d|al|h, h≥1, d≥2,
and taking ε>0 such that (2+δ)(1+ε)=2+2δ, we obtain
supn∈ℕ*max1≤l≤nE|Tn,l−Tn,l𝒦|2+δ=supn∈ℕ*max1≤l≤nE|DF¯n(ξ)(ζ^l,⋅, 1)−DF¯n(ξ)(ζ^l,⋅, 1𝒦)|2+δ=supn∈ℕ*max1≤l≤nE|DF¯n(ξ)((∫ℍlϕj,ℍl𝒦d(δXl−Fl))1≤j≤d)|2+δ≤d2(d−1)(2+δ)∥m∥2+δsupn∈ℕ*max1≤l≤n[DF¯n(ξ)(((∫|ϕj,ℍl|>𝒦|ϕj,ℍl|2+δdFl)1/(2+δ))1≤j≤d)]2+δ≤d2(d−1)(2+δ)∥m∥2+δsupn∈ℕ*max1≤l≤n1𝒦(2+δ)ε[DF¯n(ξ)(((∫ℍl|ϕj,ℍl|2+2δdFl)1/(2+δ))1≤j≤d)]2+δ=d2(d−1)(2+δ)1𝒦(2+δ)ε∥m∥2+δsupn∈ℕ*max1≤l≤n[DF¯n(ξ)(((∫ℍl|ϕj,ℍl|2+2δdFl)1/(2+δ))1≤j≤d)]2+δ.
From (3.2), we have
supn∈ℕ*max1≤l≤nE|Tn,l−Tn,l𝒦|2+δ≤2(d−1)(2+δ)1𝒦(2+δ)εM1,
where M1 is a finite positive constant, which implies supn∈ℕ*max1≤l≤nE|Tn,l−Tn,l𝒦|2+δ→0 as 𝒦→∞, and (5.11) is proved.
We now show (5.12).
We first note that 1nE(∑l=1nTn,l)2=1n∑l=1n∑k=1nE(Tn,lTn,k)=1n∑l=1nE(Tn,l2)+2n∑l=1n∑k=1n−1E(Tn,lTn,k)=1n∑l=1nφ(l,l)+1n∑l=1n∑k=1n−1φ(l,k),
where
φ(l,l)=∫ℍl{Kn(ξ−x)−E(Kn(ξ−Xl))+DF¯n(ξ)(ζ^l,⋅, 1)}2dFl(x),φ(l,k)=2∫ℍl⊕ℍk{Kn(ξ−x)−E(Kn(ξ−Xl))+DF¯n(ξ)(ζ^l,⋅, 1)}×{Kn(ξ−y)−E(Kn(ξ−Xk))+DF¯n(ξ)(ζ^k,⋅, 1)}dFℍl⊕ℍk(x,y),k>l.
It results that |1nE(∑l=1nTl,n)2−σ2|=|1n∑l=1nφ(l,l)+1n∑l=1n∑k=1n−1φ(l,k)−ρ(1)−∑l=1∞ρ(l)|≤1n∑l=1n|φ(l,l)−ρ(1)|+1n|∑l=1n∑k=1n−1φ(l,k)−∑l=1n(n−l)ρ(l)|+∑l=n+1∞|ρ(l)|+1n∑l=1nl|ρ(l)|=L1,n+|L2,n|+L3,n+L4,n,
where ρ(1)=E(A12)=∫ℍ{s(ξ−x)−F(ξ)+DF(ξ)(ξ1,⋅, 1*)}2dF(x),ρ(l)=2E(A1A1+l)=2∫ℍ1⊕ℍ1+l{s(ξ−X1*)−F(ξ)+DF(ξ)(ξ1,⋅, 1*)}×{s(ξ−X1+l*)−F(ξ)+DF(ξ)(ξ1+l,⋅, 1*)}dQσℍ1⊕ℍ1+l.
Thus to prove (5.12), we have to show that L1,n+|L2,n|+L3,n+L4,n→0asn→∞. From (2.14) and the convergence of Kn to the function s, L1, n→0 as n→∞, and by using [4, Lemma 3.1] of Harel and Puri, |L2,n|→0 as n→∞.
From the well-known covariance inequality for absolutely
regular processes, we have, for δ′<δ/2, |ρ(l)|≤24{β(l)}δ′/(1+δ′)∥Al∥p2=24{β(l)}δ′/(1+δ′){E(Al)2+2δ′}1/(1+δ′),
where E(Al)2+2δ′=E(s(ξ−Xl*)−F(ξ)+DF(ξ)(ξl,⋅, 1*))2+2δ′≤22+2δ′{1+E|DF(ξ)(ξl,⋅, 1*)|2+2δ′}≤22+2δ′{1+E|DF(ξ)((∫ℍlϕj,ℍl∗d(δXl∗−F))1≤j≤d)|2+2δ′}≤22+2δ′{1+∥m∥2+2δ′E|DF(ξ)((∫ℍl|ϕj,ℍl∗|2+2δ′dF)1/(2+2δ′))1≤j≤d)|2+2δ′}.
From (3.3), we have E(Al)2+2δ′≤M21+δ′,say, where M2 is a finite
positive constant which implies |ρ(l)|≤24{β(l)}δ′/(1+δ′)M2, so that L3,n≤24M2n∑l=1n𝒪(l−(2+δ′)/(1+δ′))<∞,
then L3,n→0asn→∞.
We have L4,n≤24M2n∑l=1n𝒪(l−1/(1+δ′)), so that L4,n→0 as n→∞.
Consequently, (5.12) is verified. Analogously, we show (5.13).
We put σ𝒦2=E(A1𝒦)2+2∑l=1∞E(A1𝒦Al+1𝒦),
where
Al𝒦=s(ξ−Xl∗)−F(ξ)+DF(ξ)(ςl,⋅, 1𝒦), ςl,⋅, 1𝒦=(mjςl,j, 1𝒦)1≤j≤d, ςl,j, 1𝒦=∫ℍlϕj∗,𝒦dδXl*−∫ℍlϕj∗dF, ϕj∗,𝒦={ϕj∗ if |ϕj∗|≤𝒦, 0 if |ϕj∗|>𝒦}, and ϕj∗ is defined by
(2.16) for the component ϕj of Φ.
By using the Lebesgue dominated convergence theorem,
we obtain
E(A1𝒦)2→E(A1)2,E(A1𝒦Al+1𝒦)→E(A1Al+1)as𝒦→∞, which implies that
lim𝒦→∞σ𝒦2=σ2,
which proves (5.14). Thus, assumptions (5.10)–(5.14) are satisfied, and n−1/2∑l=1nTn, l converges in
law to the normal distribution 𝒩(0,σ2). Consequently, Hn, 1(ξ) converges in
law to the normal distribution 𝒩(0,σ2). Therefore, Lemma 5.2 is proved.
Lemma 5.4.
Under the
conditions of Theorem 3.1, Hn, 2(ξ)
converges to
zero in probability.
Proof.
We write Hn, 2(ξ)=An+Bn,
where Bn=n1/2∫ℍ{F¯n(ξ^n−t)−F¯n(ξ−t)−DF¯n(ξ)(ξ^n−ξ)}dKn(t),An=∫ℍ{Vn(ξ^n−t)−Vn(ξ−t)}dKn(t),
with Vn(x)=n1/2{Fn(x)−F¯n(x)}. Of course Vn is a
multidimensional empirical process.
From [11, Theorem 3] by Yoshihara, we have ξ^n−ξ=𝒪p(bn), where bn={n−1log(log(n))}1/2.
Then, we deduce that |An|≤sup∥ξ^n−ξ∥≤Cbn|Vn(ξ^n−t)−Vn(ξ−t)|, where C is a positive
constant.
To prove that An→P0asn→∞, it suffices to show that
sup∥ξ^n−ξ∥≤Cbn|Vn(ξ^n−t)−Vn(ξ−t)|→P0asn→∞. Since F¯n is
differentiable, there exists θ in ]0, 1[ such that F¯n(y)−F¯n(x)=DF¯n(x+θ(y−x))⋅(y−x). The differential DF¯n being bounded,
there exists a positive constant M3 such that |F¯n(y)−F¯n(x)|≤M3∥y−x∥, which implies that sup∥y−x∥≤Cbn|Vn(y)−Vn(x)|≤sup∥F¯n(y)−F¯n(x)∥≤CM3bn|Vn(y)−Vn(x)|.
We have Vn(x)=n−1/2∑l=1n{𝕀[Fl(Xl)≤u]−Gn(u)}=Wn(u),
where Gn is the copula
of F¯n and 𝕀[⋅] denotes the
indicator function.
We have |An|≤sup∥u−v∥≤CM3bn|Wn(u)−Wn(v)|=ω(Wn,ηn),
with ηn=CM3bn→0 as n→∞, where ω is the modulus of continuity, defined for any bounded function f:[0, 1]d→ℝ+ by ω(f,η)=sup{|f(u)−f(v)|;u,v∈[0,1]d,∥u−v∥≤η},η>0. We generalize [2, Relation (3.11)] by Sun from the univariate case to the multivariate case by
using similar methods as in [14, Lemmas (6.3) and (6.5)] by Harel and Puri. Therefore, we get lim supn→∞P{ω(Wn,ηn)≥ε}≤ε for all ε>0. It results from the inequalities (5.51) and (5.54) that An converges to
zero in probability as n→∞.
For the second term of Hn, 2, using the Lagrange form of Taylor's theorem, applied to F¯n at the points ξ^n−t and ξ−t up to second order, there exists θ′ such that F¯n(ξ^n−t)−F¯n(ξ−t)=DF¯n(ξ−t)(ξ^n−ξ)+12D2F¯n(zθ′)(ξ^n−ξ,ξ^n−ξ), with zθ′=ξ−t+θ′(ξ^n−ξ),0<θ′<1.
In the same way, there exists θ′′ such that DF¯n(ξ−t)(ξ^n−ξ)−DF¯n(ξ)(ξ^n−ξ)=D2F¯n(zθ′′)(ξ^n−ξ,−t), with zθ′′=ξ−θ′′t,0<θ′′<1,
|Bn|=|n1/2∫ℍ{D2F¯n(zθ′′)(ξ^n−ξ,−t)+12D2F¯n(zθ′)(ξ^n−ξ,ξ^n−ξ)}dKn(t)|≤|n1/2D2F¯n(zθ′′)(ξ^n−ξ, 1)|∫ℍ∥t∥dKn(t)+|n1/22D2F¯n(zθ′)(ξ^n−ξ,ξ^n−ξ)|∫ℍdKn(t).
From Harel and Puri [5], we deduce that n1/2{ξ^n−ξ} converges to a
multinormal distribution as n→∞.
From Condition (iv), we have
∫ℍ∥t∥dKn(t)=∫ℍ∥t∥kn(t)dt=an∫ℍ∥t∥k(t)dt→0 as n→∞,
which implies Bn→P0 as n→∞.
Consequently, we deduce that Hn, 2 converges to zero in probability, and Lemma 5.4 is proved.
Therefore, the proof of Theorem 3.1 follows from Lemmas
5.2 and 5.4.
References
[1] Puri, M. L. and Ralescu, S. S., "Central limit theorem for perturbed empirical distribution functions evaluated at a random point," 1986, vol. 19, no. 2, pp. 273–279, doi:10.1016/0047-259X(86)90032-1.
[2] Sun, S., "Asymptotic behavior of the perturbed empirical distribution functions evaluated at a random point for absolutely regular sequences," 1993, vol. 47, no. 2, pp. 230–249, doi:10.1006/jmva.1993.1081.
[3] Sun, S., "Perturbed empirical distribution functions and quantiles under dependence," 1995, vol. 8, no. 4, pp. 763–777, doi:10.1007/BF02410110.
[4] Harel, M. and Puri, M. L., "Limiting behavior of U-statistics, V-statistics, and one sample rank order statistics for nonstationary absolutely regular processes," 1989, vol. 30, no. 2, pp. 181–204, doi:10.1016/0047-259X(89)90034-1.
[5] Harel, M. and Puri, M. L., "Weak invariance of generalized U-statistics for nonstationary absolutely regular processes," 1990, vol. 34, no. 2, pp. 341–360, doi:10.1016/0304-4149(90)90022-K.
[6] Ducharme, G. R. and Mint El Mouvid, M., "Almost sure convergence of the local linear estimator of the conditional cumulative distribution function," 2001, vol. 333, no. 9, pp. 873–876.
[7] Oodaira, H. and Yoshihara, K.-I., "The law of the iterated logarithm for stationary processes satisfying mixing conditions," 1971, vol. 23, pp. 311–334.
[8] Harel, M. and Puri, M. L., "Nonparametric density estimators based on nonstationary absolutely regular random sequences," 1996, vol. 9, no. 3, pp. 233–254, doi:10.1155/S1048953396000238.
[9] Sun, S. and Chiang, C.-Y., "Limiting behaviour of the perturbed empirical distribution functions evaluated at U-statistics for strongly mixing sequences of random variables," 1997, vol. 10, no. 1, pp. 3–20, doi:10.1155/S1048953397000026.
[10] Billingsley, P., Convergence of Probability Measures, John Wiley & Sons, New York, NY, USA, 1968, xii+253 pp.
[11] Yoshihara, K.-I., "Limiting behavior of U-statistics for stationary, absolutely regular processes," 1976, vol. 35, no. 3, pp. 237–252.
[12] Denker, M. and Keller, G., "On U-statistics and v. Mises' statistics for weakly dependent processes," 1983, vol. 64, no. 4, pp. 505–522.
[13] Hoeffding, W., "A class of statistics with asymptotically normal distribution," 1948, vol. 19, pp. 293–325.
[14] Harel, M. and Puri, M. L., "Conditional empirical processes defined by nonstationary absolutely regular sequences," 1999, vol. 70, no. 2, pp. 250–285, doi:10.1006/jmva.1999.1822.