We present sufficient conditions under which the sequence of arithmetic means $S_n/n$, where $S_n = X_1 + \cdots + X_n$ is the partial sum built on a stationary sequence $\{X_n\}_{n \ge 1}$ of associated, integer-valued, and uniformly bounded random variables, satisfies the large deviation principle.
1. Introduction and the Notation
Let $\{X_n\}_{n \ge 1}$ be a strictly stationary sequence of square-integrable random variables (r.v.’s). Denote $S_n = \sum_{i=1}^{n} X_i$, $\mu = EX_1$, and $\sigma^2 = \mathrm{Var}\,X_1 + 2\sum_{i=2}^{\infty} \mathrm{Cov}(X_1, X_i)$. Consider a sequence satisfying the strong law of large numbers (SLLN) and the central limit theorem (CLT), that is, $(S_n - n\mu)/(\sigma\sqrt{n}) \to_d N(0,1)$. The SLLN asserts that the arithmetic mean $S_n/n$ converges to $\mu$ almost surely as $n \to \infty$, whereas the CLT describes the probability that $S_n$ differs from $\mu n$ by a quantity of order $\sqrt{n}$. Such a deviation is called normal. However, when $S_n$ and $\mu n$ differ by a quantity of order $n$, we deal with so-called large deviation events.
It turns out that, under certain conditions on the tail distribution function of $X_i$, for $a > 0$ the probability $P(S_n \ge (\mu + a)n)$ tends to zero at an exponential rate; that is,
(1) $\lim_{n \to \infty} \frac{1}{n} \ln P\left(\frac{S_n}{n} \ge \mu + a\right) = -I(a) < 0,$
where I(a) is called the rate function.
In the i.i.d. case, the well-known Cramér theorem gives an explicit formula for the rate function (see, for example, [1]). Let us recall that if the moment generating function $\varphi(t) = E e^{tX_1}$ is finite for all $t \in \mathbb{R}$, then $\lim_{n \to \infty} (1/n) \ln P(S_n \ge an) = -I(a)$ for all $a > \mu$, where
(2) $I(z) = \sup_{t \in \mathbb{R}} [zt - \ln \varphi(t)],$
which means that I(z) is the Fenchel-Legendre transform of the function lnφ(t).
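As a hedged illustration of formula (2), the sketch below evaluates the Fenchel-Legendre transform numerically for i.i.d. Bernoulli($p$) variables and compares it with the known closed form (the relative entropy between Bernoulli($z$) and Bernoulli($p$)). The Bernoulli choice, the function names, and the search interval are assumptions made for this example only.

```python
import math

def log_mgf(t, p):
    """ln E exp(t X_1) for X_1 ~ Bernoulli(p)."""
    return math.log(1.0 - p + p * math.exp(t))

def rate_numeric(z, p, lo=-50.0, hi=50.0, iters=200):
    """Approximate sup_t [z t - ln phi(t)] by ternary search.

    The objective is concave in t, so ternary search converges to the maximum
    as long as the maximizer lies in [lo, hi].
    """
    def f(t):
        return z * t - log_mgf(t, p)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

def rate_closed_form(z, p):
    """Relative entropy of Bernoulli(z) with respect to Bernoulli(p), 0 < z < 1."""
    return z * math.log(z / p) + (1.0 - z) * math.log((1.0 - z) / (1.0 - p))
```

For instance, `rate_numeric(0.7, 0.5)` and `rate_closed_form(0.7, 0.5)` should agree up to numerical error, and both vanish at $z = p$.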
In the general setting, we follow [1] and define a rate function as a function $I$ on a Polish space $X$, taking values in the closed half line $[0, \infty]$ and satisfying three conditions:
(i) $I$ is not identically infinite;
(ii) $I$ is lower semicontinuous;
(iii) $I^{-1}([0, c])$ is compact for all $c > 0$.
Recall further that a sequence of probability measures $\{P_n\}_{n \ge 1}$ is said to satisfy the large deviation principle (LDP) with rate $n$ and rate function $I$ if
(i) $I$ is a rate function in the sense of the above definition;
(ii) $\limsup_{n \to \infty} (1/n) \ln P_n(F) \le -\inf_{x \in F} I(x)$ for all closed $F \subset \mathbb{R}$;
(iii) $\liminf_{n \to \infty} (1/n) \ln P_n(G) \ge -\inf_{x \in G} I(x)$ for all open $G \subset \mathbb{R}$.
It is also well known that the rate function corresponding to the sequence {Pn}n≥1 is uniquely determined.
The role of open and closed sets in the above definition is similar to the one they play in the Portmanteau theorem, which states equivalent conditions for the weak convergence of probability measures. In fact, the LDP may be thought of as an analogue of weak convergence on the exponential scale. Bryc [2] proved that if the sequence of probability measures $\{P_n\}_{n \ge 1}$ is exponentially tight (see Appendix) and, for all continuous and bounded functions $F$ defined on the Polish space $X$ (this class is denoted by $C_B(X)$), the limit
(3) $\mathbb{R} \ni \Lambda(F) := \lim_{n \to \infty} \frac{1}{n} \ln \int_X e^{nF(x)}\, P_n(dx)$
exists, then $\{P_n\}_{n \ge 1}$ satisfies the LDP with rate $n$ and the rate function
(4) $I(x) = \sup_{F \in C_B(X)} [F(x) - \Lambda(F)], \quad x \in X.$
This result explains the relation between the LDP and the concept of weak convergence of probability measures. Moreover, it is very useful in proving the LDP, since it suffices to show the existence of $\Lambda(F)$ for a sufficiently large subfamily of continuous and bounded functions.
There are only a few results on the LDP for dependent r.v.’s other than Markov processes. Bryc [3] and Bryc and Dembo [4] studied the LDP for strongly mixing sequences of r.v.’s; Henriques and Oliveira [5, 6] dealt with stationary sequences of associated absolutely continuous r.v.’s. They assumed that the probability density function $f_{S_n/n}$ of the random variable $S_n/n$ satisfies the following condition:
(5) $f_{S_n/n}(x) \le a \cdot B^n$, for some $a > 0$, $B > 1$.
Our goal is to extend the results of Henriques and Oliveira [5, 6] by proving the LDP for a stationary sequence $\{X_n\}_{n \ge 1}$ of integer-valued r.v.’s satisfying the following additional conditions:
(A1) $X_i$, $i \ge 1$, are associated;
(A2) $X_i$, $i \ge 1$, are uniformly bounded; that is, there exists $M > 0$ such that $P(|X_i| \le M) = 1$;
(A3) $u(n) := \sum_{j=n+1}^{\infty} \mathrm{Cov}(X_1, X_j) \le a_0 \cdot \exp(-n \ln^{1+b} n)$ for some $a_0, b > 0$.
For the definition of associated r.v.’s and their properties we refer the reader to the monographs of Bulinski and Shashkin [7], Oliveira [6], and Rao [8].
The paper is organized as follows. Section 2 presents the LDP in question, its proof, and an example demonstrating the applicability of the result. For convenience while reading, we place the technical lemmas used in the proof in Section 3. At the end, in the Appendix, we recall the Gärtner-Ellis theorem and essential results of Varadhan [9] and Bryc [2] as well as two lemmas on convergence of real sequences. In the proofs we will follow the ideas of Henriques and Oliveira [5, 6].
2. Main Result
Theorem 1.
Let $\{X_n\}_{n \ge 1}$ be a strictly stationary sequence of integer-valued r.v.’s satisfying conditions (A1), (A2), and (A3). Then the sequence of probability measures $\{P_n(\cdot) = P(S_n/n \in \cdot)\}_{n \ge 1}$ satisfies the large deviation principle with rate $n$ and the rate function $\Lambda^*(x)$ being the Fenchel-Legendre transform of the function
(6) $\Lambda(t) = \lim_{n \to \infty} \frac{1}{n} \ln E e^{tS_n},$
which means that
(7) $\Lambda^*(x) = \sup_{t \in \mathbb{R}} [xt - \Lambda(t)], \quad x \in \mathbb{R}.$
Proof.
The main tool in proving LDP for dependent r.v.’s is the Gärtner-Ellis theorem (see Appendix) which states that it is enough to verify the existence of limit (6) together with the differentiability of Λ to have LDP.
To be more precise, in order to obtain the upper bound (A.1), we need to prove that limit (6) exists, which is shown in Lemma 3 (see Section 3). Actually, we prove even more—the finiteness of (6) for all t∈R.
With a view to getting the lower bound (A.2), we first need to verify the existence of a more general limit
(8) $\Lambda(G) = \lim_{n \to \infty} \frac{1}{n} \ln E e^{nG(S_n/n)}$
for any real function $G$ that is continuous, concave, and bounded from above (the class of such functions will be denoted by $CCBA(\mathbb{R})$). This is shown in Lemma 4, Section 3. Strictly speaking, limit (A.5) is required for the whole class $C_B(\mathbb{R})$ of continuous and bounded functions. Nevertheless, to establish the existence of this limit, it suffices to consider the subfamily $CCBA(\mathbb{R})$, since it is well separated (see Definition B.7 and Theorem B.8 in [6]).
Further, from (A2), it is easy to see that the distributions of Sn/n are exponentially tight (see Appendix).
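One way to spell out this step, reading (A2) as $|X_i| \le M$ almost surely, is to take the same compact set $K = [-M, M]$ for every $\varepsilon > 0$ in Definition A.3:

```latex
\[
P_n(\mathbb{R}\setminus K) = P\bigl(|S_n/n| > M\bigr) = 0
\quad\text{for all } n \ge 1,
\qquad\text{so}\qquad
\limsup_{n\to\infty}\frac{1}{n}\ln P_n(\mathbb{R}\setminus K) = -\infty \le -\varepsilon .
\]
```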
As a result, by Lemma A.4, we can claim that the LDP holds with rate $n$ and the rate function
(9) $I(x) = \sup_{F \in C_B(\mathbb{R})} [F(x) - \Lambda(F)].$
However, to make sure that $I$ is the Fenchel-Legendre transform of $\Lambda(t)$ defined by (6), we still need to show the convexity of $I$ (see Lemma A.2). To this end, we need the following, somewhat nonobvious, implication. If $x_1, x_2 \in \mathbb{R}$ are such that, for all $\delta > 0$, $\liminf_{n \to \infty} (1/n) \ln P(|S_n/n - x_i| < \delta) > -\infty$ for $i \in \{1, 2\}$, then
(10) $\inf_{\delta > 0} \liminf_{n \to \infty} \frac{1}{n} \ln \frac{P(|S_{2n}/(2n) - (x_1 + x_2)/2| < \delta)}{P(|S_n/n - x_1| < \delta/2)\, P(|S_n/n - x_2| < \delta/2)} \ge 0.$
This is shown in Lemma 6.
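We are now in a position to prove the convexity of $I$, proceeding exactly as in the proof of Theorem 3.20 in [6]. According to Theorem B.2 in [6], the rate function $I$ may be presented in the following form:
(11) $I(x) = -\inf_{\delta > 0,\, y : |y - x| < \delta} \liminf_{n \to \infty} \frac{1}{n} \ln P\left(\left|\frac{S_n}{n} - y\right| < \delta\right) = -\inf_{\delta > 0,\, y : |y - x| < \delta} \limsup_{n \to \infty} \frac{1}{n} \ln P\left(\left|\frac{S_n}{n} - y\right| < \delta\right).$
Since for $y \in (x - \delta, x + \delta)$ there exists $\delta'$ such that $(x - \delta', x + \delta')$ is contained in $(y - \delta, y + \delta)$, it is enough to write
(12) $I(x) = -\inf_{\delta > 0} \liminf_{n \to \infty} \frac{1}{n} \ln P\left(\left|\frac{S_n}{n} - x\right| < \delta\right) = -\inf_{\delta > 0} \limsup_{n \to \infty} \frac{1}{n} \ln P\left(\left|\frac{S_n}{n} - x\right| < \delta\right).$
Let us now take $x_1, x_2 \in \mathbb{R}$ such that $I(x_1), I(x_2) < \infty$. As the assumption of Lemma 6 is satisfied, we have
(13) $\inf_{\delta > 0} \liminf_{n \to \infty} \frac{1}{n} \ln \frac{P(|S_{2n}/(2n) - (x_1 + x_2)/2| < \delta)}{P(|S_n/n - x_1| < \delta/2)\, P(|S_n/n - x_2| < \delta/2)} \ge 0.$
Hence,
(14) $-I\left(\frac{x_1 + x_2}{2}\right) = \inf_{\delta > 0} \limsup_{n \to \infty} \frac{1}{n} \ln P\left(\left|\frac{S_n}{n} - \frac{x_1 + x_2}{2}\right| < \delta\right) \ge \inf_{\delta > 0} \liminf_{n \to \infty} \frac{1}{2n} \ln \left[ P\left(\left|\frac{S_{2n}}{2n} - \frac{x_1 + x_2}{2}\right| < \delta\right) \left( P\left(\left|\frac{S_n}{n} - x_1\right| < \frac{\delta}{2}\right) P\left(\left|\frac{S_n}{n} - x_2\right| < \frac{\delta}{2}\right) \right)^{-1} \right] + \inf_{\delta > 0} \liminf_{n \to \infty} \frac{1}{2n} \ln P\left(\left|\frac{S_n}{n} - x_1\right| < \frac{\delta}{2}\right) + \inf_{\delta > 0} \liminf_{n \to \infty} \frac{1}{2n} \ln P\left(\left|\frac{S_n}{n} - x_2\right| < \frac{\delta}{2}\right) \ge -\frac{1}{2} I(x_1) - \frac{1}{2} I(x_2),$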
which means that $I$ is midconvex (called Jensen-convex or J-convex by some authors). The function $I$ is measurable; thus, according to the Sierpiński theorem (see [10], Theorems 9.4.2 and 5.3.5), it is convex, and the proof is completed.
Finally, let us present an example of a sequence satisfying the assumptions of our theorem; for this sequence obviously the results of [5] do not apply.
Example 2.
Let $\{Y_n\}_{n \ge 1}$ be a Gaussian sequence with the squared exponential covariance function; that is,
(15) $\mathrm{Cov}(Y_i, Y_j) = \exp(-(i - j)^2).$
This sequence is stationary and associated (positively correlated Gaussian). Define a sequence $\{X_n\}_{n \ge 1}$ as follows:
(16) $X_n = \mathbb{I}(Y_n \le x),$
where $x \in \mathbb{R}$ is an arbitrary number. The r.v.’s $\{X_n\}_{n \ge 1}$ inherit the properties of association and stationarity from the sequence $\{Y_n\}_{n \ge 1}$ (we applied the same nonincreasing indicator function to the r.v.’s $Y_n$, which are associated). Furthermore, for $n \ge 1$, from the covariance inequality for Gaussian r.v.’s (see [11]) we have
(17) $0 < \mathrm{Cov}(X_1, X_{n+1}) = P(Y_1 \le x, Y_{n+1} \le x) - P(Y_1 \le x) P(Y_{n+1} \le x) \le 4 \|f_{Y_1}\|_\infty^2\, \mathrm{Cov}(Y_1, Y_{n+1}) = 4 \|f_{Y_1}\|_\infty^2 \exp(-n^2).$
Therefore
(18) $u(n) = \sum_{j=n+1}^{\infty} \mathrm{Cov}(X_1, X_j) \le \mathrm{const} \cdot \exp(-n^2)$
and the binary sequence $\{X_n\}_{n \ge 1}$ is stationary and fulfills assumptions (A1), (A2), and (A3).
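The decay claimed in (18) can be checked numerically against the envelope in (A3). The sketch below is only an illustration: it uses $\|f_{Y_1}\|_\infty = 1/\sqrt{2\pi}$ (the $Y_n$ have unit variance by (15)), truncates the tail sum, and picks $a_0 = 1$ and $b = 1$ as illustrative constants that are not taken from the paper.

```python
import math

def tail_cov_bound(n, terms=60):
    """Bound on u(n) implied by (17): 4*||f||^2 * sum_{i >= n} exp(-i^2).

    The infinite sum is truncated after `terms` summands; the neglected
    remainder is far below double precision for the values of n used here.
    """
    c = 4.0 * (1.0 / math.sqrt(2.0 * math.pi)) ** 2  # 4 * sup-norm^2 of the N(0,1) density
    return c * sum(math.exp(-float(i) ** 2) for i in range(n, n + terms))

def a3_envelope(n, a0, b):
    """Right-hand side of (A3): a0 * exp(-n * ln^{1+b} n)."""
    return a0 * math.exp(-n * math.log(n) ** (1.0 + b))

# Compare the two quantities for a few values of n (a0 = 1, b = 1 are assumptions).
for n in (1, 2, 5, 10):
    print(n, tail_cov_bound(n), a3_envelope(n, 1.0, 1.0))
```

The comparison works because $n^2 \ge n \ln^2 n$ for all $n \ge 1$, so $\exp(-n^2)$ decays at least as fast as the envelope with $b = 1$.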
3. Auxiliary Results
Lemma 3.
Let $\{X_n\}_{n \ge 1}$ be the sequence as in Theorem 1. Then limit (6) exists and is finite for all $t \in \mathbb{R}$.
Proof.
The proof closely follows that of Theorem 3.16 in [6]. For $n, m \in \mathbb{N}$, $t \in \mathbb{R}$, by assumption (A1), we have $\mathrm{Cov}(e^{tS_n}, e^{t(S_{n+m} - S_n)}) \ge 0$. Since the sequence of r.v.’s $X_i$, $i \in \mathbb{N}$, is stationary, we obtain $\ln E e^{tS_{n+m}} \ge \ln E e^{tS_n} + \ln E e^{tS_m}$. Denoting $h(n) = -\ln E e^{tS_n}$, by Lemma A.5 we get the existence of (6). It remains to verify its finiteness for all $t \in \mathbb{R}$. By assumption (A2), we have
(19) $e^{-tnM} \le E e^{tS_n} \le e^{tnM}$ for $t > 0$, and $e^{tnM} \le E e^{tS_n} \le e^{-tnM}$ for $t < 0$.
Thus, for arbitrarily chosen $t \in \mathbb{R}$, $\lim_{n \to \infty} (1/n) \ln E e^{tS_n}$ is finite.
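The superadditivity used in this proof can be observed numerically on a toy model. The sketch below computes $\ln E e^{tS_n}$ exactly for a stationary two-state Markov chain on $\{0, 1\}$ with $p_{11} \ge p_{01}$, a standard example of a positively dependent sequence. This chain is an assumption made for illustration only: its covariances decay geometrically, so it does not satisfy (A3), and the function name and parameters are hypothetical.

```python
import math

def markov_log_mgf(n, t, p01, p11):
    """ln E exp(t * S_n) for a stationary {0,1}-valued Markov chain.

    p01 = P(X_{k+1} = 1 | X_k = 0), p11 = P(X_{k+1} = 1 | X_k = 1).
    """
    pi1 = p01 / (p01 + 1.0 - p11)   # stationary probability of state 1
    pi0 = 1.0 - pi1
    # a0, a1 hold E[exp(t * S_k); X_k = 0] and E[exp(t * S_k); X_k = 1]
    a0, a1 = pi0, pi1 * math.exp(t)
    for _ in range(n - 1):
        a0, a1 = (a0 * (1.0 - p01) + a1 * (1.0 - p11),
                  (a0 * p01 + a1 * p11) * math.exp(t))
    return math.log(a0 + a1)

# Superadditivity: ln E e^{t S_{n+m}} >= ln E e^{t S_n} + ln E e^{t S_m}
lhs = markov_log_mgf(7, 0.5, 0.2, 0.7)
rhs = markov_log_mgf(3, 0.5, 0.2, 0.7) + markov_log_mgf(4, 0.5, 0.2, 0.7)
print(lhs >= rhs)
```

Since the values here lie in $\{0, 1\}$, the bounds in (19) hold with $M = 1$, which keeps $(1/n)\ln E e^{tS_n}$ within $[-|t|, |t|]$.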
Lemma 4.
Let $\{X_n\}_{n \ge 1}$ be the sequence as in Theorem 1. Then, for every real function $G : \mathbb{R} \to \mathbb{R}$ that is continuous, concave, and bounded from above, the limit $\Lambda(G) = \lim_{n \to \infty} (1/n) \ln E e^{nG(S_n/n)}$ exists.
Proof.
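Except for the steps that involve the discreteness of the r.v.’s in question, the proof proceeds exactly as in Theorem 3.18 in [6].
Without loss of generality, we may and do assume that $-\infty < -B \le G(x) \le 0$ for $x \in [-M, M]$ and some $B > 0$. Since $G$ is continuous and concave, it is Lipschitz on $[-M, M]$, say with constant $L$. By assumption (A2), we get that, for $n, m, l \in \mathbb{N}$,
(20) $\frac{1}{n+m} \left| S_{n+m} - (S_n + S_{n+m+l} - S_{n+l}) \right| \le \frac{2lM}{n+m}$, almost surely.
Thus, since $G$ is Lipschitz,
(21) $G\left(\frac{1}{n+m} S_{n+m}\right) - G\left(\frac{1}{n+m} (S_n + S_{n+m+l} - S_{n+l})\right) \ge -\frac{2lLM}{n+m}.$
By the assumed concavity of $G$, this implies that
(22) $G\left(\frac{1}{n+m} S_{n+m}\right) \ge \frac{n}{n+m} G\left(\frac{S_n}{n}\right) + \frac{m}{n+m} G\left(\frac{S_{n+m+l} - S_{n+l}}{m}\right) - \frac{2lLM}{n+m},$
which yields
(23) $\ln E e^{(n+m) G(S_{n+m}/(n+m))} \ge \ln E\left[ e^{nG(S_n/n)} e^{mG((S_{n+m+l} - S_{n+l})/m)} \right] - 2lLM.$
When we denote $h(n) := -\ln E e^{nG(S_n/n)}$, the above inequality takes the following form:
(24) $h(n+m) \le 2lLM - \ln E\left[ e^{nG(S_n/n)} e^{mG((S_{n+m+l} - S_{n+l})/m)} \right].$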
We will now aim at finding the bound for the second term at the r.h.s. of inequality (24).
Let us now define, for $x \in [-M, M]$ and $n \in \mathbb{N}$, the sequence of continuous functions $F_n : \mathbb{R} \to \mathbb{R}$ in the following way:
(25) $F_n(x) = \begin{cases} e^{nG(x)}, & \text{for } x = k/n, \text{ where } k \in \mathbb{Z}, \\ \text{linear}, & \text{otherwise}. \end{cases}$
Each $F_n$ is thus a polyline whose vertices have first coordinate of the form $k/n$, $k \in \mathbb{Z}$, and second coordinate in the interval $(0, 1]$. Now, since $G$ is nonpositive and Lipschitz with constant $L$, we have
(26) $|e^{nG(x)} - e^{nG(y)}| \le nL|x - y|$ for $x, y \in [-M, M]$,
so $F_n$ is absolutely continuous (since it is Lipschitz continuous) and almost everywhere differentiable with $|F_n'(x)| \le Ln$.
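The polyline construction (25) and the slope bound $|F_n'(x)| \le Ln$ can be illustrated with a short sketch. The choice $G(x) = -x^2$ on $[-1, 1]$ (so $M = 1$, $L = 2$) and the helper names are assumptions made for this example.

```python
import math

def make_polyline(G, n):
    """Return F_n: equal to exp(n*G(x)) at the nodes k/n and linear in between."""
    def F(x):
        k = math.floor(x * n)
        left, right = k / n, (k + 1) / n
        y_left = math.exp(n * G(left))
        y_right = math.exp(n * G(right))
        w = (x - left) * n            # position of x inside [left, right]
        return (1.0 - w) * y_left + w * y_right
    return F

def max_node_slope(G, n, M):
    """Largest absolute slope of F_n over the segments inside [-M, M]."""
    ks = range(-int(M * n), int(M * n))
    return max(abs(math.exp(n * G((k + 1) / n)) - math.exp(n * G(k / n))) * n
               for k in ks)

G = lambda x: -x * x                  # concave, nonpositive, Lipschitz with L = 2 on [-1, 1]
F10 = make_polyline(G, 10)
print(F10(0.25), max_node_slope(G, 10, 1.0))
```

Each segment slope is a difference quotient of $e^{nG}$ between adjacent nodes, so by (26) it cannot exceed $nL$ in absolute value.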
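We will now make use of the Newman identity (see [12]), which allows us to express the covariance of two absolutely continuous functions of arbitrary r.v.’s via the covariance of the indicators of these r.v.’s. Let us recall that if $g_1$ and $g_2$ are absolutely continuous functions and $X$, $Y$ are random variables such that $g_1(X)$ and $g_2(Y)$ are square-integrable, then
(27) $\mathrm{Cov}(g_1(X), g_2(Y)) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g_1'(t)\, g_2'(s)\, \mathrm{Cov}(\mathbb{I}(X \le t), \mathbb{I}(Y \le s))\, dt\, ds.$
In light of the above identity, we can write
(28) $\mathrm{Cov}\left(e^{nG(S_n/n)}, e^{mG((S_{n+m+l} - S_{n+l})/m)}\right) = \int_{-M}^{M} \int_{-M}^{M} e^{nG(x)}\, nG'(x)\, e^{mG(y)}\, mG'(y)\, \mathrm{Cov}\left(\mathbb{I}\left(\frac{S_n}{n} \le x\right), \mathbb{I}\left(\frac{S_{n+m+l} - S_{n+l}}{m} \le y\right)\right) dx\, dy \le L^2\, \mathrm{Cov}(S_n, S_{n+m+l} - S_{n+l}),$
where the last inequality is a consequence of the nonpositivity of $G$ and an application of the well-known Hoeffding identity stating that
(29) $\mathrm{Cov}(X, Y) = \iint_{\mathbb{R}^2} \left[ P(X \le x, Y \le y) - P(X \le x) P(Y \le y) \right] dx\, dy.$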
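By the assumption of stationarity of $\{X_n\}_{n \in \mathbb{N}}$ and the nonnegativity of the covariances of associated r.v.’s, we can bound the above expression in the following way:
(30) $L^2\, \mathrm{Cov}(S_n, S_{n+m+l} - S_{n+l}) \le L^2 (n+m) \sum_{i=l}^{l+m} \mathrm{Cov}(X_1, X_{1+i}) \le L^2 (n+m)\, u(l).$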
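The bound on the covariance of the two blocks can be checked directly for a concrete covariance sequence. The sketch below assumes a hypothetical stationary sequence with $\mathrm{Cov}(X_1, X_{1+d}) = c(d) = 2^{-d}$; this choice only exercises the counting argument and is not meant to satisfy (A3).

```python
def cov_of_blocks(c, n, m, l):
    """Cov(S_n, S_{n+m+l} - S_{n+l}) when Cov(X_i, X_j) = c(|i - j|)."""
    return sum(c(j - i)
               for i in range(1, n + 1)
               for j in range(n + l + 1, n + m + l + 1))

def u(c, l, terms=2000):
    """Truncated tail sum u(l) = sum_{d >= l} c(d), i.e. sum_{j >= l+1} Cov(X_1, X_j)."""
    return sum(c(d) for d in range(l, l + terms))

c = lambda d: 0.5 ** d                # hypothetical nonnegative covariance sequence
print(cov_of_blocks(c, 5, 4, 2), (5 + 4) * u(c, 2))
```

For each fixed index $i \le n$, the lags $j - i$ in the second block are distinct and at least $l + 1$, so the inner sum is at most $u(l)$; summing over $i$ gives at most $n\,u(l) \le (n+m)\,u(l)$.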
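Hence, we get
(31) $E\left[ e^{nG(S_n/n)} e^{mG((S_{n+m+l} - S_{n+l})/m)} \right] \ge E e^{nG(S_n/n)}\, E e^{mG((S_{n+m+l} - S_{n+l})/m)} - L^2 (n+m)\, u(l),$
which, on the basis of the uniform boundedness of $X_i$, $i \in \mathbb{N}$, yields
(32) $\frac{E\left[ e^{nG(S_n/n)} e^{mG((S_{n+m+l} - S_{n+l})/m)} \right]}{E e^{nG(S_n/n)}\, E e^{mG((S_{n+m+l} - S_{n+l})/m)}} \ge 1 - L^2 (n+m)\, u(l)\, e^{(n+m)B}.$
Let us now define the quantity $\theta(l, n) := 1 - L^2 n\, u(l)\, e^{nB}$ and restate (32):
(33) $\ln E\left[ e^{nG(S_n/n)} e^{mG((S_{n+m+l} - S_{n+l})/m)} \right] \ge -h(n) - h(m) + \ln \max(0, \theta(l, n+m)).$
Going back to inequality (24), we obtain
(34) $h(n+m) \le 2lLM + h(n) + h(m) - \ln \max(0, \theta(l, n+m)).$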
Choose $\delta \in (0, b)$, where $b$ is the constant from assumption (A3). By Lemma A.6,
(35) $\theta\left(\frac{n}{\ln^{1+\delta} n}, n\right) \to 1, \quad n \to \infty.$
Therefore, it is easy to show that, putting $l = (n+m)/\ln^{1+\delta}(n+m)$, for sufficiently large $n + m$ we have
(36) $\ln \max(0, \theta(l, n+m)) \ge -l.$
As a result, from inequality (34), we arrive at
(37) $h(n+m) \le 2lLM + h(n) + h(m) + l = h(n) + h(m) + (2LM + 1) \frac{n+m}{\ln^{1+\delta}(n+m)},$
and by Lemma A.5 we conclude that the limit $\lim_{n \to \infty} (h(n)/n)$ exists, and the proof is completed.
Lemma 5.
Let $X$ and $Y$ be integer-valued, associated, square-integrable r.v.’s; then
(38) $\left| P(a_1 \le X \le a_2, b_1 \le Y \le b_2) - P(a_1 \le X \le a_2) P(b_1 \le Y \le b_2) \right| \le 4\, \mathrm{Cov}(X, Y),$
for any $a_1 < a_2$ and $b_1 < b_2$.
Proof.
The proof follows from the covariance inequality
(39) $\sup_{x, y \in \mathbb{R}} \left| P(X \le x, Y \le y) - P(X \le x) P(Y \le y) \right| \le \mathrm{Cov}(X, Y)$
obtained in [13] for integer-valued associated r.v.’s $X$, $Y$.
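One way to spell out this step is the following four-term decomposition, where $H$ and the shifted endpoints $a_1' = \lceil a_1 \rceil - 1$, $b_1' = \lceil b_1 \rceil - 1$ are notation introduced here for illustration (for integer-valued $X$, the events $\{X < a_1\}$ and $\{X \le a_1'\}$ coincide):

```latex
\begin{align*}
H(x,y) &:= P(X\le x,\,Y\le y)-P(X\le x)\,P(Y\le y),\\
P(a_1\le X\le a_2,\ b_1\le Y\le b_2)&-P(a_1\le X\le a_2)\,P(b_1\le Y\le b_2)\\
 &= H(a_2,b_2)-H(a_1',b_2)-H(a_2,b_1')+H(a_1',b_1').
\end{align*}
```

Each of the four terms is bounded in absolute value by $\mathrm{Cov}(X, Y)$, which gives the constant 4 in the lemma.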
Lemma 6.
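Let $\{X_n\}_{n \ge 1}$ be the sequence as in Theorem 1. Let also $x_1, x_2 \in \mathbb{R}$ be such that, for all $\delta > 0$,
(40) $\liminf_{n \to \infty} \frac{1}{n} \ln P\left(\left|\frac{S_n}{n} - x_i\right| < \delta\right) > -\infty$ for $i \in \{1, 2\}$.
Then,
(41) $\inf_{\delta > 0} \liminf_{n \to \infty} \frac{1}{n} \ln \frac{P(|S_{2n}/(2n) - (x_1 + x_2)/2| < \delta)}{P(|S_n/n - x_1| < \delta/2)\, P(|S_n/n - x_2| < \delta/2)} \ge 0.$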
Proof.
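Again, apart from the calculations that take into account that the r.v.’s are integer-valued, the proof proceeds as in Theorem 3.19 in [6].
From assumption (40) we know that there exist $c_i > 0$, $i \in \{1, 2\}$, such that, for sufficiently large $n$, $(1/n) \ln P(|S_n/n - x_i| < \delta) > -c_i$; thus, for some $c > 0$,
(42) $P\left(\left|\frac{S_n}{n} - x_1\right| < \delta\right) P\left(\left|\frac{S_n}{n} - x_2\right| < \delta\right) \ge e^{-nc}.$
Now, for $n, l \in \mathbb{N}$, by Lemma 5 and by stationarity of the r.v.’s, we can write
(43) $\left| P\left(\left|\frac{S_n}{n} - x_1\right| < \frac{\delta}{2}, \left|\frac{S_{2n+l} - S_{n+l}}{n} - x_2\right| < \frac{\delta}{2}\right) - P\left(\left|\frac{S_n}{n} - x_1\right| < \frac{\delta}{2}\right) P\left(\left|\frac{S_n}{n} - x_2\right| < \frac{\delta}{2}\right) \right| \le 4\, \mathrm{Cov}(S_n, S_{2n+l} - S_{n+l}) \le 4n\, u(l).$
Hence, and in light of inequality (42), we get
(44) $\frac{P(|S_n/n - x_1| < \delta/2,\ |(S_{2n+l} - S_{n+l})/n - x_2| < \delta/2)}{P(|S_n/n - x_1| < \delta/2)\, P(|S_n/n - x_2| < \delta/2)} \ge 1 - 4n\, u(l)\, e^{nc}.$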
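Next, on the basis of (A2) we see that
(45) $\left|\frac{S_{2n}}{2n} - \frac{x_1 + x_2}{2}\right| \le \frac{1}{2} \left|\frac{S_n}{n} - x_1 + \frac{S_{2n+l} - S_{n+l}}{n} - x_2\right| + \frac{lM}{n},$
which yields that
(46) $P\left(\left|\frac{S_{2n}}{2n} - \frac{x_1 + x_2}{2}\right| < \delta\right) \ge P\left(\frac{1}{2} \left|\frac{S_n}{n} - x_1 + \frac{S_{2n+l} - S_{n+l}}{n} - x_2\right| < \delta - \frac{lM}{n}\right).$
By the triangle inequality, this implies
(47) $P\left(\left|\frac{S_{2n}}{2n} - \frac{x_1 + x_2}{2}\right| < \delta\right) \ge P\left(\left|\frac{S_n}{n} - x_1\right| < \delta - \frac{lM}{n}, \left|\frac{S_{2n+l} - S_{n+l}}{n} - x_2\right| < \delta - \frac{lM}{n}\right).$
Putting $l = \delta n/(2M)$ in the above inequality and plugging it into inequality (44), we arrive at
(48) $\frac{P(|S_{2n}/(2n) - (x_1 + x_2)/2| < \delta)}{P(|S_n/n - x_1| < \delta/2)\, P(|S_n/n - x_2| < \delta/2)} \ge 1 - 4n\, u\left(\frac{\delta n}{2M}\right) e^{nc}.$
If we finally put $\delta = 2M/\ln^{1+k} n$, where $0 < k < b$, then by Lemma A.6 we get the thesis.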
Appendix
We recall the Gärtner-Ellis theorem invoked throughout the proof of the main result.
Theorem A.1 (Gärtner-Ellis [14, 15]).
Assume that, for every $t \in \mathbb{R}$, $\Lambda(t)$ defined by (6) exists. Then its Fenchel-Legendre transform $\Lambda^*(x)$ satisfies the following:
(i) for every closed $F \subset \mathbb{R}$,
(A.1) $\limsup_{n \to \infty} \frac{1}{n} \ln P\left(\frac{S_n}{n} \in F\right) \le -\inf_{x \in F} \Lambda^*(x);$
(ii) if $\Lambda$ is differentiable, then for every open $G \subset \mathbb{R}$,
(A.2) $\liminf_{n \to \infty} \frac{1}{n} \ln P\left(\frac{S_n}{n} \in G\right) \ge -\inf_{x \in G} \Lambda^*(x).$
The next two results we rely on in the proof are Varadhan’s Lemma (a necessary condition for the LDP) and Bryc’s result (sufficient conditions for the LDP).
Lemma A.2 (Varadhan [9]).
Assume that the large deviation principle is satisfied with a rate function $I$. Moreover, assume that for every $t \in \mathbb{R}$
(A.3) $\bar{\Lambda}(t) = \limsup_{n \to \infty} \frac{1}{n} \ln E e^{tS_n} < \infty.$
Then
(i) for every $t \in \mathbb{R}$ the limit above exists and is the Fenchel-Legendre transform of $I(x)$; that is, $\bar{\Lambda}(t) = \sup_{x \in \mathbb{R}} [tx - I(x)]$;
(ii) if the rate function $I$ is convex, then it is the Fenchel-Legendre transform of $\bar{\Lambda}$; that is, $I(x) = \sup_{t \in \mathbb{R}} [xt - \bar{\Lambda}(t)]$.
In order to formulate the next lemma let us recall the notion of exponential tightness of a sequence of probability measures.
Definition A.3.
A sequence of probability measures $\{P_n\}_{n \ge 1}$ is said to be exponentially tight if for every $\varepsilon > 0$ there exists a compact set $K_\varepsilon \subset \mathbb{R}$ such that
(A.4) $\limsup_{n \to \infty} \frac{1}{n} \ln P_n(\mathbb{R} \setminus K_\varepsilon) \le -\varepsilon.$
Lemma A.4 (Bryc [2]).
Assume that $P_n$, the distributions of $S_n/n$, are exponentially tight and that, for every continuous and bounded function $F$, the following limit exists:
(A.5) $\Lambda(F) = \lim_{n \to \infty} \frac{1}{n} \ln E e^{nF(S_n/n)}.$
Then the probability measures $P_n(\cdot) = P(S_n/n \in \cdot)$, $n \in \mathbb{N}$, satisfy the large deviation principle with rate $n$ and the rate function $I(x) = \sup (F(x) - \Lambda(F))$, where the supremum is taken over the family of bounded and continuous functions. Moreover, for every such $F$,
(A.6) $\Lambda(F) = \sup_{x \in \mathbb{R}} [F(x) - I(x)].$
Finally, we invoke two lemmas concerning the existence of the limit of specific real sequences. Both of them are taken from [6].
Lemma A.5.
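Let $\{u_n\}_{n \ge 1}$ and $\{\epsilon_n\}_{n \ge 1}$ be real sequences such that $u_{n+m} \le u_n + u_m + \epsilon_{n+m}$, and, for some $\delta > 1$, $\limsup_{n \to \infty} (\epsilon_n/n) \ln^{\delta} n < \infty$. Then the limit $\bar{u} = \lim_{n \to \infty} (u_n/n)$ exists.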
Lemma A.6.
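Let $\{t(n)\}_{n \ge 1}$ be a real sequence of nonnegative numbers such that there exist $a, b > 0$ satisfying
(A.7) $t(n) \le a \cdot e^{-n \ln^{1+b} n}.$
Then, for all $k < b$ and $c \in \mathbb{R}$,
(A.8) $\lim_{n \to \infty} n \cdot t\left(\frac{n}{\ln^{1+k} n}\right) e^{cn} = 0.$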
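A rough numerical look at (A.8) is possible on the logarithmic scale, which avoids floating-point underflow. The sketch below takes $t(n)$ equal to the envelope in (A.7); the constants $a = 1$, $b = 2$, $k = 0.5$, $c = 1$ and the function name are illustrative assumptions.

```python
import math

def log_a8_term(n, a, b, k, c):
    """ln of n * t(m) * exp(c*n), where t(m) = a * exp(-m * ln^{1+b} m)
    and m = n / ln^{1+k} n. Requires n large enough that m > 1."""
    m = n / math.log(n) ** (1.0 + k)
    log_t = math.log(a) - m * math.log(m) ** (1.0 + b)
    return math.log(n) + log_t + c * n

for n in (1e2, 1e4, 1e6):
    print(n, log_a8_term(n, 1.0, 2.0, 0.5, 1.0))
```

The logarithm behaves like $-n \ln^{b-k} n$ up to lower-order terms, so it tends to $-\infty$ for any fixed $c$, although for small $n$ the term $cn$ can still dominate.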
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
[1] F. Hollander, Fields Institute Monographs, 2000.
[2] W. Bryc, “Large deviations by the asymptotic value method,” vol. 1, Progress in Probability no. 22, pp. 447–472, Birkhäuser, Boston, Mass, USA, 1990.
[3] W. Bryc, “On large deviations for uniformly strong mixing sequences,” vol. 41, no. 2, pp. 191–202, 1992. doi:10.1016/0304-4149(92)90120-f. MR1164173.
[4] W. Bryc and A. Dembo, “Large deviations and strong mixing,” vol. 32, no. 4, pp. 549–569, 1996. MR1411271.
[5] C. Henriques and P. E. Oliveira, “Large deviations for the empirical mean of associated random variables,” vol. 78, no. 6, pp. 594–598, 2008. doi:10.1016/j.spl.2007.09.020. MR2409522.
[6] P. E. Oliveira, Springer, New York, NY, USA, 2012. doi:10.1007/978-3-642-25532-8. MR3013874.
[7] A. Bulinski and A. Shashkin, vol. 10, Advanced Series on Statistical Science and Applied Probability, World Scientific, Singapore, 2007.
[8] B. L. S. P. Rao, Birkhäuser, 2012.
[9] S. R. S. Varadhan, “Asymptotic probabilities and differential equations,” vol. 19, no. 3, pp. 261–286, 1966. doi:10.1002/cpa.3160190303. MR0203230.
[10] M. Kuczma, 2nd edition, Birkhäuser, Basel, Switzerland, 2009.
[11] P. Matula, “A note on some inequalities for certain classes of positively dependent random variables,” vol. 24, no. 1, pp. 17–26, 2004.
[12] C. M. Newman, “Asymptotic independence and limit theorems for positively and negatively dependent random variables,” Y. L. Tong, Ed., Institute of Mathematical Statistics, Hayward, Calif, USA, 1984.
[13] P. Matula, “On some inequalities for positively and negatively dependent random variables with applications,” vol. 63, no. 4, pp. 511–521, 2003.
[14] J. Gärtner, “On large deviations from the invariant measure,” vol. 22, no. 1, pp. 24–39, 1977. doi:10.1137/1122003. MR0471040.
[15] R. S. Ellis, “Large deviations for a general class of random vectors,” vol. 12, no. 1, pp. 1–12, 1984. doi:10.1214/aop/1176993370. MR723726.