BUSY PERIOD ANALYSIS, RARE EVENTS AND TRANSIENT BEHAVIOR IN FLUID FLOW MODELS

We consider a process {(J_t, V_t)}_{t≥0} on E×[0,∞), such that {J_t} is a Markov process with finite state space E, and {V_t} has a linear drift r_i on intervals where J_t = i, and reflection at 0. Such a process arises as a fluid flow model of current interest in telecommunications engineering for the purpose of modeling ATM technology. We compute the mean of the busy period and related first passage times, show that the probability of buffer overflow within a busy cycle is approximately exponential, and give conditioned limit theorems for the busy cycle with implications for quick simulation. Further, various inequalities and approximations for transient behavior are given. Also, explicit expressions for the Laplace transform of the busy period are found. Mathematically, the key tools are first passage probabilities and exponential change of measure for Markov additive processes.


Introduction
Fluid flow processes can be seen as a class of applied probability models which in many ways is parallel to queues. From an application point of view, the historical origin is in both cases performance evaluation in telecommunication, with the difference being motivated by the change of technology: from switchboards in the days of Erlang to modern ATM (asynchronous transfer mode) devices. Mathematically, both classes of models have fundamental relations to random walks and more general additive processes. For queues, the classical example is the reflected random walk representation of the waiting time via the Lindley recursion ([5], Ch. III.7-8). More recently, the use of Markov modulation for modeling bursty traffic has led to more general Markov additive processes (see e.g. [6], [7]), which are also the key tool we use for studying fluid flow models, by representing them as reflected versions of finite Markov additive processes with the additive component having the simplest possible structure of a pure linear drift.
Most of the applied literature deals with the computation of the steady-state distribution. Section 7 deals with transient behavior, more precisely the study of P(V_T > u). We show that for large u and T, a certain time epoch of the form T ≈ u/κ'(γ) (with κ and γ defined in the body of the paper) plays a crucial role as the time at which P(V_T > u) approximately attains its stationary value P(V > u) (which in turn is approximately proportional to e^{−γu}). For T << u/κ'(γ), we determine the approximate form of P(V_T > u), and for T >> u/κ'(γ), we evaluate the difference P(V > u) − P(V_T > u). Further results give a central limit estimate of P(V_T > u) when T is only moderately different from u/κ'(γ), and an estimate of the rate of convergence P(V_T > u) → P(V > u) when u is fixed and only T → ∞.
Whereas most results of the paper are inequalities or approximations, Section 8 contains a variety of exact results. In particular, we find the Laplace transform of the busy period P_i and of the related time τ_−(u) the system needs to empty from a large level u. However, the expressions involve a functional inversion and may appear too complicated to be useful for computational purposes (in fact, it does not seem straightforward just to differentiate to derive the mean of P_i or τ_−(u)). Nevertheless, E_i τ_−(u) can be evaluated exactly.
Section 2 gives the preliminaries and a summary of the most relevant results from the literature. In particular, some basic matrices occurring in the steady-state solution are introduced; they are of basic importance in the present paper as well, since the computational evaluation of the busy period/transient behavior results turns out to require either just these matrices, or matrices of just the same form but defined via duality in terms of time reversion, sign reversion or change of parameters. In Section 4, we introduce the basic technique used in most of the paper, change of measure via exponential families. In fact, some of the results show that this is not only a convenient mathematical tool but that the process in certain situations will behave precisely as if the parameters were changed in this way. In particular, Section 6 gives a precise description of this type of process behavior prior to exceedance of a large level in a busy cycle, a result which also determines the optimal change of measure in rare events simulation.
The results of the paper are exemplified via a simple two-state model in Section 9; this example may be read before the body of the paper to get a first impression of the flavor of the results. The Appendix contains two deferred proofs.
We finally mention that, though not developed in detail, most of the analysis of the present paper carries over to fluid models with Brownian noise, which have received some recent attention; see in particular Gaver & Lehoczky [20], Kennedy & Williams [25], Asmussen [8], Rogers [35] and Karandikar & Kulkarni [28]. This means that on intervals where J_t = i, {S_t} evolves as a Brownian motion with drift r_i and variance constant σ_i² depending on i. In some cases, the formulations have, however, to be slightly changed. In particular, the above definition of a busy period becomes trivial (P_i ≡ 0), so that instead one has to start the busy period at some level x > 0.

Preliminaries
As in [8], we represent {V_t} as the reflected version of the net input process {S_t}:

V_t = S_t − min_{0≤u≤t} S_u.   (2.1)

In particular, this means that {S_t} is a continuous Markov additive process defined on an irreducible Markov jump process {J_t} with a finite state space E (see e.g. Çinlar [16]).
An illustration of the connection between {V_t} and {S_t} is given in Figure 2.1. This figure also shows another fundamental tool of the paper (as well as of [8] and papers like [11], [35], [25]), the two Markov processes {J_{τ_+(x)}}_{x>0} and {J_{τ_−(x)}}_{x>0}, which are obtained by observing {J_t} when {S_t} is at a maximum or minimum. Here

τ_+(x) = inf{t > 0 : S_t = x},   τ_−(x) = inf{t > 0 : S_t = −x}

are the first passage times to levels x > 0, resp. −x < 0.
The slopes r_i are ordered in the natural way, so that r_i > 0 for i ∈ E_+ and r_i < 0 for i ∈ E_−. Let Λ = (λ_ij)_{i,j∈E} denote the intensity matrix for {J_t} and write for brevity λ_i = −λ_ii for the rate parameter of the exponential holding time in state i. We let T denote the matrix with ijth element λ_ij/|r_i|. Using the convention that for a given E-, E_+- or E_−-vector s = (s_i), Δ_s denotes the diagonal matrix with the s_i on the diagonal, we can write T = Δ_{|r|}^{−1}Λ. We shall also use block-partitioned notation like

Λ = ( Λ^(++)  Λ^(+−) ; Λ^(−+)  Λ^(−−) ),   T = ( T^(++)  T^(+−) ; T^(−+)  T^(−−) ).

When we write say Δ_r^{−1}Λ^(++), the convention is that dimensions should match; i.e., Δ_r is E_+ × E_+ with the r_i, i ∈ E_+, on the diagonal. Similarly, the identity matrix I, the ith unit column vector e_i and the column vector e with all entries equal to 1 may have indices in E, E_+ or E_− depending on the context.
For later reference, we quote also the Wiener-Hopf factorization identity ([11]).

We now introduce the time-reversed version {(J̃_t, S̃_t)} of the Markov additive process {(J_t, S_t)}. We can write Λ̃ = Δ_π^{−1}Λ'Δ_π (i.e. the matrix with elements λ̃_ij = π_j λ_ji / π_i); note that Λ̃ has the same stationary distribution π and the same rates r_i. Thus {J̃_t} is defined in terms of Λ̃ rather than Λ, and {S̃_t} is defined as {S_t} with the same rates r_i but {J_t} replaced by {J̃_t}. Further let

M(t) = sup_{0≤u≤t} S_u,   M = sup_{t≥0} S_t,

and similarly for the time-reversed process.

Proposition 2.1 ([8]): π_j P̃_j(J̃_t = i, Ṽ_t ∈ A) = π_i P_i(J_t = j, M(t) ∈ A), and P(V ∈ A, J = i) = π_i P̃_i(M̃ ∈ A). Notation like P̃_i refers in an obvious way to {(J̃_t, S̃_t)}. In particular, since clearly {M(t) ≥ x} = {τ_+(x) ≤ t}, Proposition 2.1 yields a corresponding identity for the first passage times τ_+(x).

Recall that a distribution F on [0, ∞) is phase-type with phase generator U and initial vector a if F is the distribution of the lifetime of a Markov process which has initial distribution a and intensity matrix U, cf. [33]; if the mass ae of a is less than one, we adopt the convention that this corresponds to an atom of size 1 − ae at 0. From the above discussion, it follows immediately that (as shown in [8]) the distribution of the steady-state variable (J, V) is phase-type given J = i, with phase generator U^(+) and initial vector a_i^(−+) for i ∈ E_−, and e_i' for i ∈ E_+. More precisely, for i ∈ E_−,

P(V > x, J = i) = π_i a_i^(−+) e^{U^(+)x} e,   (2.9)
P(V = 0, J = i) = π_i (1 − a_i^(−+) e);   (2.10)

for i ∈ E_+, just replace a_i^(−+) by e_i'.
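For numerical work, the tail in (2.9) is just a phase-type tail a e^{Ux} e, which a matrix exponential evaluates directly. A minimal sketch with a made-up initial vector and phase generator (not the paper's a^(−+), U^(+)):

```python
import numpy as np
from scipy.linalg import expm

def phase_type_tail(a, U, x):
    """Tail P(tau > x) = a exp(Ux) e of a phase-type distribution
    with initial vector a and phase generator U, cf. (2.9)."""
    return float(a @ expm(U * x) @ np.ones(len(a)))

# Hypothetical two-phase example; the mass 1 - a.e = 0.1 is an atom at 0, cf. (2.10).
a = np.array([0.6, 0.3])
U = np.array([[-3.0, 1.0],
              [ 0.5, -2.0]])
tail0 = phase_type_tail(a, U, 0.0)   # equals a.e (about 0.9)
```

The same routine evaluates the steady-state tail P(V > x, J = i) once multiplied by π_i, with U^(+) and a_i^(−+) in place of the stand-ins above.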

The Mean Busy Period
Let ℘ be the matrix with ijth entry ℘_ij = E_i[P_i; J_{P_i} = j]; then ℘e is the vector with ith entry E_i P_i. We shall show that once a^(+−) has been evaluated, the entries of ℘ and ℘e can be computed as the solution of linear equations. We start with the case of ℘e, which may be worthwhile treating separately because we get matrices of lower dimension than for ℘.
Proof: We use a decomposition of the path {S_t}_{0≤t≤P_i} as indicated in Figure 3.1. Here ω = inf{t > 0 : J_t ∈ E_−}, so that ω is phase-type with representation (Λ^(++), e_i') w.r.t. P_i.
The post-ω path can be split up into two types of intervals, the first being intervals where {S_t} is at a relative minimum and the second being sub-busy cycles (two on Figure 3.1; marked by bold lines on the time axis). Let the total lengths of intervals of the two types be ω_1, ω_2. If J_ω = j, S_ω = x, the values of {J_t} observed on the ω_1-segment are distributed as {J̃_{τ_−(u)}}_{0≤u≤x} starting from J̃_{τ_−(0)} = j. Hence, the expected time in state k ∈ E_− on the ω_1-segment is

∫_0^x e_j' e^{U^(−)y} e_k / |r_k| dy.

In particular, using (3.1) and (2.4), we can evaluate E_i[ω_1; J_ω = k]. Now when J_t = k on the ω_1-segment, a sub-busy period of type ℓ ∈ E_+ occurs at rate λ_kℓ. Hence, noting that E_i P_i = E_i(ω + ω_1 + ω_2), collecting terms and rewriting in matrix notation, the result follows by easy algebra.
Remark 3.2: In [9], a somewhat similar argument is carried out in branching process language. As was kindly pointed out by Dr. S. Grishechkin, the process in question is not a branching process in the strict sense (some of the required independencies fail). However, the argument for expected values is correct.

Now consider the more general case of ℘.
Proof: We distinguish between the possibilities that {J_t} has a state transition in [0, dt/r_i), to k (say), or not. If k ∈ E_− in the first case, the busy period will terminate within time O(dt), so that the contribution from this is O((dt)²) ≈ 0. If k ∈ E_+, a sub-busy period starts from k and coincides with P_i up to O(dt) terms. Thus, the total contribution to ℘_ij from state transitions in [0, dt/r_i) is

A_1 = Σ_{k∈E_+, k≠i} λ_ik (dt/r_i) ℘_kj.

In the second case, there are three contributions: the one A_2 from the initial segment; the one A_3 from the sub-busy period starting at time dt/r_i in level V_{dt/r_i} = dt and ending at the next downcrossing of level dt; and the one A_4 from the final segment after this downcrossing.

The length of the initial segment is dt/r_i, and up to O((dt)²) terms, it provides a contribution if the sub-busy period ends in state j; thus A_2 = (dt/r_i) a_ij^(+−). Let k ∈ E_− denote the state in which level dt is downcrossed by the sub-busy period. If k ≠ j, a contribution to A_3 can occur in two ways, either by a transition to j before time dt/|r_k| or by a jump to some ℓ ∈ E_+, in which case the following (second) sub-busy period must terminate in state j, which occurs w.p. a_ℓj^(+−). This second possibility also occurs if k = j, but then there is in addition a contribution from the event that no transition out of j occurs before time dt/|r_j| after the downcrossing. Finally, decomposing A_4 as a contribution from a second sub-busy period and a passage to level 0 without state transitions yields the remaining terms.

Writing ℘_ij = A_1 + (1 − λ_i dt/r_i)(A_2 + A_3 + A_4), subtracting ℘_ij from both sides and dividing by dt, we obtain a system of linear equations; absorbing the fifth term into the first sum as the k = i term and rewriting in matrix notation, the result follows.
Note that the matrix identity in Theorem 3.3 is of dimension E_+ × E_− and depends linearly on the elements ℘_ij of ℘, so that indeed we have |E_+| · |E_−| unknowns and as many linear equations.
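Concretely, linear matrix identities of this kind are solved by vectorization. A minimal sketch, assuming the identity has been arranged in the Sylvester form AX + XB = C (the coefficient matrices below are made-up stand-ins, not those of Theorem 3.3):

```python
import numpy as np

# Hypothetical coefficients; in an application they would be assembled from
# the model's rates and the matrices of Theorem 3.3 (Sylvester form assumed).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])      # acts on the E_+ index
B = np.array([[1.0]])           # acts on the E_- index
C = np.array([[1.0],
              [2.0]])           # E_+ x E_- right-hand side

# Vectorize AX + XB = C: (I (x) A + B' (x) I) vec(X) = vec(C), column-major vec.
n, m = C.shape
M = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
X = np.linalg.solve(M, C.flatten(order="F")).reshape((n, m), order="F")
```

The Kronecker system has |E_+| · |E_−| unknowns and as many equations, matching the count noted above.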
Define the busy cycle C_i starting from J_0 = i, V_0 = S_0 = 0 as C_i = inf{t > P_i : S_t > 0} = inf{t > P_i : J_t ∈ E_+}.

Proposition 3.4: E_i C_i = E_i P_i + e_i' a^(+−) (−Λ^(−−))^{−1} e.

Proof: Obviously, the idle period C_i − P_i is phase-type with phase generator Λ^(−−). The initial vector is the distribution of J_{P_i}, which is just a_i^(+−). Thus, the result follows from general formulas for the mean of phase-type distributions. □

The busy cycles C_i are not regenerative for {(V_t, J_t)} but semi-regenerative. A proper regenerative cycle C is obtained by fixing i ∈ E_+ and adding up cycles until a second cycle of type i occurs; that is, C = inf{t > 0 : V_t = 0, J_t = i}.

Proposition 3.5: (3.2) holds, with η given by (3.3).

Proof: Consider the E_+-valued discrete-time Markov chain obtained by observing {J_t} just after the beginnings of busy cycles, and let η = (η_j)_{j∈E_+} denote its stationary distribution.
Then (3.2) holds by general results on semi-regenerative processes ([5], p. 228), so we only have to verify the asserted expression for η_j. Consider a large time interval [0, T]. Conditioning upon the state ℓ ∈ E_− of {J_t} just before a busy cycle starts from j ∈ E_+ shows that the expected number of such cycles is asymptotically proportional to the expression in (3.3); cf. (2.1). Hence the proportion of busy cycles starting from j among all busy cycles is approximately given by (3.3), and from this the result follows by letting T → ∞ (the argument is essentially "conditional PASTA", cf. [18]).

We first introduce a suitable matrix generalization of the m.g.f. Define F_t as the measure-valued matrix with ijth entry F_t[i, j; x] = P_i[S_t ≤ x; J_t = j], and F̂_t[s] as the matrix with ijth entry F̂_t[i, j; s] = E_i[e^{sS_t}; J_t = j] (thus, F̂_t[s] may be viewed as the matrix m.g.f. of F_t defined by entrywise integration). Let further K[s] = Λ + sΔ_r. Since obviously F̂_t[s] is strictly positive (and defined for all real s), it follows that K[s] has a simple and unique eigenvalue κ(s) with maximal real part, such that the corresponding left and right eigenvectors ν^(s), h^(s) may be taken with strictly positive components. We shall use the normalization ν^(s)e = ν^(s)h^(s) = 1. Note that since K[0] = Λ, we have κ(0) = 0, ν^(0) = π, h^(0) = e.
The following result, which is proved in the Appendix, shows that the function κ(s) plays the role of an appropriate generalization of the cumulant g.f., as well as showing how to compute the asymptotic mean and variance directly from the model parameters; here D is the matrix (Λ − eπ)^{−1}.
The function κ(s) is finite for all s, has κ'(0) < 0 (cf. (1.1), (4.1)) and converges to ∞ as s → ∞. In particular, a γ > 0 with κ(γ) = 0, a γ_0 > 0 with κ'(γ_0) = 0 and (for y > 1/max_{i∈E} r_i) an α_y > γ_0 with κ'(α_y) = 1/y exist; see Figure 4.1. Since γ plays a special role, we write h = h^(γ). Then for any i,

E_i[e^{sS_t} h^(s)_{J_t}] = e^{tκ(s)} h_i^(s).

Proof: For the first assertion, just note that e^{−tκ(s)} e^{sS_t} h^(s)_{J_t} is a martingale. It follows from the results below (e.g. Theorem 5.1) that a large value of h_i can be interpreted as i being a state such that, starting from J_0 = i, {V_t} grows rapidly in its initial phase (before {J_t} reaches equilibrium).
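Numerically, γ and h^(γ) are easy to obtain: γ is the positive zero of the Perron root s ↦ κ(s) of K[s] = Λ + sΔ_r. A minimal sketch (the two-state Λ and r are illustrative assumptions, not the paper's example):

```python
import numpy as np
from scipy.optimize import brentq

def kappa(s, Lam, r):
    """Perron root kappa(s) of K[s] = Lam + s*diag(r)."""
    return max(np.linalg.eigvals(Lam + s * np.diag(r)).real)

# Hypothetical two-state model with negative stationary drift, so gamma > 0 exists:
# state 0 fills at rate r_0 = 1, state 1 drains at rate r_1 = -2.
Lam = np.array([[-2.0, 2.0],
                [ 1.0, -1.0]])
r = np.array([1.0, -2.0])

# kappa(0) = 0 and kappa'(0) < 0, so bracket the second zero away from 0.
gamma = brentq(lambda s: kappa(s, Lam, r), 1e-3, 50.0)

# Right eigenvector h^(gamma) of K[gamma] for the eigenvalue kappa(gamma) = 0.
K = Lam + gamma * np.diag(r)
w, V = np.linalg.eig(K)
h = np.abs(V[:, np.argmax(w.real)].real)
```

By convexity of κ, the bracketed zero is unique, so a scalar root-finder suffices; any eigenvalue routine then delivers h^(γ) as a by-product, in the spirit of the remark on Elsner's algorithm below.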
For the time-reversed process {(J̃_t, S̃_t)}, K̃[s] = Δ_π^{−1} K[s]' Δ_π. From this it follows easily that κ̃ = κ (in particular, γ, γ_0, etc. remain the same), whereas h̃^(s) = Δ_π^{−1} ν^(s)', ν̃^(s) = h^(s)' Δ_π. A large value of h̃_i can be interpreted as i being a state in which {J_t} is likely to be found for large values of V_t, cf. e.g. Corollary 4.7 and Theorem 7.1 below.

Likelihood Ratio Identities
We now turn to the construction of an exponential family of fluid models, such that the θ member has a changed intensity matrix Λ(θ) = (λ_ij(θ))_{i,j∈E} but is otherwise unchanged (in particular, the r_i are the same), and such that the case θ = 0 corresponds to the given process, i.e. Λ(0) = Λ.
The relevant choice turns out to be

Λ(θ) = Δ_{h^(θ)}^{−1} (K[θ] − κ(θ)I) Δ_{h^(θ)};

that is, λ_ij(θ) = λ_ij h_j^(θ)/h_i^(θ) for i ≠ j, and λ_ii(θ) = λ_ii + θr_i − κ(θ).

The stationary distribution π^(θ) is given by π_i^(θ) = ν_i^(θ) h_i^(θ).

Proof: Since the off-diagonal elements of Λ(θ) are non-negative, it suffices to verify Λ(θ)e = 0. But

Λ(θ)e = Δ_{h^(θ)}^{−1}(K[θ] − κ(θ)I)h^(θ) = Δ_{h^(θ)}^{−1}(κ(θ)h^(θ) − κ(θ)h^(θ)) = 0.

Similarly, the components of ν^(θ)Δ_{h^(θ)} are non-negative and sum to one in view of ν^(θ)h^(θ) = 1.

The idea behind the likelihood ratio method is basically to change the mean drift of {S_t} from negative to positive values, thereby giving rare events like {τ_+(u) < ∞} P_{θ,i}-probability one. The following result shows that this is attained for θ ≥ γ_0. Let κ_θ(·) denote the cumulant g.f. for the P_{θ,i}-process; then κ_θ(α) = κ(α + θ) − κ(θ), from which (a) follows, and differentiating w.r.t. α and letting α ↓ 0 yields (b).
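A quick numerical sanity check of this construction: build Λ(θ) from the Perron eigendata of K[θ] and verify that it is a proper intensity matrix. The two-state Λ, r and the value θ = 0.8 are made up for the illustration:

```python
import numpy as np

def tilt(Lam, r, theta):
    """Lambda(theta) = Dh^{-1} (K[theta] - kappa(theta) I) Dh, where
    kappa(theta) is the Perron root of K[theta] = Lam + theta*diag(r)
    and h the corresponding (positive) right eigenvector."""
    K = Lam + theta * np.diag(r)
    w, V = np.linalg.eig(K)
    k = np.argmax(w.real)
    kappa, h = w[k].real, np.abs(V[:, k].real)
    return np.diag(1.0 / h) @ (K - kappa * np.eye(len(r))) @ np.diag(h)

Lam = np.array([[-2.0, 2.0],
                [ 1.0, -1.0]])   # hypothetical intensity matrix
r = np.array([1.0, -2.0])
Ltheta = tilt(Lam, r, 0.8)
# Rows of a proper intensity matrix sum to zero:
row_sums = Ltheta @ np.ones(2)
```

The off-diagonal entries λ_ij h_j/h_i stay non-negative and the rows sum to zero exactly as in the proof, so Λ(θ) is again the generator of an irreducible Markov jump process.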
Now let P_{θ,i} be the governing probability measure for the fluid model which is governed by Λ(θ) and the r_i and has initial environment J_0 = i. In the Appendix, we show the following likelihood ratio identity, which is our fundamental tool in what follows. Its origin is results for discrete-time Markov additive processes obtained by Bellman [12], Tweedie [41] and Miller [32]; a similar identity for Markov-modulated M/G/1 queues is exploited in Asmussen [6] (cf. also Asmussen & Rolski [10]). For a survey of the likelihood ratio method for simple queues and random walks, see [5], Ch. XII.
Here is a first quick application of the likelihood ratio method. Applying the identity (in its time-reversed version) with τ = τ̃_+(u), G = {τ̃_+(u) < ∞}, noting that S̃_{τ̃_+(u)} = u by the skip-free property, and taking θ = γ (so that κ(γ) = 0), (2.9) yields

P(V > u, J = i) = π_i P̃_i(τ̃_+(u) < ∞) = π_i h̃_i e^{−γu} Ẽ_{γ;i}[1/h̃_{J̃_{τ̃_+(u)}}].   (4.5)

From this the corollary follows by trivial estimates. □

Compared to the exact solutions in the literature, the advantage of Corollary 4.7 is of course that fewer computations are required. For example, we can compute γ by Elsner's algorithm ([33]), which automatically gives us h^(γ) as well. In queueing theory, corresponding inequalities for the GI/G/1 queue have been derived by Kingman and Ross (see, e.g., the survey in Stoyan [37]).
In fact, the argument in the proof of Corollary 4.7 can be strengthened to show that P(V > u, J = i) is asymptotically exponential. This fact follows of course from the phase-type form of P(V > u, J = i) (see [33]), but we shall give the result anyway, since the proof is short given some auxiliary results (that are needed below for other purposes), and since the form of the constants which come out in this way is more suitable for comparison with other results of the paper.

Cycle Maxima and Rare Events
The distribution of the cycle maximum

M_V(P_i) = sup_{0≤t≤P_i} V_t = sup_{0≤t≤P_i} S_t

is of interest for a variety of reasons: if x is the buffer size, P(M_V(P_i) > x) can be interpreted as the probability of buffer overflow within a busy cycle; and the set of P_i-distributions of M_V(P_i) leads to the extreme value behavior of {V_t}, as explained below.
Proof: In just the same way as in (4.4), we have

P_i(M_V(P_i) > u) = P_i(τ_+(u) < P_i) = e^{−γu} E_{γ;i}[h_i/h_{J_{τ_+(u)}}; τ_+(u) < P_i],

using P_{γ;i}(τ_+(u) = ∞) = 0 in the last step. Thus the result follows with the preliminary expression D_i = P_{γ;i}(P_i = ∞) lim_{u→∞} E_{γ;i}[h_i/h_{J_{τ_+(u)}}] for D_i. But by Lemma 4.8, P_{γ;i}(P_i < ∞) can be expressed in terms of a^(+−)(γ) and related matrices, and as in the proof of Corollary 4.9, the limit in (5.2) can be identified. □

The study of cycle maxima in queueing theory was initiated by Takács [40], who found the exact distribution for the M/G/1 queue (for a simple proof of his result, see Asmussen & Perry [9]). For fluid models, one can as in [9] find a representation of the exact distribution of M_V(P_i) in terms of the lifetime of a non-homogeneous Markov process, the time-dependent intensities of which can be expressed in terms of the matrices a^(−+), a^(+−), U^(−), but we shall not give the details.
The GI/G/1 analog of Theorem 5.1 was obtained by Iglehart [24] and extended to more general queues in [9].
We shall now apply Theorem 5.1 to rare events analysis. To this end, we need first to translate Theorem 5.1 into a similar statement on the maximum M_V(C) of {V_t} within the regenerative cycle C defined in Section 3. Now let τ_V(u) = inf{t > 0 : V_t ≥ u} be the first occurrence of the rare event {V_t ≥ u}.

Corollary 5.3: As u → ∞, e^{−γu} τ_V(u) is asymptotically exponential, with rate parameter expressible in terms of the constants D_i; that is, the stated limit holds for all x ≥ 0 and all j ∈ E.

Proof: This follows immediately from Lemma 5.2 and standard results on rare events in regenerative processes (e.g. Gnedenko). Next consider the extreme value M_V(T). Note that (by general regenerative process theory) Corollaries 5.3 and 5.4 hold for arbitrary starting values J_0 = i, V_0 = x.
The GI/G/1 version of Corollary 5.4 was proved in [24] and extended to more general queues in [9].

Proof of Theorem 6.1: Assertions (a)-(e) follow by easy combinations of Lemma 6.2 and the law of large numbers for Markov processes. For example, one considers occupation ratios of the form

(1/T) ∫_0^T I(J_t = k) dt.

Letting T = τ_+(u) and ξ(u) = I(|N_{kj}(u)/T_k(u) − λ_kj| > ε), the assumptions of Lemma 6.2 are satisfied with a = 0, and (c) follows. For (f), let similarly ξ(u) = I(|τ_+(u)/u − 1/κ'(γ)| > ε), and appeal to Lemma 7.2 below. □

Theorem 6.1 may be seen as an analog of GI/G/1 results of Asmussen [4] (see in particular Theorem 5.1 of that paper). See also Anantharam [1].
The result has implications for quick simulation. Assume we want to estimate the probability P_i(τ_+(u) < P_i) of buffer overflow within a busy cycle by simulation. The crude Monte Carlo method has the typical problem of rare events simulation (a low relative precision, so that an excessive number of replications is needed), and thus we may want to speed up the simulation by a change of measure. Formally, the simulation can be seen as picking a point at random from the probability space (Ω, ℱ, P), where Ω is the set of all sample paths {(J_t, V_t)}_{0≤t≤τ_+(u)∧P_i}, ℱ the obvious σ-field and P the restriction of P_i to (Ω, ℱ). The change of measure amounts to simulating from a different P̂, i.e. to using importance sampling, and by general results from that area, the optimal P̂ is given by P_i(· | τ_+(u) < P_i). This choice is not practicable, one among many reasons being that the likelihood ratio involves the unknown probability P_i(τ_+(u) < P_i). However, by Theorem 6.1, P_i(· | τ_+(u) < P_i) ≈ P_{γ;i}(· | τ_+(u) < P_i), which suggests simulating simply from P_{γ;i}; dropping the conditioning involves no asymptotic loss of efficiency, since the P_{γ;i}-probability of the conditioning event {τ_+(u) < P_i} has a strictly positive limit (viz., P_{γ;i}(P_i = ∞)), in contrast to what is the case for P_i. The corresponding simulation estimator is

e^{−γu} (h_i/h_{J_{τ_+(u)}}) I(τ_+(u) < P_i).

Obviously, its P_{γ;i}-variance is O(e^{−2γu}), i.e. of the same order of magnitude as P_i(τ_+(u) < P_i)² (this is roughly the optimality criterion used in Chang et al. [15]).
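To make the suggestion concrete, here is a minimal simulation sketch (illustrative only, not the paper's example): the net input process is simulated under the tilted generator Λ(γ) = Δ_h^{−1} K[γ] Δ_h (κ(γ) = 0), and the estimator e^{−γu} h_i/h_{J_{τ_+(u)}} I(τ_+(u) < P_i) is averaged. The two-state generator and rates, and the precomputed values γ = 1.5, h = (4, 1), are assumptions made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-state model: state 0 fills (r_0 = 1), state 1 drains (r_1 = -2).
Lam = np.array([[-2.0, 2.0],
                [ 1.0, -1.0]])
r = np.array([1.0, -2.0])

# For this Lam, r one checks kappa(gamma) = 0 at gamma = 1.5,
# with K[gamma] h = 0 for h = (4, 1), K[gamma] = Lam + gamma*diag(r).
gamma, h = 1.5, np.array([4.0, 1.0])
K = Lam + gamma * np.diag(r)
assert np.allclose(K @ h, 0.0)

# Tilted generator Lambda(gamma) = Dh^{-1} K[gamma] Dh.
Lg = np.diag(1.0 / h) @ K @ np.diag(h)

def overflow_estimate(i0, u, n_rep=4000):
    """Importance-sampling estimate of P_i(tau_+(u) < P_i): simulate under
    Lambda(gamma), weight overflow paths by exp(-gamma*u)*h[i0]/h[J_tau]."""
    total = 0.0
    for _ in range(n_rep):
        j, S = i0, 0.0
        while True:
            hold = rng.exponential(1.0 / -Lg[j, j])
            S_new = S + r[j] * hold
            if r[j] > 0 and S_new >= u:      # skip-free upcrossing of level u
                total += np.exp(-gamma * u) * h[i0] / h[j]
                break
            if r[j] < 0 and S_new <= 0.0:    # busy period ends before overflow
                break
            S = S_new
            p = np.maximum(Lg[j], 0.0)
            p[j] = 0.0
            j = rng.choice(len(p), p=p / p.sum())
    return total / n_rep

est = overflow_estimate(i0=0, u=2.0)
```

Under Λ(γ) the drift is positive, so every run hits either u or 0 in finite time and the loop terminates; the weight h_{i0}/h_{J_{τ_+(u)}} corrects for the change of measure.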
Estimation of the steady-state probability P(V > u, J = i) can be carried out in a similar way by simulating {(J̃_t, S̃_t)} from P̃_{γ;i} and using the estimator

e^{−γu} π_i h̃_i / h̃_{J̃_{τ̃_+(u)}},

cf. (2.8). There is a straightforward analog of Theorem 6.1 for that setting too. When estimating P(V_T > u, J_T = i), the results of the next section suggest to simulate {(J_t, S_t)} from P_{α_y;i}, where y = T/u.
Note that most of the literature on rare events simulation (e.g. Bucklew, Ney & Sadowski [14], Parekh & Walrand [34] and Cottrell, Fort & Malgouyres [17]) takes a somewhat different approach via the general theory of large deviations. For fluid models, see in particular Kesidis & Walrand [26].
For the proof, we shall need some lemmas.
Letting u → ∞ in the first limit and noting that S̃_{τ̃_+(u)} = u yields the first assertion, and (7.2) then follows by applying Anscombe's theorem to the second limit. □

Lemma 7.3: τ̃_+(u) and J̃_{τ̃_+(u)} are asymptotically independent.

Proof: Easy along the lines of the proof of Stam's lemma in [5], pp. 271-272.

The main difficulty in making the proof precise is that one needs a sharpened version of the CLT for τ_+(u) (basically a local CLT with remainder term). However, (7.7) also needs a more rigorous proof.
The form of the above results originates from classical collective risk theory, a setting which is mathematically equivalent to the M/G/1 queue. Thus Theorem 7.1 was proved in that framework by Segerdahl [36], whereas Theorem 7.6 goes back to Gerber [21] (in the setting of [21], Theorem 7.6 takes the form of an exact inequality) and (7.6) to Arfwedson [3]. The present proof appears to use less information than is inherent in the definition of α_y, γ_y. However, as in [21], this definition will produce the maximal γ_y for which the argument works. The idea behind the choice of α_y is essentially the saddlepoint method: to make E_{α_y;i} τ_+(u) ≈ T = yu.
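Numerically, the saddlepoint value α_y solves κ'(α_y) = 1/y, which a scalar root-finder handles with a finite-difference derivative of the Perron root. A minimal sketch (the two-state Λ, r and the horizon ratio y are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import brentq

def kappa(s, Lam, r):
    """Perron root of K[s] = Lam + s*diag(r)."""
    return max(np.linalg.eigvals(Lam + s * np.diag(r)).real)

def kappa_prime(s, Lam, r, eps=1e-6):
    """Central finite-difference approximation of kappa'(s)."""
    return (kappa(s + eps, Lam, r) - kappa(s - eps, Lam, r)) / (2 * eps)

Lam = np.array([[-2.0, 2.0],
                [ 1.0, -1.0]])   # hypothetical intensity matrix
r = np.array([1.0, -2.0])
y = 2.0                           # horizon T = y*u

# kappa is convex with kappa'(0) < 0, so kappa' is increasing: bracket and solve.
alpha_y = brentq(lambda s: kappa_prime(s, Lam, r) - 1.0 / y, 0.0, 10.0)
```

Since κ' increases from κ'(0) < 0 towards max_i r_i, a solution exists exactly when 1/y lies in that range, matching the condition y > 1/max_{i∈E} r_i mentioned earlier.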
Here is an estimate of the rate of convergence to the steady state which differs from (7.8) in fixing u and letting only T → ∞.

Theorem 7.9: Let γ_0 > 0 satisfy κ'(γ_0) = 0 and let δ = e^{κ(γ_0)}. Then

0 ≤ P(V > u, J = i) − P(V_T > u, J_T = i) ≤ C(γ_0) e^{−γ_0 u} δ^T.   (7.10)

Proof: Replace α_y by γ_0 and yu by T in the proof of Theorem 7.8.

The GI/G/1 version of Theorem 7.9 is due to [3]. We conjecture that the condition of stationary initial conditions for {J_t} is not critical for the rates in Theorems 7.8 and 7.9, and that (cf. standard relaxation time results for simple queues, e.g. [5], pp. 95, 262) the correct rate of convergence in Theorem 7.9 is of the order δ^T/T^{3/2}. Note that (7.10) can be seen as a limiting case of (7.4) (except for the constant 1/2 there).
Indeed, if we write T = yu with u fixed, we have y → ∞, which implies α_y ↓ γ_0 and

e^{−α_y u + Tκ(α_y)} ≈ e^{−γ_0 u + Tκ(γ_0)} = e^{−γ_0 u} δ^T.
That P_{γ_0;i}(J_{τ_−(u)} = j) has the asserted form is obvious. The case a < κ'(γ_0) is treated as above. □

The difficulty in applying Theorem 8.1 and Corollary 8.2 is that the explicit form of θ = κ^{−1}(a) is complicated, and that the matrices a^(−+)(θ) and U^(−)(θ) do not appear to simplify (except for the case θ = γ, cf. Lemma 4.8); see, however, the next section for a simple example. However, the less ambitious goal of computing the mean of τ_−(u) is attainable by a direct argument. Note that for u = 0, this has already been carried out in Section 3 by computing the matrix ℘.
Define M^(−)(u) : E_− × E_− as the matrix with ijth element E_i[τ_−(u); J_{τ_−(u)} = j], and M^(+)(u) : E_+ × E_− as the matrix with ijth element E_i[τ_−(u); J_{τ_−(u)} = j] for i ∈ E_+. It is easy to see how these matrices behave asymptotically; the following result gives an exact expression incorporating also information on the J_t component.
This leads to

M^(−)'(u) = M^(−)(u) U^(−) + e^{U^(−)u} M^(−)'(0),

and the solution of this differential equation subject to the obvious boundary condition M^(−)(0) = 0 is indeed (8.4). Differentiating the defining expectations at u = 0 immediately leads to the asserted expression for M^(−)'(0). Finally, for M^(+)(u) and i ∈ E_+, we decompose τ_−(u) as P_i + τ*_−(u), where τ*_−(u) has the same distribution as τ_−(u) started from J_{P_i}. Given J_{P_i} = k, the conditional probability of J_{τ_−(u)} = j is e_k' e^{U^(−)u} e_j, and the conditional expectation of τ*_−(u) I(J_{τ_−(u)} = j) is e_k' M^(−)(u) e_j. From this the asserted formula follows.
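The differential equation itself is easy to check numerically. In this sketch, U stands in for U^(−) and C for M^(−)'(0) (both made up; the closed form (8.4) is not reproduced here), and the ODE M' = MU + e^{Uu}C is integrated from M(0) = 0:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# Hypothetical subgenerator U (stand-in for U^(-)) and C (stand-in for M^(-)'(0)).
U = np.array([[-2.0, 1.0],
              [ 0.5, -1.5]])
C = np.array([[1.0, 0.2],
              [0.1, 0.8]])

def rhs(u, m):
    """Right-hand side of M'(u) = M(u) U + exp(Uu) C, flattened for the solver."""
    M = m.reshape(2, 2)
    return (M @ U + expm(U * u) @ C).ravel()

# Integrate from the boundary condition M(0) = 0 up to u = 1.
sol = solve_ivp(rhs, (0.0, 1.0), np.zeros(4), rtol=1e-10, atol=1e-12)
M1 = sol.y[:, -1].reshape(2, 2)
```

For a cross-check, the solution agrees with the integral representation ∫_0^u e^{Us} C e^{U(u−s)} ds, which satisfies the same equation and boundary condition.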
(10.3) To see that the two measures coincide, it suffices to consider the one-dimensional transition function, or equivalently the matrix m.g.f., i.e. to show that the corresponding identity holds for each pair i, j ∈ E. But the l.h.s. equals E_{θ;i}[e^{sS_T}; J_T = j], which is the same as the r.h.s.; cf. the proof of Proposition 4.5.
It follows that the identity extends to E_{θ;i}[I(G)·] for G ∈ ℱ_T. In particular, if G ∈ ℱ_τ with G ⊆ {τ ≤ T}, we have G ∈ ℱ_T, and the extended identity applies, using (10.3). Now consider a general G ∈ ℱ_τ. Then G_T = G ∩ {τ ≤ T} satisfies G_T ∈ ℱ_τ, G_T ⊆ {τ ≤ T}. Thus, according to what has just been proved, (4.3) holds with G replaced by G_T. Letting T ↑ ∞ and using monotone convergence then shows that (4.3) holds for G as well.

Change of Measure via Exponential Families
where H has the extreme value distribution P(H ≤ x) = e^{−e^{−x}}.

Proof: By Lemma 5.2 and [9], Corollary 10.1.