Time reversal of Volterra processes driven stochastic differential equations

We consider stochastic differential equations driven by Volterra processes. Under time reversal, these equations are transformed into past-dependent stochastic differential equations driven by a standard Brownian motion. We are then in a position to derive existence and uniqueness of solutions of the Volterra driven SDEs considered at the beginning.


Introduction
Fractional Brownian motion (fBm) is one of the first examples of a process which is not a semi-martingale and for which we aim to develop a stochastic calculus. That means we want to define a stochastic integral and solve stochastic differential equations driven by such a process. From the very beginning of this program, two approaches have coexisted. One approach is based on the sample-path properties of fBm, mainly its Hölder continuity or its finite p-variation. The other relies on the Gaussianity of fBm. The former is mainly deterministic and was initiated by Zähle [41], Feyel and de la Pradelle [12], and Russo and Vallois [31,32]. Then came the notion of rough paths introduced by Lyons [22], whose application to fBm relies on the work of Coutin and Qian [4]. These works have been extended in the subsequent works [20,21,3,15,14,23,26,27,16,25,8]. A new way of thinking came with the independent but related works of Feyel and de la Pradelle [13] and Gubinelli [17]. The integral with respect to fBm was shown to exist as the unique process satisfying some characterization (analytic in the case of [13], algebraic in [17]). As a byproduct, this showed that almost all the integrals existing in the literature coincide, since they all satisfy these two conditions. Behind each approach but the last two is the construction of an integral defined for a regularization of fBm; the whole work is then to show that, under some convenient hypotheses, the approximate integrals converge to a quantity which is called the stochastic integral with respect to fBm. The main tool to prove the convergence is either integration by parts in the sense of fractional deterministic calculus, or enrichment of the fBm by some iterated integrals proved to exist independently or by analytic continuation [37,36].
In the probabilistic approach [7,6,5,9,19,30,29,2,1], the idea is also to define an approximate integral and then prove its convergence. It turns out that the key tool here is integration by parts in the sense of Malliavin calculus.
In dimension greater than one, with the deterministic approach, one knows how to define the stochastic integral and prove existence and uniqueness of solutions of fBm driven SDEs for fBm with Hurst index greater than 1/4. Within the probabilistic framework, one knows how to define a stochastic integral for any value of H, but one cannot prove existence and uniqueness of solutions of SDEs, whatever the value of H. The primary motivation of this work was to circumvent this problem.
In [7,9], we defined stochastic integrals with respect to fBm as a "damped-Stratonovitch" integral with respect to the underlying standard Brownian motion. This integral is defined as the limit of Riemann-Stratonovitch sums, the convergence of which is proved after an integration by parts in the sense of Malliavin calculus. Unfortunately, this manipulation generates non-adaptedness: Formally the result can be expressed as Even if u is adapted (with respect to the Brownian filtration), the process $(s \mapsto K_t^* u(s))$ is anticipative. However, the stochastic integral process $(t \mapsto \int_0^t u(s) \circ \mathrm{d}B^H(s))$ remains adapted, so the anticipativeness is in some sense artificial. The motivation of this work is to show that, up to time reversal, we can work with adapted processes and Itô integrals. In what follows, there is no restriction on the dimension, but we need to assume that each component of $B^H$ is an fBm whose Hurst index is greater than 1/2.
Consider that we want to solve the equation where σ is a deterministic function whose properties will be fixed below. It turns out that it is essential to investigate the more general equations: The strategy is then the following: We will first consider the reciprocal problem: The first critical point is that when we consider $\{Z_{r,t} := Y_{t-r,t},\ r \in [0,t]\}$, this process solves an adapted, past-dependent stochastic differential equation with respect to a standard Brownian motion. Moreover, because $K_H$ is lower-triangular and sufficiently regular, the trace term vanishes in the equation defining Z. We have then reduced the problem to an SDE with coefficients dependent on the past, a problem which can be handled by the usual contraction methods.
This paper is organized as follows: After the preliminaries of Section 2, we address, in Section 3, the problem of Malliavin calculus and time reversal. This part is interesting in its own right, since stochastic calculus of variations is a framework oblivious to time. Constructing such a notion of time is achieved using the notion of resolution of the identity as introduced in [40]. We then introduce the second key ingredient, which is the notion of strict causality or quasi-nilpotence; see [42] for a related application. In Section 4, we show that solving Equation (B) reduces to solving a past-dependent stochastic differential equation with respect to a standard Brownian motion, see Equation (C) below. In Section 5, we prove existence, uniqueness and some properties of this equation. Technical lemmas are postponed to Section 6.

Malliavin calculus and time reversal
Our reference probability space is $\Omega = C_0([0,T]; \mathbb{R}^n)$, the space of $\mathbb{R}^n$-valued continuous functions, null at time 0. The Cameron-Martin space is denoted by H and is defined as $H = I^1_{0^+}(L^2([0,T]))$. In what follows, the space $L^2([0,T])$ is identified with its topological dual. We denote by κ the canonical embedding from H into Ω. The probability measure P on Ω is such that the canonical map $W : \omega \mapsto (\omega(t),\ t \in [0,T])$ defines a standard n-dimensional Brownian motion. A mapping φ from Ω into some separable Hilbert space H is called cylindrical if it is of the form
hal-00509900, version 1 -17 Aug 2010
For such a function we define $\nabla^W \varphi$ as where $\tilde v$ is the image of $v \in \Omega^*$ by the map $(I^1_{0^+} \circ \kappa)^*$. From the quasi-invariance of the Wiener measure [39], it follows that $\nabla^W$ is a closable operator on $L^p(\Omega; H)$, $p \ge 1$, and we will denote its closure with the same notation. The powers of $\nabla^W$ are defined by iterating this procedure. For $p > 1$, $k \in \mathbb{N}$, we denote by $\mathbb{D}_{p,k}(H)$ the completion of H-valued cylindrical functions under the following norm We denote by $\mathbb{L}_{p,1}$ the space $\mathbb{D}_{p,1}(L^p([0,T]; \mathbb{R}^n))$. The divergence, denoted by $\delta^W$, is the adjoint of $\nabla^W$: v belongs to $\operatorname{Dom}_p \delta^W$ whenever for any cylindrical φ, and for such a process v,
We first need to introduce the "time reversal" operator, denoted by $\tau_T$ and defined by: We introduced the temporary notation W for a standard Brownian motion to clarify the forthcoming distinction between a standard Brownian motion and its time reversal. Actually, the time reversal of a standard Brownian motion is also a standard Brownian motion and thus both of them "live" in the same Wiener space. We now make precise how their respective Malliavin gradients and divergences are linked. Consider  Note that $\Theta^{-1}$ The operator $\nabla = \nabla^B$ (respectively $\check\nabla = \nabla^{\check B}$) is the Malliavin gradient associated with a standard Brownian motion (respectively its time reversal). Since, we can consider $f(\check\omega(t_1), \dots, \check\omega(t_k))$ as a cylindrical function with respect to the standard Brownian motion.
As such its gradient is given by We thus have, for any cylindrical function F , Since $\Theta_T^* \mathbf{P} = \mathbf{P}$ and $\tau_T$ is continuous from $L^p$ into itself for any p, it is then easily shown that the spaces $\mathbb{D}_{p,k}$ and $\check{\mathbb{D}}_{p,k}$ (with obvious notations) coincide for any p, k and that (4) holds for any element of one of these spaces. Hence we have proved the following theorem:
Theorem 3.1. For any $p \ge 1$ and any integer k, the spaces $\mathbb{D}_{p,k}$ and $\check{\mathbb{D}}_{p,k}$ coincide. For any $F \in \mathbb{D}_{p,k}$ for some p, k, By duality, an analogous result follows for divergences.

Theorem 3.2. A process u belongs to the domain of δ if and only if $\tau_T u$ belongs to the domain of $\check\delta$ and then, the following equality holds: Proof. For $h \in L^2$, for cylindrical F , we have on the one hand: and on the other hand, Since this is valid for any cylindrical F , (5) holds for $h \in L^2$. Now, for u in the domain of divergence (see [28,39] where we have taken into account that $\tau_T$ is an involution. (5) is satisfied for any u in the domain of δ.
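The remark that the time reversal of a standard Brownian motion is again a standard Brownian motion can be illustrated on a grid. The following is only a minimal sketch: it assumes the usual conventions $\check B(t) = W(T) - W(T-t)$ and $\tau_T f(s) = f(T-s)$, which are not displayed in the text above, and the names `reverse_path` and `tau` are hypothetical. It checks that the increments of the reversed path are the original increments read backwards (hence independent Gaussian with the same variances) and that $\tau_T$ is an involution.

```python
import numpy as np

def reverse_path(W):
    """Reversed path on a uniform grid: B_check[j] = W[N] - W[N - j] (assumed convention)."""
    return W[-1] - W[::-1]

def tau(f):
    """Discrete time reversal tau_T f(s) = f(T - s) on a symmetric grid (assumed convention)."""
    return f[::-1]

rng = np.random.default_rng(0)
N, T = 1000, 1.0
dW = rng.normal(0.0, np.sqrt(T / N), size=N)   # independent Gaussian increments
W = np.concatenate(([0.0], np.cumsum(dW)))      # Brownian path sampled on the grid, W[0] = 0
B_check = reverse_path(W)

# The increments of the reversed path are the original increments in reverse order.
same_increments = np.allclose(np.diff(B_check), dW[::-1])
# Applying the time reversal twice gives back the original samples.
involution = np.array_equal(tau(tau(W)), W)
```

Since the reversed increments are a permutation of independent, identically distributed Gaussian increments, the discretized reversed path has the same law as the original one, which is the grid-level counterpart of the statement in the text.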
3.1. Causality and quasi-nilpotence. In anticipative calculus, the notion of trace of an operator plays a crucial role; we refer to [10] for more details on the trace.
Then, the trace of V is defined by It is easily shown that the notion of trace does not depend on the choice of the CONB.
Its mesh is denoted by $|\pi|$ and defined by $|\pi| = \sup_i |t_{i+1} - t_i|$. For $t \in \pi \setminus \{T\}$, $t^+$ is the least term of π strictly greater than t.
Causality plays a crucial role in what follows. The next definition simply formalizes, in terms of operators, the intuitive notion of causality.
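For a finite partition, the mesh and the successor $t^+$ are straightforward to compute. The helper names `mesh` and `successor` below are hypothetical and the partition is an arbitrary example; this is a small sketch rather than part of the construction.

```python
def mesh(pi):
    """|pi| = max_i |t_{i+1} - t_i| for a finite partition given as a sorted list."""
    return max(b - a for a, b in zip(pi, pi[1:]))

def successor(pi, t):
    """t^+: the least term of pi strictly greater than t (t must not be the last point)."""
    return min(s for s in pi if s > t)

pi = [0.0, 0.25, 0.5, 1.0]   # an example partition of [0, 1]
mesh_of_pi = mesh(pi)         # largest gap, here between 0.5 and 1.0
next_point = successor(pi, 0.25)
```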

Definition 3.4. A continuous map V from a Hilbert space H into itself is said to be E-causal if the following condition holds:
needs only the knowledge of f up to time t and not after. Unfortunately, this notion of causality is insufficient for our purpose and we are led to introduce the notion of strict causality as in [11].
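The intuitive content of causality can be seen on a discretization. The sketch below assumes that E-causality means $e_t V e_t = e_t V$ for every t, with $e_t$ the projection that keeps a function up to time t (the displayed condition is not reproduced above, so this form is an assumption), and the names `projection` and `is_causal` are hypothetical. On a grid, $e_t$ becomes a coordinate projection, and a matrix is causal in this sense exactly when its output at a given index does not use later inputs, that is, when it is lower-triangular.

```python
import numpy as np

def projection(n, k):
    """Discrete analogue of e_t: keep the first k of n grid values, zero out the rest."""
    P = np.zeros((n, n))
    P[:k, :k] = np.eye(k)
    return P

def is_causal(V, tol=1e-12):
    """Check e_t V e_t = e_t V at every grid time (assumed form of E-causality)."""
    n = V.shape[0]
    return all(
        np.allclose(projection(n, k) @ V @ projection(n, k),
                    projection(n, k) @ V, atol=tol)
        for k in range(n + 1)
    )

n = 6
rng = np.random.default_rng(1)
A = rng.normal(size=(n, n))
volterra = np.tril(A)          # lower-triangular: output at t uses the input up to t only
anticipating = np.triu(A, 1)   # strictly upper-triangular: output at t uses the future
```

In this discrete picture the identity matrix is causal, in line with the remark that the identity map is causal; strict causality is a finer property that this check does not capture.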
Definition 3.5. Let V be a causal operator. It is a strictly causal operator whenever for any ε > 0 there exists a partition π of [0, T ] such that for any π ′ ⊂ π, Note carefully that the identity map is causal but not strictly causal.
However, for γ > 0, we have the following result: Proof. Let π be any partition of [0, T ]. Assume $E = (e_{\lambda T},\ \lambda \in [0,1])$; the very same proof (replacing $t^+$ by $t^-$ and reordering bounds in the integrals) works for the other resolution of the identity mentioned. According to Hölder's inequality, we have: For any t ∈ π, Thus for any ε > 0, there exists η > 0 such that The proof is thus complete.
The importance of strict causality lies in the next theorem, which we borrow from [11]. Moreover, we have the following stability theorem. Consider the filtration $\mathcal{F}^E$ defined as , the notion of $\mathcal{F}^E$-adaptedness coincides with the usual one for the Brownian filtration, and it is well known that a process u is adapted if and only if $\nabla^W_r u(s) = 0$ for $r > s$. This result can be generalized to any resolution of the identity.
Unfortunately such a representation as an integral operator is not always available. We give here an algebraic proof to emphasize the importance of causality.
Proof. This is a purely algebraic lemma once we have noticed that (6) $\tau_T e_r = (\mathrm{Id} - e_{T-r})\,\tau_T$ for any $0 \le r \le T$.
For, it suffices to write We have to show that Use (7) again to obtain
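The commutation identity (6) between time reversal and the resolution of the identity can be checked numerically. This is a minimal sketch under stated assumptions: $e_r$ is taken to be multiplication by the indicator of $[0, r)$ and $\tau_T f(s) = f(T-s)$ (neither definition is displayed above), functions are sampled at cell midpoints so that $s \mapsto T - s$ maps the grid to itself, and `tau`, `e` and `identity_6_holds` are hypothetical names.

```python
import numpy as np

def tau(f):
    """Discrete time reversal tau_T f(s) = f(T - s) on a midpoint grid (assumed convention)."""
    return f[::-1]

def e(f, r, s):
    """e_r f = f * 1_{[0, r)} evaluated at the grid points s (assumed convention)."""
    return np.where(s < r, f, 0.0)

N, T = 200, 1.0
s = (np.arange(N) + 0.5) * T / N   # cell midpoints: reversing the array realizes s -> T - s
f = np.sin(7.0 * s) + s**2         # an arbitrary test function

def identity_6_holds(k):
    """Compare tau_T e_r f with (Id - e_{T-r}) tau_T f for r = k T / N."""
    r = k * T / N
    lhs = tau(e(f, r, s))
    rhs = tau(f) - e(tau(f), T - r, s)
    return np.array_equal(lhs, rhs)

all_ok = all(identity_6_holds(k) for k in range(N + 1))
```

Taking r on the cell boundaries keeps every grid point at least half a cell away from the cut-off, so the comparison is not sensitive to rounding and both sides agree entry by entry.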

Stochastic integration with respect to Volterra processes
In what follows, η belongs to (0, 1] and V is a linear operator. For any p ≥ 2, we set  I(p, η) holds. The Volterra process associated to V , denoted by $W^V$, is defined by

Definition 4.2.
We say that u is V-Stratonovitch integrable on [0, t] whenever the family $R_\pi(t, u)$, defined in (8), converges in probability as $|\pi|$ goes to 0. In this case the limit will be denoted by
For, we remark that
Note that in this last example we do not have an expression of V as a kernel operator. This justifies the operator-theoretic approach of the sequel.
The next theorem then follows from [7]. For the sake of completeness, we reproduce its proof.

Theorem 4.2.
Assume that Hypothesis I(p, η) holds. Assume that u belongs to $\mathbb{L}_{p,1}$. Then u is V-Stratonovitch integrable, and there exists a process, which we denote by $D^W u$, such that $D^W u$ belongs to $L^p(\mathbf{P} \otimes \mathrm{d}s)$ and

Moreover, for any r ≤ T , $e_r u$ is V-Stratonovitch integrable and
Proof. Since u belongs to $\mathbb{L}_{p,1}$, $\mathrm{d}\mathbf{P} \otimes \mathrm{d}r$-a.s., the map $(s \mapsto \nabla^W_r u_s)$ belongs to $L^p$. Then, Hypothesis I(p, η) entails that $(s \mapsto V(\nabla^W_r u)_s)$ is η-Hölder continuous. The map

is measurable, hence the process Then, we have The remainder of the proof of (9) follows the classical proof for convergence of Stratonovitch sums as exposed in [28].

Moreover, Let $P_k$ be the projection onto the span of the $\varphi_{k,m}$. Since $\nabla^W V u$ is of trace class, we have (see [34]) According to the proof of Theorem 4.2, the first part of the theorem follows. The second part is then a rewriting of (10).
There is another result from [7] which is worth quoting for the sequel.
where c does not depend on u.
We can then follow the approach given for the Stratonovitch integral as in [28] and show that we have a substitution formula. For p ≥ 1, let $\Gamma_p$ be the set of random fields: equipped with the semi-norms, for any compact K of $\mathbb{R}^m$.

Corollary 4.5. Assume that Hypothesis I(p, η) holds. Let $\{u(x),\ x \in \mathbb{R}^m\}$ belong to $\Gamma_p$. Let F be a random variable such that $((\omega, s) \mapsto u(\omega, s, F))$ belongs to $\mathbb{L}_{p,1}$. Then,
Proof. Simple random fields of the form

with $H_l$ smooth and $u_l$ in $\mathbb{L}_{p,1}$ are dense in $\Gamma_p$. In view of (12), it is sufficient to prove the result for such random fields. By linearity, we can reduce the proof to random fields of the form $H(x)u(\omega, s)$. Now for any partition π, On the other hand, According to Theorem 4.2, Eqn. (13) is satisfied for simple random fields.

Definition 4.3. For any
Proof. The map τ T A ⊗ B is of trace class if and only if for (h n , n ≥ 1) a CONB of since τ T is self-adjoint in L 2 . The first result follows. The second part follows by adjunction.
Corollary 4.7. Let $u \in \mathbb{L}_{2,1}$ be such that $\nabla^W \otimes \tau_T V u$ and $\nabla^W \otimes V \tau_T u$ are of trace class. Then, $\tau_T \nabla^W \otimes V u$ and $\nabla^W \tau_T \otimes V u$ are of trace class. Moreover, we have: Proof. For u simple, i.e., of the form the result follows from Lemma 4.6. Such random fields are dense in $\Gamma_p$ and, according to Theorem 4.3, the trace function is continuous on $\Gamma_p$; hence the result holds in full generality.

Theorem 4.8. Assume that Hypothesis I(p, η) holds. Let u belong to $\mathbb{L}_{p,1}$ and let
T −r Proof. We first study the divergence term. In view of Theorem 3.2, we have According to Theorem 3.8, $(V_T)^*$ is $\check E_0$-causal and according to 3.3, it is strictly $E_0$-causal. Thus, Theorem 3.7 implies that $\check\nabla V (e_t - e_r)\check u$ is of trace class and quasi-nilpotent. Hence Corollary 4.7 implies that is trace-class and quasi-nilpotent. Now, according to Theorem 3.1, we have According to Theorem 4.2, we have proved (14). ∇u is not regular enough for such an expression of the trace to be true. Moreover, there is absolutely no reason for $\check V_T \nabla u$ to be a kernel operator, so we cannot hope for such a formula. This is why we need to work with operators and not with kernels.

Volterra driven SDEs
Let G be the group of homeomorphisms of $\mathbb{R}^n$ equipped with the distance: We introduce a distance d on G by Then, G is a complete topological group. Consider the equations
Definition 5.1. By a solution of (A), we mean a measurable map such that the following properties are satisfied: For any $0 \le r \le T$, for any $x \in \mathbb{R}^n$, the processes $(\omega, t) \mapsto X_{r,t}(\omega, x)$ and $(\omega, t) \mapsto X^{-1}_{r,t}(\omega, x)$ belong to $\mathbb{L}_{p,1}$ for some p ≥ 2.
(3) For any $0 \le r \le s \le t$, for any $x \in \mathbb{R}^n$, the following identity is satisfied: $X_{r,s}(\omega, x))$.

Definition 5.2. By a solution of (B), we mean a measurable map
such that the following properties are satisfied:
(1) For any $0 \le r \le t \le T$, for any
(2) For any $0 \le r \le T$, for any $x \in \mathbb{R}^n$, the processes $(\omega, r) \mapsto Y_{r,t}(\omega, x)$ and $(\omega, r) \mapsto Y^{-1}_{r,t}(\omega, x)$ belong to $\mathbb{L}_{p,1}$ for some p ≥ 2.
(3) Equation (B) is satisfied for any $0 \le r \le t \le T$, P-a.s.
(4) For any $0 \le r \le s \le t$, for any $x \in \mathbb{R}^n$, the following identity is satisfied: $Y_{s,t}(\omega, x))$.
Finally, consider the equation, for any $0 \le r \le t \le T$, where B is a standard n-dimensional Brownian motion.
Definition 5.3. By a solution of (C), we mean a measurable map such that the following properties are satisfied: For any $0 \le r \le t \le T$, for any $x \in \mathbb{R}^n$, the processes $(\omega, r) \mapsto Z_{r,t}(\omega, x)$ and $(\omega, r) \mapsto Z^{-1}_{r,t}(\omega, x)$ belong to $\mathbb{L}_{p,1}$ for some p ≥ 2.
Since this proof needs several lemmas, we defer it to Section 6.
According to Theorem 4.8, Y satisfies (B) if and only if Z satisfies (C). The regularity properties are immediate since $L^p$ is stable under $\tau_T$.
The first part of the next result is then immediate.
Corollary 5.3. Assume that $\check V_T$ is an $E_0$-causal map continuous from $L^p$ into $I_{\alpha,p}$ for α > 0 and p ≥ 2 such that αp > 1. Then Equation (B) has one and only one solution and for any $0 \le r \le s \le t$, for any $x \in \mathbb{R}^n$, the following identity is satisfied: $Y_{s,t}(\omega, x))$.
Proof. According to Theorems 5.2 and 5.1, (B) has at most one solution since (C) has a unique solution. As to the existence, points (1) to (3) are immediately deduced from the corresponding properties of Z and Equation (15). According to Theorem 5.1, $(\omega, r) \mapsto Y_{r,s}(\omega, Y_{s,t}(\omega, x))$ belongs to $\mathbb{L}_{p,1}$, hence we can apply the substitution formula and we get: Then, in view of (17), R appears to be the unique solution of (B) and thus $R_{s,t}(\omega, x) = Y_{s,t}(\omega, x)$. Point (4) is thus proved.
We still denote by Y this continuous version.
Proof. W.l.o.g. assume that s ≤ s ′ and remark that Y s, s ′ (x) thus belongs to σ{B T u , u ≥ s}.
According to Theorem 6.2, In view of Theorem 4.8, the stochastic integral which appears in Equation (C) is also a Stratonovitch integral hence we can apply the substitution formula and say Thus we can apply Theorem 6.2 and we obtain The right hand side of this equation is in turn equal to E [|Z 0,s ′ − Z s ′ −s,s ′ (x)| p ] , thus, we get Combining (18) and (19) gives hence the result.
Thus, we have the main result of this paper. Proof. Under the hypothesis, we know that Equation (B) has a unique solution which satisfies (16). By definition of a solution of (B), the process Y −1 : (ω, s) → Y −1 st (ω, x) belongs to L p,1 hence we can apply the substitution formula. Following the lines of proof of the previous theorem, we see that Y −1 is a solution of (A).
In the reverse direction, two distinct solutions of (A) would give rise to two solutions of (B) by the same principles. Since this is impossible in view of Theorem 5.3, Equation (A) has at most one solution.
Since σ • ψ belongs to C([0, T ], R n ), according to Hypothesis I, we have: The proof is thus complete.
Following [38], we then have the following non-trivial result. Theorem 6.2. Assume that Hypothesis I holds and that σ is Lipschitz continuous. Then, there exists one and only one measurable map from Ω × [0, T ] × [0, T ] into G which satisfies the first two points of Definition (C). Moreover, Note that even if x and x ′ are replaced by σ{B T (u), t ≤ u} measurable random variables, the last estimate still holds.
Proof. Existence, uniqueness and homeomorphy of a solution of (C) follow from [38]. The regularity with respect to r and x is obtained as usual by the BDG inequality and Gronwall's lemma. For x or x ′ random, use the independence of σ{B T (u), t ≤ u} and σ{B T (u), r ∧ r ′ ≤ u ≤ t}. Corollary 6.3. Assume that Hypothesis I holds and that σ is Lipschitz continuous and sub-linear. Let Z be a solution of (C). Then, for any x ∈ R n , for any 0 ≤ r ≤ t ≤ T, we have .