Approximating explicitly the mean reverting CEV process

In this paper we further exploit the semi-discrete method introduced in Halidias and Stamatiou (2015). We are interested in the numerical solution of mean-reverting CEV processes, which appear in financial mathematics models and are described as non-negative solutions of certain stochastic differential equations with sub-linear diffusion coefficients of the form $(x_t)^q,$ where $\frac{1}{2}<q<1.$ Our goal is to construct explicit numerical schemes that preserve positivity. We prove convergence of the proposed SD scheme with a rate depending on the parameter $q.$ Furthermore, we verify our findings through numerical experiments and compare with other positivity-preserving schemes. Finally, we show how to treat the whole two-dimensional stochastic volatility model, with instantaneous variance process given by the above mean-reverting CEV process.


Consider the following stochastic volatility model (1.1), where $S_t$ represents the underlying financially observable variable, $V_t$ is the instantaneous volatility when $p = 1/2$ or the instantaneous variance when $p = 1$, and the Wiener processes $W_t, \widetilde{W}_t$ have correlation $\rho$.
We assume that $V_t$ is a mean-reverting CEV process of the above form, with coefficients $k_i > 0$ for $i = 1, 2, 3$ and $q > 1/2$, since the process $V_t$ has to be non-negative. More precisely, the above restriction on $q$ implies that $V_t$ is positive, i.e. $0$ is unattainable, as well as non-explosive, i.e. $\infty$ is unattainable. The steady-state level of $V_t$ is $k_1/k_2$ and the rate of mean reversion is $k_2$.
The system (1.1) for $p = q = 1/2$ is the Heston model. When $q = 1$ we get the Brennan-Schwartz model [3, Section II], which, despite its simple form, does not admit analytical expressions for $V_t$.
We aim for a positivity-preserving scheme for the process $V_t$. The explicit Euler scheme fails to preserve positivity, as does the standard Milstein scheme. We intend to apply the semi-discrete method to the numerical approximation of $V_t$ in model (1.1) and compare with the effort made in that direction by [13].
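The positivity failure of the explicit Euler scheme is easy to observe numerically. The sketch below applies Euler-Maruyama to the mean-reverting CEV dynamics $dV = (k_1 - k_2 V)\,dt + k_3 V^q\,dW$; the parameter values are illustrative only (not the paper's experimental set), and the power is guarded with `max(V, 0)` purely so that the loop can run up to the first negative step:

```python
import numpy as np

# Illustrative sketch: Euler-Maruyama for dV = (k1 - k2*V) dt + k3*V^q dW.
# The scheme can step below zero, where V^q is undefined for fractional q.
rng = np.random.default_rng(0)
k1, k2, k3, q = 0.5, 1.0, 2.0, 0.75   # hypothetical parameters
V, dt, n_steps = 0.05, 0.1, 1000
went_negative = False
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt))          # Gaussian increment, variance dt
    V = V + (k1 - k2 * V) * dt + k3 * max(V, 0.0) ** q * dW
    if V < 0.0:
        went_negative = True
        break
# For coarse steps and a large diffusion coefficient, a negative value
# occurs with overwhelming probability.
```

A positivity-preserving scheme such as the SD scheme avoids this failure mode by construction, rather than by ad hoc truncation of the state.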
Section 2 provides the setting and the main results, Theorems 2.1 and 2.2, concerning the $\mathcal{L}^2$-convergence of the proposed Semi-Discrete (SD) method to the true solution of mean-reverting CEV processes of the form of the stochastic volatility in (1.1). Section 3 is devoted to the proof of Theorem 2.1, while Section 4 concerns the proof of Theorem 2.2. In Section 5 we present an alternative approach inspired by [10] and we state and prove Theorem 5.1, which concerns the analogues of Theorems 2.1 and 2.2. The main ingredient of this approach is a simplification of the proposed numerical scheme, obtained by altering the initial Brownian motion $(W_t)$ to another Brownian motion $(\widehat{W}_t)$, justified by Lévy's martingale characterization of Brownian motion; this yields the same logarithmic rate as in Theorem 2.1, but a better polynomial rate $\frac{1}{2}(q - \frac{1}{2})$ instead of $\frac{1}{2}(q - \frac{1}{2}) \wedge \frac{1}{8}$, as shown in Theorem 2.2. Finally, Section 6 presents illustrative figures where the behavior of the proposed scheme, regarding the order of convergence, is shown.
Theorem 2.1 (Logarithmic rate of convergence). Let Assumption A hold. The semi-discrete scheme (2.7) converges to the true solution of (2.1) in the mean-square sense with the rate given below, where $C$ is independent of $\Delta$ and $\epsilon$ is such that $0 < \epsilon < \frac{1}{4} \wedge (q - \frac{1}{2})$.
Assumption B. Let Assumption A hold, where now $x_0 \in \mathbb{R}$ with $x_0 > 0$.
In the following sections we write, for simplicity, $y^{SD}_t$ or $y_t$ for $y^{SD}_t(q)$.
We rewrite (2.4) in a compact form for $t \in (t_n, t_{n+1}]$, where $\hat{s} := t_j$ for $s \in (t_j, t_{j+1}]$, $j = 0, \ldots, n$, and $\check{s} := t_{j+1}$ for $s \in [t_j, t_{j+1}]$, $j = 0, \ldots, n-1$, while $\check{s} := t$ for $s \in [t_n, t]$. We first observe that $(y_t)$ is bounded a.s., where the lower bound comes from the construction of $(y_t)$ and the upper bound follows from a comparison theorem. We will bound $(u_t)$ and therefore $(y_t)$, since $0 \le y_t \le u_t$ a.s. Set the stopping time $\tau_R$; in the second step below we have used that $0 \le y_t \le u_t$, in the third step the inequality $x^{p-1}y \le \epsilon \frac{p-1}{p} x^p + \frac{1}{p \epsilon^{p-1}} y^p$, valid for $x \wedge y \ge 0$ and $p > 1$, with $\epsilon = \frac{1}{2}$, and in the final step the fact that $\frac{1}{2} < q < 1$, where $M_t := p k_3 \int_0^t \sqrt{y_s}\, dW_s$. Taking expectations in the above inequality and using that $M_t$ is a local martingale vanishing at $0$, we apply the Gronwall inequality ([7, Relation 7]). Taking expectations in the resulting inequality and using the estimated upper bound for $\mathbb{E}(u_{t \wedge \tau_R})^p$, and then letting $R \to \infty$, we proceed as follows. Fix $t$. The sequence of stopping times $\tau_R$ is increasing in $R$ and $t \wedge \tau_R \to t$ as $R \to \infty$; thus the sequence $(y_t)^p \mathbb{I}_{\{t < \tau_R\}}$ is nondecreasing in $R$ and $(y_t)^p \mathbb{I}_{\{t < \tau_R\}} \to (y_t)^p$ as $R \to \infty$, so the monotone convergence theorem implies the bound for any $p > 2$. Using again Itô's formula on $(u_t)^p$, taking the supremum and then applying Doob's martingale inequality to the diffusion term, we bound $\mathbb{E}\sup_{0 \le t \le T}(u_t)^p$ and thus $\mathbb{E}\sup_{0 \le t \le T}(y_t)^p$.
where we have used the Cauchy-Schwarz inequality. Taking expectations in the above inequality and using Lemma 3.1 and Doob's martingale inequality on the diffusion term, we conclude with a bound in which the positive quantity $\hat{A}_p$, besides $p$, depends also on the parameters $k_1, k_2, k_3, \theta, q$, but not on $\Delta$. Now, for $0 < p < 2$, we use Jensen's inequality for the concave function $\phi(x) = x^{p/2}$. Following the same lines we can show the analogous bound for any $p > 0$, where the positive quantity $A_p$, besides $p$, depends also on the parameters $k_1, k_2, k_3, \theta, q$, but not on $\Delta$.
For the rest of this section we rewrite the compact form of (2.4) in the following way, where $f_\theta(\cdot, \cdot)$ is given by (2.2) and the auxiliary process $(h_t)$ is close to $(y_t)$, as shown in the next result.
and for $s \in [t_n, t_{n+1}]$ we have, for any $p > 0$, bounds in which the positive quantities $C_p, \hat{C}_p, \tilde{C}_p, C_h$ do not depend on $\Delta$.
Proof of Lemma 3.3. We have, for any $p > 0$, a bound in which we have used (3.4); using Lemma 3.1 we get the left part of (3.5). Now, for $p > 2$, noting the intermediate estimate, we get the right part of (3.5), where we have used Lemma 3.1. The case $0 < p < 2$ follows by Jensen's inequality, as in Lemma 3.2. Furthermore, for $s \in [t_n, t_{n+1}]$ and $p > 2$, (3.2) holds, and in the same manner one obtains the remaining bound. The case $0 < p < 2$ again follows by Jensen's inequality.

3.2. Convergence of the auxiliary process $(h_t)$ to $(x_t)$ in $\mathcal{L}^1$. We first estimate the probability of $z_t$ being negative when at the same time $y_{t_n} > \Delta^{1-2\xi}$, for $0 < \xi < \frac{1}{2}$.
where $C_{k_2,k_3,\theta,\Delta}$ is defined below. Proof of Lemma 3.4. By the definition (2.5) of $(z_t)$, for $t \in [t_n, t_{n+1}]$ and for $0 < \xi < \frac{1}{2}$, we obtain the decomposition (3.8). The following inclusion relations hold for the event $A_1$, for every standard normal random variable $G$, where in the last step we have used [14, Ineq. (9.20), p.112], valid for $\beta > 0$. Using the fact that $\frac{\Delta W_n}{\sqrt{t - t_n}}$ is a standard normal random variable and ignoring the exponential term in (3.9), since its exponent is negative, we get (3.10). Analogous inclusion relations hold for the event $A_2$, and using again (3.9) we get (3.11). Taking probabilities in the inclusion relation (3.8) and using (3.10) and (3.11), we arrive at the claimed bound, which justifies the $O(\sqrt{\Delta})$ notation (see for example [18]).
We will use the representation (3.4) and write Proposition 3.5. Let Assumption A hold. Then for any $m > 1$ the bound (3.13) holds, where $e_m = e^{-m(m+1)/2}$. Proof of Proposition 3.5. Consider the non-increasing sequence $\{e_m\}_{m \in \mathbb{N}}$ with $e_m = e^{-m(m+1)/2}$ and $e_0 = 1$. We introduce the following sequence of smooth approximations of $|x|$ (method of Yamada and Watanabe, [19]), where the existence of a continuous function $\psi_m(u)$ with $0 \le \psi_m(u) \le 2/(mu)$ and support in $(e_m, e_{m-1})$ is justified by $\int_{e_m}^{e_{m-1}} \frac{du}{u} = m$; consequently $|\phi_m''(x)| \le \frac{2}{m|x|}$ when $e_m < |x| < e_{m-1}$ and $|\phi_m''(x)| = 0$ otherwise.
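For completeness, the standard Yamada-Watanabe construction behind $\phi_m$ can be sketched as follows; this is the well-known recipe, written here with the normalization matching $e_m = e^{-m(m+1)/2}$, and should be checked against the paper's display:

```latex
% Since \int_{e_m}^{e_{m-1}} \frac{du}{u} = \ln\frac{e_{m-1}}{e_m} = m,
% one may pick a continuous \psi_m \ge 0, supported in (e_m, e_{m-1}),
% with \psi_m(u) \le \frac{2}{mu} and \int_{e_m}^{e_{m-1}} \psi_m(u)\,du = 1.
\phi_m(x) := \int_0^{|x|} \int_0^{y} \psi_m(u)\,du\,dy,
\qquad
|x| - e_{m-1} \le \phi_m(x) \le |x|, \qquad |\phi_m'(x)| \le 1,
\qquad
\phi_m''(x) = \psi_m(|x|) \le \frac{2}{m|x|}\,
\mathbb{1}_{(e_m, e_{m-1})}(|x|).
```

The key point is that $\phi_m$ is a $C^2$ approximation of $|x|$ whose second derivative is supported away from the origin, which is what allows Itô's formula to be applied to the error process.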
We have that, where we have used properties of Hölder continuous functions, namely the fact that $x^q$ is $q$-Hölder continuous for $q \le 1$, i.e. $|x^q - y^q| \le |x - y|^q$, and that $\sqrt{x}$ is $1/2$-Hölder continuous; in the second step we have used (3.15) and (3.16) and the properties of $\phi_m$. Taking expectations in the above inequality yields a bound where we have used Lemma 3.3 in the second step and the Hölder inequality and Lemmata 3.1 and 3.2 in the third step, together with a second bound where we have used the Hölder inequality and Lemma 3.4 in the third step and Lemmata 3.2 and 3.1 in the final step. For $\xi = \frac{1}{2} - \frac{1}{16q}$ we get the estimate (3.17). Thus (3.14) becomes the claimed bound, where in the second step we have used the asymptotic relations, for any $\kappa \le 1$ as $m \to \infty$, and in the last step the Gronwall inequality; $J_3$ is as defined in Proposition 3.5. Taking the supremum over all $0 \le t \le T$ gives (3.13).

3.3. Convergence of the auxiliary process $(h_t)$ to $(x_t)$ in $\mathcal{L}^2$.
Proposition 3.6. Let Assumption A hold. Then (3.18) holds, where $C_\epsilon$ is independent of $\Delta$. Proof of Proposition 3.6. We estimate the difference $|E_t|^2 := |h_t - x_t|^2$; in the second step we have used the Cauchy-Schwarz inequality and (3.15). Taking the supremum over all $t \in [0, T]$ and then expectations we get (3.19), where in the second step we have used Lemma 3.2 and Doob's martingale inequality with $p = 2$, since $M_t$ is an $\mathbb{R}$-valued martingale that belongs to $\mathcal{L}^2$. Next we use (3.16), and then again the estimate (3.17), together with the asymptotic relations $\Delta^l = o(\Delta^{q - \frac{1}{2}})$ for all $l \ge \frac{1}{2}$ as $\Delta \downarrow 0$, where $J_6 := 3 k_3^2 T A_2\, \mathbb{E}(x_0 + k_1 T)^2 \hat{A}_{4q-2}$ and $J_2$ is as given in the statement of Proposition 3.5 and depends on $\Delta$ through $C_{k_2,k_3,\theta,\Delta}$, where, as already stated, $C_{k_2,k_3,\theta,\Delta} \to k_3$ as $\Delta \downarrow 0$.
Relation (3.19) becomes the desired estimate, where we have used Proposition 3.5 in the second step, with the sequence $e_m$ as defined there, the Gronwall inequality in the last step, and the asymptotic relation $\Delta^\kappa = o(\frac{\Delta^\kappa}{m e_m})$ as $m \to \infty$, for any $\kappa > 0$; $J_5$ is independent of $\Delta$ and given by $J_5 := 6 T^2 k_2^2 ((1-\theta)^2 \hat{A}_2 + \theta^2 A_2)$. We take $m = \sqrt{\ln \Delta^{-\lambda}}$, with $\lambda > 0$ to be specified shortly, and note that $e_m \le e^{-m^2/2} = \Delta^{\lambda/2}$. Now, since $q > \frac{1}{2}$, there is an $\epsilon > 0$ small enough such that $q - \frac{1}{2} - \epsilon > 0$. We take $\lambda = \frac{2\epsilon}{3}$ and conclude the claimed behavior as $\Delta \to 0$, which in turn implies the asymptotic relation $\frac{\Delta^{q - \frac{1}{2}}}{m e_m} = o(\frac{1}{m})$ as $m \to \infty$, with the logarithmic rate stated before. In the same way we can show the remaining estimate, by taking $0 < \epsilon < \frac{1}{4} \wedge (q - \frac{1}{2})$, which implies (3.18). 3.4. Proof of Theorem 2.1. In order to finish the proof of Theorem 2.1 we just use the triangle inequality, Lemma 3.3 and Proposition 3.6, where $C = C(k_2, k_3, \epsilon, T)$ is given in the statement of Theorem 2.1.

4. Polynomial rate of convergence.
We work with the stochastic time change inspired by [2]. We define the process $\gamma(t)$, which is well defined since $x_s > 0$ a.s. and $y_t \ge 0$ (see Section 2).
The difference $|E_t|^2 := |h_t - x_t|^2$ is estimated as in Section 3 and we get, as in (3.19), the bound (4.1), where $\tau$ is a stopping time and $J_5$, independent of $\Delta$, is as in the proof of Proposition 3.6. The main difference here is the estimation of the last term in (4.1). The approach of Section 3 resulted in the $\mathcal{L}^1$ estimation of $\mathbb{E}|E_t|$, where we used the Yamada-Watanabe approach. Now we use the Berkaoui approach. Relation (3.16) becomes a bound in which we have used the inequality valid for all $a \ge 0$, $b \ge 0$ and $\frac{1}{2} \le q \le 1$. Consequently, we get the upper bound where we used the estimate (3.17) and the Hölder inequality; $J_2$ is as in the statement of Proposition 3.5 and $J_6$, independent of $\Delta$, is as in the proof of Proposition 3.6. Relation (4.1) becomes the stated bound, where we have used Lemma 3.3 in the second step. At this point we want to estimate the inverse moments of $(x_t)$, and to do so we consider the transformation $v = x^{2-2q}$ and apply Itô's formula. Denote the drift coefficient of the process $(v_t)$ by $a(v_t)$ and consider the function of $\lambda \ge 0$ given below. Some elementary calculations show that this function attains its minimum at $v^* := k_1(2q-1)$. Consider the process $(\zeta_t(\lambda))$ defined through (4.5) for $t \in [0, T]$ with $\zeta_0(\lambda) = v_0$. Process (4.5) is a square-root diffusion process and, when (4.6) holds, it is a CIR process which remains positive if $\zeta_0(\lambda) > 0$. By a comparison theorem ([14, Proposition 5.2.18]) it holds that $v_t \ge \zeta_t(\lambda) > 0$ a.s., i.e. $(v_t)^{-1} \le (\zeta_t(\lambda))^{-1}$ a.s., or equivalently $(x_t)^{2q-2} \le (\zeta_t(\lambda))^{-1}$ a.s. The inverse moment bounds of $(\zeta_t(\lambda))$ follow from [5], by choosing $\lambda$ big enough, and in particular such that (4.6) holds strictly. Therefore, in the last step we have used Gronwall's inequality. Using again relation (4.7) for $\tau = T$ and under the change of variables $u = 6 T k_2 s + \gamma_s$ we get (4.9), where in the last steps we have used (4.8). We proceed by showing that $u \mapsto \mathbb{P}(\gamma_T \ge u) e^u \in L^1(\mathbb{R}_+)$. By the Markov inequality it holds, for any $\epsilon > 0$, that the following bound holds, where $-1 < 2q - 2 < 0$.
It remains to bound the exponential inverse moments of $(x_t)$, defined through the stochastic integral equation (2.1). Exponential inverse moments for the CIR process are known ([12, Theorem 3.1]), where the positive constant $C_{HK}$, explicitly given in [12, Relation (10)], depends on the parameters $k_2, k_3, T, q$ but is independent of $\zeta_0$. Thus the other condition that we require for the parameter $\lambda$ is (4.12). When (4.12) is satisfied then (4.6) is satisfied too; thus there is actually no restriction on the coefficient $\delta$ in (4.11), since we can always choose $\lambda$ appropriately so that (4.12) holds. Relation (4.10) becomes (4.13). We therefore require (4.14), and we can always find an $\epsilon > 1$ such that the above relation holds, by choosing $\lambda$ appropriately as discussed before. Relation (4.13) then yields the claimed estimate, where $\lambda$ is chosen such that (4.14) holds with $\epsilon > 1$. We conclude with the constant as given in the statement of Theorem 2.2.

5. Alternative approach improving the rate of convergence.
Inspired by [10], we remove the term $\mathrm{sgn}(z_s)$ from (2.4) by considering the process $(\widehat{W}_t)$, which is a martingale with quadratic variation $\langle \widehat{W} \rangle_t = t$ and thus a standard Brownian motion w.r.t. its own filtration, by Lévy's theorem [14, Theorem 3.16, p.157]. Therefore, the compact form of (2.4) becomes (5.1) for $t \in (t_n, t_{n+1}]$. Consider also the process $(\widehat{x}_t)$ of (5.2). The process $(x_t)$ of (2.1) and the process $(\widehat{x}_t)$ of (5.2) have the same distribution. We show in the following that $\mathbb{E}\sup_{0 \le t \le T} |y^{SD}_t(q) - \widehat{x}_t|^2 \to 0$ as $\Delta \downarrow 0$; thus the same holds for the unique solution of (2.1), i.e. $\mathbb{E}\sup_{0 \le t \le T} |y^{SD}_t(q) - x_t|^2 \to 0$ as $\Delta \downarrow 0$. To simplify notation we write $\widehat{W}$, $(\widehat{x}_t)$ as $W$, $(x_t)$. We end up with the following explicit scheme (5.3), where $y_n$ is as in (2.6).

Theorem 5.1 (Logarithmic and polynomial rates of convergence). Let Assumption A hold.
The semi-discrete scheme (5.3) converges to the true solution of (2.1) in the mean-square sense with the rate given below, where $C$ is independent of $\Delta$ and $\epsilon$ is such that $0 < \epsilon < (q - \frac{1}{2})$. In case Assumption B holds, the semi-discrete scheme (5.3) converges to the true solution of (2.1) in the mean-square sense with the polynomial rate given below, where $C_{HK}$ is the constant described in (4.11), $\lambda$ is an appropriately chosen positive parameter which satisfies (4.12) and always exists, $\nu(\lambda) := \lambda\, 2(1-q)^2 (k_3)^2 - 1$, the quantity $C_{k_2,k_3,\theta,\Delta}$ is given in Lemma 3.4, and $\epsilon > 1$.
Proof of Theorem 5.1. We discuss the proof and highlight the differences, since one can otherwise follow the proofs of Theorems 2.1 and 2.2. First note that Lemmata 3.1, 3.2 and 3.3 still hold, i.e. the moment bounds and error bounds of $(y^{SD}_t)$, as well as the moment bounds involving the auxiliary process $(h_t)$, remain true. The error $E_t := h_t - x_t$ now reads as in (5.6). As regards the first part of Theorem 5.1, we now have, for any $m > 1$, the bound (5.7), where $e_m = e^{-m(m+1)/2}$ and $J_3$ is as stated in Proposition 3.5. The bound (5.7) follows along the same lines as the proof of Proposition 3.5, where now (3.16) becomes (5.8) and the term (3.17) disappears. Moreover, we have the bound (5.9), where $C_\epsilon$ is independent of $\Delta$ and given by $C_\epsilon := 32\, 3^{2\epsilon} (k_3)^4 T^2 e^{6T^2 k_2 + k_2 T}$, with $\epsilon$ such that $0 < \epsilon < (q - \frac{1}{2})$. The bound (5.9) follows along the same lines as the proof of Proposition 3.6, where now $M_t := \int_0^t (g(y_{\hat{u}}, y_u) - g(x_u, x_u))\, dW_u$ and where now we take $0 < \epsilon < (q - \frac{1}{2})$. As regards the second part of Theorem 5.1, we follow Section 4, where now we use the modified process to get the corresponding upper bound. Finally, in order to bound $\mathbb{E}\, e^{128 \epsilon (k_3)^2 q^2 \int_0^T (x_s)^{2q-2}\, ds}$, we require that (5.10) $128 (k_3)^2 q^2 \epsilon \le \frac{(\nu(\lambda))^2 K_3^2}{8}$ holds, and we can always find an $\epsilon > 1$ such that (5.10) holds, yielding the rate with the constant as given in the statement of Theorem 5.1.

6. Numerical Experiments.
We discretize with a number of steps that is a power of $2$. The SD scheme reads as (6.1) for $n = 0, \ldots, N-1$, where $\Delta W_n := W_{t_{n+1}} - W_{t_n}$ are the increments of the Brownian motion, i.e. Gaussian random variables with variance $\Delta$. The ALF (Alfonsi) scheme [1, Section 3] is an implicit scheme which requires solving the nonlinear equation (6.2) and then computing $y^{ALF}_{t_{n+1}} = (Y_{n+1})^{\frac{1}{1-q}}$. The estimation of $Y_{n+1}$ in (6.2) can be done, for example, with Newton's method, but requires a small enough $\Delta$. We also consider a scheme recently proposed in [9], again using the SD method, but in a different way.
for $n = 0, \ldots, N-1$. Note the similarity between the expressions of (6.3) and the SD scheme (6.1) proposed here. This is not surprising, because they both rely on the same way of splitting the drift coefficient. In particular, in the explicit HAL scheme the following process is considered: $y^{HAL}_t(q) = y_{t_n} + f_1(y_{t_n}) \Delta + \int_{t_n}^t f_2(y_s)\, ds + \int_{t_n}^t \mathrm{sgn}(z_s) g(y_s)\, dW_s$, for $t \in (t_n, t_{n+1}]$ with $y_0 = x_0$ a.s., where now $f_1, f_2$ are given by (6.4) and (6.5) and $g(x) = k_3 x^q$. Compare with (2.2) and (2.3) for $\theta = 0$. We write (6.4) again in the form (6.6), involving $\int \mathrm{sgn}(z_s)(y_s)^q\, dW_s$, and the process (6.6) is well defined when (6.7) holds. Compare with (2.4) for $\theta = 0$. Solving for $y_t$, we end up with $y^{HAL}_t(q) = |z_t|^{\frac{1}{1-q}}$. The main result in [9] holds when (6.7) does, implying a rate of convergence at least $q(q - \frac{1}{2})$, which is bigger than the rate of convergence of the SD scheme proposed here, which is at least $\frac{1}{2}(q - \frac{1}{2}) \wedge \frac{1}{8}$. We aim to show experimentally the order of convergence of all available methods for the estimation of the true solution of the CEV model (2.1), i.e. the semi-discrete SD method (6.1) and the HAL scheme (6.3), as well as the implicit ALF scheme (6.2). We would also like to reveal the dependence of the order of the semi-discrete methods on $q$. We take the level of implicitness of the SD method (6.1) to be $\theta = 0$ in one case and $\theta = 1$ in the other, i.e. we consider the fully explicit and the fully implicit scheme.
We work as in [11, Section 5], i.e. we estimate the endpoint error $\epsilon = \mathbb{E}|y_T - x_T|$, where $x_T$ is the exact solution of (2.1) and $y_T$ is the numerical approximation, by computing $M$ batches of $L$ simulation paths, where each batch error is estimated by $\hat{\epsilon}_j$. As a reference solution we take, in the experiments, the value of each method at $T$ computed with $\Delta = 2^{-14}$. For the SD scheme we have shown in Theorem 2.2 that it strongly converges to the exact solution. We plot on a $\log_2$-$\log_2$ scale, and error bars represent $90\%$ confidence intervals. The results are shown in Tables 1 and 2 for the cases $q = 0.6$ and $q = 0.8$. We also present Figure 1 for the case $q = 0.6$.
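The order reported in the tables is the slope of a least-squares line fitted to the points $(\log_2 \Delta, \log_2 \epsilon)$. A minimal sketch of that fitting step follows; the error values here are synthetic ($\epsilon = C \Delta^{1/2}$), purely to illustrate the regression, and do not come from the paper's experiments:

```python
import numpy as np

# Fit a line to (log2 Delta, log2 error); the slope is the empirical
# strong order of convergence, as reported in Tables 4 and 7.
deltas = np.array([2.0 ** (-k) for k in range(5, 14, 2)])
errors = 0.8 * deltas ** 0.5            # synthetic: order 0.5 by construction
slope, intercept = np.polyfit(np.log2(deltas), np.log2(errors), 1)
# slope recovers the exponent 0.5 for these synthetic errors
```

With real data, the residuals of this fit (and the confidence intervals on the batch errors) indicate how trustworthy the estimated order is over the chosen range of $\Delta$'s.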
In Table 3 we present the relation between the error and the computational time of the implicit SD, HAL and ALF schemes for the same problem with $(x_0, k_1, k_2, k_3, T) = (1, 2, 2, 1, 1)$ and $q = 0.7$.
Finally, Table 4 presents the estimated orders of convergence for the SD, HAL and ALF methods, produced by linear regression with a least-squares fit. Figure 1. Convergence of fully implicit SD, HAL and ALF schemes applied to SDE (2.1) with parameters $(x_0, k_1, k_2, k_3, q, T) = (1, 2, 2, 1, 0.6, 1)$, with 17 digits of accuracy.
An advantage of the SD method here is that, although implicit, it has an explicit formula and thus requires fewer arithmetic operations and consequently less computational time.
The following points of discussion are worth mentioning.
• All methods are quite close, as shown in Table 1 and Figure 1 for the case $q = 0.6$ and in Table 2 for the case $q = 0.8$. In particular, Table 5 shows how close they are w.r.t. the $\mathcal{L}^1$-norm. We see that the HAL scheme is closer to the implicit ALF scheme than the SD scheme is. Nevertheless, Table 5 also shows that in order to get an accuracy of at least 2 decimal digits, which in practice may be adequate, considering that we may for example want to evaluate an option priced in euros, there is no actual harm in choosing any of the above methods. We may then choose the fastest one, as discussed later on.
• We see that the strong order of convergence of the implicit SD scheme for problem (2.1) is at least $\frac{1}{2}(q - \frac{1}{2}) = 0.05$, as shown theoretically and presented in Table 4. We also see that all methods converge with similar orders, and the theoretical rate $1$ of the ALF method [1] does not hold for these $\Delta$'s. Thus, again, the rate in practical situations does not necessarily matter, if one has to compute with very small $\Delta$'s to achieve it. Moreover, we present in Tables 6 and 7 the performance of the explicit SD method and see that it is very close to the implicit one, which is of course natural.

Step ∆     90%-SD-Error(θ = 0), q = 0.6       90%-SD-Error(θ = 0), q = 0.8
2^−5       0.3859359379 ± 1.314 · 10^−2       0.3809479779 ± 1.472 · 10^−2
2^−7       0.3853406566 ± 1.316 · 10^−2       0.3720838208 ± 1.396 · 10^−2
2^−9       0.3810129323 ± 1.318 · 10^−2       0.3825985516 ± 1.194 · 10^−2
2^−11      0.3624898696 ± 1.263 · 10^−2       0.35001844420 ± 1.334 · 10^−2
2^−13      0.2732191558 ± 1.05 · 10^−2        0.2745123158 ± 1.138 · 10^−2
Table 6. The performance of the fully explicit SD scheme (6.1) applied to SDE (2.1) with parameter set $(x_0, k_1, k_2, k_3, T) = (1, 2, 2, 1, 1)$ for $q = 0.6$ and $q = 0.8$.
Table 7. Order of convergence of SD with $\theta = 0$ approximation of (2.1) with $(x_0, k_1, k_2, k_3, T) = (1, 2, 2, 1, 1)$, for different values of $q$.
• In practice, the computational time needed to provide a desired level of accuracy is of great importance. As mentioned before, the SD method as well as the HAL method perform well in that respect, compared to the implicit ALF method, which requires the estimation of a root of a nonlinear equation in each step and is therefore time consuming. This is shown in Table 3, which demonstrates the advantage of the semi-discrete methods SD and HAL. Moreover, the explicit SD scheme performs even better in that respect, as shown in Table 8.

q = 0.7
Step ∆     Time/Path (in sec) of Fully Explicit SD (θ = 0)
2^−5       7.547
2^−7       7.516
2^−9       7.672
2^−11      8.375
2^−13      10.047
Table 8. Error and computational time for a path, in seconds, for the fully explicit SD method with $q = 0.7$.
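The per-step cost gap comes from the root-finding that the implicit ALF scheme performs at every step. A minimal sketch of such a Newton iteration follows; the concrete equation (6.2) is not reproduced above, so the function $F$ below is a hypothetical stand-in with the same flavour (an implicit drift term in $Y$), not Alfonsi's actual equation:

```python
# Scalar Newton iteration for an implicit step F(Y) = 0, started from y0.
def newton(F, dF, y0, tol=1e-12, max_iter=50):
    y = y0
    for _ in range(max_iter):
        step = F(y) / dF(y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("no convergence; a smaller Delta usually helps")

# Illustrative implicit relation (hypothetical stand-in for (6.2)):
# solve Y = y_n + Delta * (a - b * Y**2) for Y = Y_{n+1}.
a, b, Delta, y_n = 2.0, 1.0, 0.01, 1.0
Y = newton(lambda Y: Y - y_n - Delta * (a - b * Y ** 2),
           lambda Y: 1.0 + 2.0 * Delta * b * Y,
           y0=y_n)
```

Each implicit step thus costs several function evaluations, whereas the explicit SD update is a single closed-form expression, which is consistent with the timings in Tables 3 and 8.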

7. Comments and Future Work.
• Instead of dealing directly with (2.1), we may consider the process $(v_t) = (x_t)^{2-2q}$, which satisfies (7.1) for $t \in [0, T]$, where the $K_j$'s, $j = 0, \ldots, 3$, are as in (4.3). We may first approximate the process $(v_t)$ with the semi-discrete method, say by $y^{SD}_t$, and then transform back to the original SDE (2.1) using $x_t = (v_t)^{\frac{1}{2-2q}}$. In particular, the mean value theorem implies a pointwise bound; taking squares in this inequality and then expectations yields the corresponding $\mathcal{L}^2$ estimate. The "annoying" part of the square-root diffusion process $(v_t)$ described by (7.1) is the fact that $(v_t)$ is raised to the power $\frac{1-2q}{2-2q}$, which is negative.
• We can also work with the case where (2.1) has time-varying coefficients, i.e. $k_1(t), k_2(t), k_3(t)$.
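For the record, a sketch of the Itô computation behind (7.1): with $dx_t = (k_1 - k_2 x_t)\,dt + k_3 (x_t)^q\,dW_t$ and $v_t = (x_t)^{2-2q}$, using $x^{1-2q} = v^{\frac{1-2q}{2-2q}}$ and $x^{1-q} = \sqrt{v}$, one obtains the following; the bracketed constants are the natural candidates for the $K_j$'s in (4.3) and should be checked against the paper's definitions:

```latex
% Ito's formula with f(x) = x^{2-2q}:
%   f'(x) = (2-2q)x^{1-2q}, \quad f''(x) = (2-2q)(1-2q)x^{-2q},
% so the Ito correction term is \tfrac12 f''(x)\,k_3^2 x^{2q}
%   = (1-q)(1-2q)k_3^2.
dv_t = \Big[\underbrace{(1-q)(1-2q)k_3^2}_{K_0}
       + \underbrace{2(1-q)k_1}_{K_1}\, v_t^{\frac{1-2q}{2-2q}}
       - \underbrace{2(1-q)k_2}_{K_2}\, v_t \Big]\, dt
       + \underbrace{2(1-q)k_3}_{K_3}\, \sqrt{v_t}\, dW_t .
```

Note that the diffusion is of square-root type, consistent with the comparison to a CIR-type process in Section 4, and the exponent $\frac{1-2q}{2-2q}$ is negative for $q > \frac{1}{2}$, which is exactly the "annoying" term mentioned above.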