Terminal-Dependent Statistical Inference for the Integral Form of FBSDE

Qi Zhang (1,2) (ORCID: http://orcid.org/0000-0001-9687-1043)

(1) School of Mathematics, Shandong University, Jinan 250100, China
(2) College of Mathematics, Qingdao University, Qingdao 266071, China

Research Article. Discrete Dynamics in Nature and Society (ISSN 1607-887X, 1026-0226), Hindawi Publishing Corporation, Volume 2013, Article ID 753025. DOI: 10.1155/2013/753025

Academic Editor: Vasile Dragan

Received 14 June 2013; Revised 4 September 2013; Accepted 22 September 2013; Published 2 November 2013

Copyright © 2013 Qi Zhang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The Backward Stochastic Differential Equation (BSDE) has been well studied and widely applied. Its main difference from the Ordinary Stochastic Differential Equation (OSDE) is that the BSDE is designed to depend on a terminal condition, which is a key factor in some financial and ecological settings. However, to the best of my knowledge, terminal-dependent statistical inference for such a model has not been explored in the existing literature. This paper is concerned with statistical inference for the integral form of the Forward-Backward Stochastic Differential Equation (FBSDE). I use the integral form rather than the differential form because the resulting inference procedure inherits the terminal-dependent characteristic. The FBSDE is first rewritten as a regression model, and then a semiparametric estimation procedure is proposed. Because of the integral form, the new regression model is more complex than the classical one, and the inference methods therefore differ somewhat from those designed for the OSDE. Even so, the statistical properties of the new method are similar to the classical ones. Simulations are conducted to demonstrate the finite-sample behavior of the proposed estimators.

1. Introduction

The Backward Stochastic Differential Equation (BSDE) was first presented by Bismut [1] for the linear case and by Pardoux and Peng [2] for the general case. The solution of a BSDE consists of a pair of adapted processes $(Y_t, Z_t)$ satisfying
$$-dY_t = g(t, Y_t, Z_t)\,dt - Z_t\,dB_t; \quad Y_T = \xi, \tag{1}$$
where $g$ is the generator, $B_t$ is the standard Brownian motion, and $\xi$ is the terminal condition. Usually the terminal condition is taken to be a random variable with a given distribution. If $g$ meets certain conditions, the BSDE has a unique solution. The integral form of the BSDE can be expressed as
$$Y_t = \xi + \int_t^T g(s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dB_s. \tag{2}$$

The history of the BSDE is relatively short, but the field has progressed rapidly. Beyond its mathematical interest, its extensive applications have attracted growing attention; see, for example, Peng [3], Pardoux and Peng [4], Pardoux and Tang [5], Peng and Wu [6], Ma and Yong [7], and Nualart and Schoutens [8]. Duffie and Epstein [9] used the BSDE to describe consumer preferences under an uncertain economic environment (i.e., the stochastic differential utility). El Karoui and Quenez [10] stated that, in financial markets, the prices of many important derivative securities can be obtained by solving a certain BSDE. Lin et al. [11] used an extended statistical model to describe an ecological problem. Furthermore, the BSDE is closely related to nonlinear partial differential equations and, more generally, is inseparable from nonlinear semigroups and stochastic control problems. This type of equation also appears frequently in mathematical finance, as pointed out by Quenez [12]. Recently, Delong [13] surveyed the latest advances in BSDEs (including FBSDEs) and applied BSDEs with jumps to insurance and finance.

Within a complete market, the backward equation characterizes the dynamic value of a replicating portfolio $Y_t$ with final wealth $\xi$, together with a quantity $Z_t$ that depends on the hedging portfolio. In particular, when the randomness of $(g, \xi)$ in the BSDE comes from the state of a forward equation, the model becomes a Forward-Backward Stochastic Differential Equation (FBSDE), which can be expressed as
$$dY_t = -g(X_t, Y_t, Z_t, t)\,dt + Z_t\,dB_t; \quad Y_T = \xi, \tag{3}$$
with $X_t$ satisfying
$$dX_s = \mu(s, X_s)\,ds + \sigma(s, X_s)\,dB_s. \tag{4}$$
Compared to the Ordinary Stochastic Differential Equation (OSDE), which contains an initial condition, the solution of the FBSDE is affected by the terminal condition $Y_T = \xi(X_T)$. As is well known, there exist a number of parametric and nonparametric methods for estimation and testing in the OSDE. However, these methods cannot be directly applied to the BSDE and FBSDE because the two models involve a terminal condition.

For the FBSDE defined above, statistical inference was first investigated by Su and Lin [14] and Chen and Lin [15], and a related model was proposed by Lin et al. [11]. However, they did not take the terminal condition into account in the inference procedure. In the differential form of the FBSDE given above, the terminal condition is an additional constraint that is not nested in the equation, so it is essentially difficult to use the terminal condition to refine the inference procedure. As a result, their methods do not cover the full problem posed by the FBSDE.

The FBSDE can also be written in integral form:
$$Y_t = \xi + \int_t^T g(s, X_s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dB_s, \qquad dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dB_t; \quad X_0 = x. \tag{5}$$
In this paper I focus only on the integral form because it contains the terminal condition as an additive term of the equation. With such a construction, a terminal-dependent inference can be built. I am concerned with the semiparametric estimation of the FBSDE. Note that $Z_t$ is usually unobservable and $g$ cannot be completely specified in the financial market. The problems of interest are therefore to construct proper estimators of both the generator $g$ and the process $Z_t$ based on the observed data $(X_t, Y_t)$ and the terminal condition $\xi$. As an initial investigation, this paper only considers a generator with parametric structure; that is, $g$ can be written in the form $g = g(\theta, t, Y_t, Z_t)$, where $\theta$ is an unknown parameter vector. Even so, such a simplified form is widely used in financial markets, and the proposed methods can be extended to more complicated forms.

It is worth mentioning that the key point of the method is the use of the integral equation rather than the differential equation. This change distinguishes the present work from the existing literature. Unlike the forward equation, the integral causes a cumulative error that is not negligible; nevertheless, the resulting estimator is still asymptotically unbiased under the mixing-dependence condition imposed on $X_t$. Another difference from the ordinary model is that the generator contains the unobservable process $Z_t$, so it is necessary to estimate $Z_t$ first. After plugging the estimator of $Z_t$ into the generator, I can infer the generator $g$ with the newly proposed methods.

The paper is organized as follows. In Section 2, the FBSDE is first rewritten as a special regression model and, based on this representation, the estimation procedure for the FBSDE with a linear generator is designed. The asymptotic properties are discussed in Section 3. In Section 4, a supplement to the inference procedure is suggested and an extension to the nonlinear model is briefly discussed. A simulation study is presented in Section 5 to illustrate the methods. The proofs of the theorems are given in Section 6.

2. Terminal-Dependent Semiparametric Estimation for the FBSDE

2.1. Model and Its Statistical Version

I consider the integral form of the standard FBSDE:
$$Y_t = \xi + \int_t^T g(s, X_s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dB_s, \qquad dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dB_t; \quad X_0 = x, \tag{6}$$
where $\{B_t\}_{t \ge 0}$ is the Brownian motion and $\xi$ is a smooth function. Here the generator $g$ is a function of $t$, $Y_t$, and $Z_t$, with $Z_t$ being usually unobservable. Furthermore, the adapted processes $Y_t$, $Z_t$ and the terminal condition can be expressed as functions of $X_t$. As is well known, the existence and uniqueness of the solution of the FBSDE have been studied in detail. This section represents the FBSDE in a statistical framework and then constructs estimators of $g$ and $Z_t$ based on the observed data $\{X_t, Y_t\}$ and the terminal condition $\xi$.

To recast the model (6) as a statistical model, I first examine the property of the last term of the first equation in (6). By the property of the Itô integral and the relation between the two equations in (6), I have
$$E\left(\int_t^T Z_s\,dB_s\right) = 0, \qquad \mathrm{Var}\left(\int_t^T Z_s\,dB_s\right) = E\left(\int_t^T Z_s^2\,ds\right). \tag{7}$$
Then I regard $-\int_t^T Z_s\,dB_s$ as the error and consequently rewrite the first equation of model (6) as
$$Y_t = \xi + \int_t^T g(s, Y_s, Z_s)\,ds + \epsilon_t, \tag{8}$$
where $\epsilon_t$ is the error term with mean zero and bounded variance, and the adapted processes $Y_t$, $Z_t$ and the terminal condition $\xi$ depend on $X_t$ via the second equation of (6).
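The two moment identities in (7) can be illustrated numerically. The following is a minimal Monte Carlo sketch, not part of the original study: it assumes a deterministic integrand $Z_s = s$ on $[0, 1]$ (chosen only so that the exact variance $\int_0^1 s^2\,ds = 1/3$ is known), and the function names are hypothetical.

```python
import math
import random


def ito_integral_samples(z, t, T, steps, paths, seed=0):
    """Left-point sums approximating int_t^T z(s) dB_s on simulated Brownian paths."""
    rng = random.Random(seed)
    dt = (T - t) / steps
    sd = math.sqrt(dt)  # standard deviation of one Brownian increment
    samples = []
    for _ in range(paths):
        total = 0.0
        for j in range(steps):
            s = t + j * dt  # left endpoint keeps the integrand non-anticipating
            total += z(s) * rng.gauss(0.0, sd)
        samples.append(total)
    return samples


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


# Assumed toy integrand Z_s = s on [0, 1]; the exact variance is 1/3.
samples = ito_integral_samples(lambda s: s, 0.0, 1.0, steps=50, paths=5000)
sample_mean, sample_var = mean_and_variance(samples)
print(sample_mean, sample_var)
```

Under these assumptions the sample mean should be close to zero and the sample variance close to $1/3$, up to discretization and Monte Carlo error.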

Remark 1.

Formula (8) appears to define a regression determined by a mean structure and a variance structure. However, such a regression is quite unlike the classical one. In the newly defined structure, although the expectation of the error is zero, the conditional expectation of the error is nonzero. Even so, the resulting estimator is asymptotically unbiased, and thus the consistency of the estimators defined below still holds because of the mixing-dependence condition on $X_t$ given below; for details see the theorems in Section 3 and their proofs.

Given the initial calendar time point $t_1$, I record the observed time series data $\{X(t_i), Y(t_i), i = 1, \ldots, n\}$ at the equally spaced time points $\{t_i = t_1 + (i-1)\Delta, i = 1, \ldots, n\} \subset [0, T]$. Denote $\Delta_i = t_{i+1} - t_i\ (= \Delta)$ for $1 \le i \le n-1$ and $\Delta_n = T - t_n$. Note that $\Delta_n$ is the distance between the last observation time $t_n$ and the terminal time $T$; it may be quite large, which makes formula (9) below inaccurate. Therefore I first assume that $\Delta_n$ is small enough, that is, $\Delta_n = O(\Delta)$, and then propose an adjustment in Section 4 for the case with larger $\Delta_n$. On the other hand, since the distribution of $\xi$ is assumed known, I can draw samples $\{\xi_i, 1 \le i \le k\}$ for $k \ge 1/\Delta$.

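In this section I assume that $g$ can be expressed as the linear function $g(t, Y_t, Z_t) = a + bY_t + cZ_t$, where $a$, $b$, and $c$ are unknown parameters. Then the model (8) can be approximately rewritten as
$$Y(t_i) = \hat{\xi} + \sum_{j=i}^{n}\bigl(a + bY(t_j) + cZ(t_j)\bigr)\Delta_j + O(\Delta) + \epsilon_i + \nu_s, \qquad E(\epsilon_i) = 0, \quad \mathrm{Var}(\epsilon_i) = \sum_{j=i}^{n}\Delta_j Z^2(t_j) + O(\Delta), \quad i = 1, \ldots, n, \tag{9}$$
where $\hat{\xi} = (1/k)\sum_{i=1}^{k}\xi_i$ and $\nu_s = \xi - \hat{\xi}$, satisfying $E(\nu_s) = 0$ and $\mathrm{Var}(\nu_s) = \mathrm{Var}(\xi)(1 + (1/k))$.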

This is the statistical version of (8), a new regression model. The new model (9) differs from the classical regression in that, in addition to the mean-variance structure, it has a cumulative (tail-sum) structure and contains terminal information.

2.2. Semi-Parametric Estimation for the FBSDE

I now turn to estimating the unknown parameter vector $\beta = (a, b, c)^{\tau}$ in model (9). Since the generator contains the unobservable process $Z_t$, it is necessary to estimate $Z_t$ first and plug the estimator into the generator. After that, common parametric estimation methods can be employed to estimate the parameters.

Concerning inference on $Z_t$, although $Z_t$ is connected to the variance of $\epsilon_t$ in (9), the second formula of (9) involves a weighted sum of $Z_t^2$, which makes it inconvenient to estimate $Z_t$ by a residual-based method. I therefore adopt a difference-based method instead.

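To this end, consider the FBSDE model (6). Motivated by Stanton [16], for the Markov process $\{X_s\}_{t \le s \le T}$ that follows the SDE
$$dX_s = \mu(s, X_s)\,ds + \sigma(s, X_s)\,dB_s, \tag{10}$$
the infinitesimal generator $\mathcal{L}f(t, X_t)$ is defined as
$$\mathcal{L}f(t, x) = \lim_{\Delta \to 0^+}\frac{1}{\Delta}\bigl\{E[f(t+\Delta, X_{t+\Delta}) \mid X_t = x] - f(t, x)\bigr\} = \frac{\partial f}{\partial t} + \mu(t, x)\frac{\partial f(t, x)}{\partial x} + \frac{1}{2}\sigma^2(t, x)\frac{\partial^2 f(t, x)}{\partial x^2}, \tag{11}$$
where the bivariate function $f(\cdot, \cdot)$ satisfies sufficient smoothness conditions [17]. By Taylor's expansion, the conditional expectation $E[f(t+\Delta, X_{t+\Delta}) \mid X_t = x]$ can be expressed as
$$E[f(t+\Delta, X_{t+\Delta}) \mid X_t = x] = f(t, x) + \mathcal{L}f(t, x)\Delta + \frac{1}{2}\mathcal{L}^2 f(t, x)\Delta^2 + \cdots + \frac{1}{n!}\mathcal{L}^n f(t, x)\Delta^n + O(\Delta^{n+1}), \tag{12}$$
which implies that, when the time increment $\Delta \to 0$, the first-order approximation formula for $\mathcal{L}f(t, x)$ is given by
$$\mathcal{L}f(t, x) = \frac{1}{\Delta}E[f(t+\Delta, X_{t+\Delta}) - f(t, X_t) \mid X_t = x] + O(\Delta). \tag{13}$$
In addition, the following generalized Feynman-Kac formula is needed. Let $\upsilon$ be a $\mathcal{C}^{1,2}$ function, and suppose there exists a constant $C$ such that, for each $(t, x)$,
$$|\upsilon(t, x)| + |\partial_x\upsilon(t, x)\sigma(t, x)| \le C(1 + |x|), \tag{14}$$
and $\upsilon$ is the solution of the following quasilinear parabolic partial differential equation:
$$\mathcal{L}\upsilon(t, x) + f\bigl(t, x, \upsilon(t, x), \partial_x\upsilon(t, x)\sigma(t, x)\bigr) = 0, \qquad \upsilon(T, x) = \phi(x). \tag{15}$$
Then $Y_s^{t,x} = \upsilon(s, X_s^{t,x})$ and $Z_s^{t,x} = \partial_x\upsilon(s, X_s^{t,x})\sigma(s, X_s^{t,x})$, a.s., where $(Y_s^{t,x}, Z_s^{t,x})$ is the unique solution of the FBSDE, based on the result in Pardoux and Peng [4].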

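Denote $(Y_s^{t,x}, Z_s^{t,x})$ by $(Y_s, Z_s)$ for short. Applying the generator to $f(s, x) = (\upsilon(s, x) - \upsilon(t, X_t))^2$ gives
$$\mathcal{L}f(s, x) = 2\bigl(\upsilon(s, x) - \upsilon(t, X_t)\bigr)\mathcal{L}\upsilon(s, x) + \upsilon_x^2(s, x)\sigma^2(s, x); \tag{16}$$
then $\mathcal{L}f(t, X_t) = \upsilon_x^2(t, X_t)\sigma^2(t, X_t)$, and
$$\mathcal{L}f(t, X_t) = \frac{1}{\Delta}E\bigl[(\upsilon(t+\Delta, X_{t+\Delta}) - \upsilon(t, X_t))^2 \mid X_t = x\bigr] + O(\Delta). \tag{17}$$
Finally, an approximation of $Z_t^2$ can be expressed as
$$Z_t^2 = \upsilon_x^2(t, X_t)\sigma^2(t, X_t) = \mathcal{L}f(t, X_t) = \frac{1}{\Delta}E\bigl[(\upsilon(t+\Delta, X_{t+\Delta}) - \upsilon(t, X_t))^2 \mid X_t = x\bigr] + O(\Delta); \tag{18}$$
that is,
$$Z_t^2 = \frac{1}{\Delta}E\bigl[(Y_{t+\Delta} - Y_t)^2 \mid X_t = x\bigr] + O(\Delta). \tag{19}$$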

By (19), I regard $Z^2(x_0)$ as a pointwise nonparametric regression function. For simplicity, the Nadaraya-Watson (N-W) kernel estimator is taken here as an example of a nonparametric smoother:
$$\hat{Z}^2(x_0) = \frac{\sum_{i=1}^{n-1}\Delta^{-1}(Y(t_{i+1}) - Y(t_i))^2 K_h(X(t_i) - x_0)}{\sum_{i=1}^{n-1}K_h(X(t_i) - x_0)}, \tag{20}$$
where $K_h(\cdot) = K(\cdot/h)/h$, $K(\cdot)$ is a kernel function satisfying the regularity condition given below, and $h$ is the bandwidth or smoothing parameter. Similarly, if $Z_t$ also depends on $t$ in addition to $X_t$, the corresponding estimator is
$$\hat{Z}^2(x_0, t_0) = \frac{\sum_{i=1}^{n-1}\Delta^{-1}(Y(t_{i+1}) - Y(t_i))^2 K_{h_X}(X(t_i) - x_0)K_{h_t}(t_i - t_0)}{\sum_{i=1}^{n-1}K_{h_X}(X(t_i) - x_0)K_{h_t}(t_i - t_0)}. \tag{21}$$
Having calculated $\hat{Z}_t^2$, I plug it into the first formula of (9), obtaining
$$Y(t_i) \approx \hat{\xi} + \sum_{j=i}^{n}\bigl(a + bY(t_j) + c\hat{Z}(t_j)\bigr)\Delta_j + O(\Delta_i) + \epsilon_i + \nu_s, \quad i = 1, \ldots, n. \tag{22}$$
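The estimator (20) can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions rather than the code used for the reported simulations: the function names are hypothetical, and the Gaussian kernel is used as in the simulation section (it does not have the compact support required by the kernel condition of Section 3).

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_z_squared(x, y, delta, x0, h, kernel=gaussian_kernel):
    """Nadaraya-Watson estimate of Z^2(x0) as in Eq. (20).

    x, y  : observed X(t_i), Y(t_i) at equally spaced times (same length)
    delta : sampling interval
    x0    : evaluation point
    h     : bandwidth
    """
    num = 0.0
    den = 0.0
    for i in range(len(x) - 1):
        weight = kernel((x[i] - x0) / h) / h  # K_h(X(t_i) - x0)
        num += weight * (y[i + 1] - y[i]) ** 2 / delta
        den += weight
    if den == 0.0:
        raise ValueError("no kernel mass near x0; increase the bandwidth h")
    return num / den


# Toy data: every squared increment equals 0.04, so the estimate is 0.04 / 0.04 = 1.
x_obs = [0.0, 0.1, 0.2, 0.3]
y_obs = [0.0, 0.2, 0.4, 0.6]
print(nw_z_squared(x_obs, y_obs, delta=0.04, x0=0.15, h=0.5))
```

Because the estimator is a weighted average of the scaled squared increments, it is always nonnegative, and $\hat{Z}(x_0)$ can then be taken as its square root (up to the sign of $Z$, which this sketch does not identify).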

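From the above, it is straightforward to obtain an estimator of $\beta = (a, b, c)^{\tau}$ with common parametric methods, for example the least squares method, by minimizing
$$\sum_{i=1}^{n}\Bigl(Y(t_i) - \hat{\xi} - \sum_{j=i}^{n}\bigl(a + bY(t_j) + c\hat{Z}(t_j)\bigr)\Delta_j\Bigr)^2. \tag{23}$$
For simplicity, denote
$$V = \frac{1}{T}\begin{pmatrix} Y(t_1) - \hat{\xi} \\ \vdots \\ Y(t_n) - \hat{\xi} \end{pmatrix}, \quad U = \frac{1}{T}\begin{pmatrix} \sum_{j=1}^{n}\Delta_j & \sum_{j=1}^{n}Y(t_j)\Delta_j & \sum_{j=1}^{n}Z(t_j)\Delta_j \\ \vdots & \vdots & \vdots \\ \Delta_n & Y(t_n)\Delta_n & Z(t_n)\Delta_n \end{pmatrix}, \quad \hat{U} = \frac{1}{T}\begin{pmatrix} \sum_{j=1}^{n}\Delta_j & \sum_{j=1}^{n}Y(t_j)\Delta_j & \sum_{j=1}^{n}\hat{Z}(t_j)\Delta_j \\ \vdots & \vdots & \vdots \\ \Delta_n & Y(t_n)\Delta_n & \hat{Z}(t_n)\Delta_n \end{pmatrix}. \tag{24}$$
Finally, I can write the estimator as
$$\hat{\beta} = (\hat{U}^{\tau}\hat{U})^{-1}\hat{U}^{\tau}V. \tag{25}$$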
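The least squares step (23)-(25) amounts to regressing $Y(t_i) - \hat{\xi}$ on three tail sums. The sketch below is one possible implementation using only the standard library; the function names are hypothetical, the common factor $1/T$ in (24) is omitted because it cancels in (25), and the small normal-equation solver is an implementation choice, not something specified in the paper.

```python
def tail_sums(values, deltas):
    """Return s_i = sum_{j >= i} values[j] * deltas[j] for every index i."""
    out = [0.0] * len(values)
    running = 0.0
    for i in range(len(values) - 1, -1, -1):
        running += values[i] * deltas[i]
        out[i] = running
    return out


def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("design matrix is numerically rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit_linear_generator(y, z_hat, deltas, xi_hat):
    """Least squares estimate of (a, b, c) from Eqs. (23)-(25)."""
    ones = [1.0] * len(y)
    # Columns of U-hat: tail sums of Delta_j, Y(t_j) Delta_j, Z-hat(t_j) Delta_j.
    cols = [tail_sums(ones, deltas), tail_sums(y, deltas), tail_sums(z_hat, deltas)]
    v = [yi - xi_hat for yi in y]
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, v)) for ci in cols]
    return solve_linear_system(gram, rhs)
```

A call such as `fit_linear_generator(y, z_hat, deltas, xi_hat)` returns the list `[a_hat, b_hat, c_hat]`, where `deltas[-1]` plays the role of $\Delta_n = T - t_n$.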

3. Asymptotic Results

The following two theorems are concerned with asymptotic properties of the estimators deduced in the previous section.

First, I introduce several conditions.

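(a) $X_1, \ldots, X_n$ are $\rho$-mixing dependent; namely, the $\rho$-mixing coefficients $\rho(l)$ satisfy $\rho(l) \to 0$ as $l \to \infty$, where
$$\rho(l) = \sup_{E(X_{i+l}X_i) - E(X_{i+l})E(X_i) \ne 0}\frac{|E(X_{i+l}X_i) - E(X_{i+l})E(X_i)|}{\sqrt{\mathrm{Var}(X_{i+l})\mathrm{Var}(X_i)}} \tag{26}$$
with $X_i = X(t_i)$.

(b) $|Z_i| \le C$ (a.s.) uniformly for $i = 1, \ldots, n$, where $C$ is a positive constant and $Z_i = Z(t_i)$.

(c) The continuous kernel function $K(\cdot)$ is symmetric about 0, with support $[-1, 1]$, and
$$\int_{-1}^{1}K(u)\,du = 1, \qquad \sigma_K^2 = \int_{-1}^{1}u^2K(u)\,du \ne 0, \qquad \int_{-1}^{1}|u|^j K^k(u)\,du < \infty \quad \text{for } j \le k = 1, 2. \tag{27}$$

(d) As $n \to \infty$,
$$\frac{1}{n}U^{\tau}U \xrightarrow{p} \Sigma, \qquad \frac{1}{nT^2}U^{\tau}U\,\mathrm{Var}(\epsilon) \xrightarrow{p} \Omega, \tag{28}$$
where the matrix $\Sigma$ is nonsingular and satisfies
$$0 < C_1 < \lambda_{\min}(\Sigma) < \lambda_{\max}(\Sigma) < C_2 < \infty, \tag{29}$$
with $\lambda_{\min}(\Sigma)$ and $\lambda_{\max}(\Sigma)$ being the smallest and largest eigenvalues of $\Sigma$, respectively.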

Condition (a) is commonly used for weakly dependent processes; see, for example, Rosenblatt [18, 19], Kolmogorov and Rozanov [20], Bradley and Bryc [21], Lin and Lu [22], and Su and Lin [14]. Condition (b) is also reasonable because, as shown by (19), $Z_t$ can be regarded as the deviation between two adjacent observations. Condition (c) is standard for nonparametric kernel functions, and condition (d) is common because it describes the behavior of averages. Furthermore, as remarked in the previous section, in order to express the estimator in terms of $X_t$ rather than the model variables $Y_t$ and $Z_t$, I impose conditions mainly on the latent variable $X_t$, including the stationary $\rho$-mixing Markov property used in the following theorems. The process $\{Y_t\}$ itself may be nonstationary.
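As a concrete illustration of condition (c), the normalization and second-moment requirements can be checked numerically for a specific kernel. The sketch below assumes the Epanechnikov kernel $K(u) = 0.75(1 - u^2)$ on $[-1, 1]$, which is a common choice but is not named in the paper; the helper names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel: continuous, symmetric about 0, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def midpoint_integral(f, lo, hi, cells=2000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    width = (hi - lo) / cells
    return width * sum(f(lo + (i + 0.5) * width) for i in range(cells))


# Integral of K over [-1, 1] (should be 1) and sigma_K^2 (exact value 1/5).
mass = midpoint_integral(epanechnikov, -1.0, 1.0)
second_moment = midpoint_integral(lambda u: u * u * epanechnikov(u), -1.0, 1.0)
print(mass, second_moment)
```

For this kernel the remaining integrals in (27) are finite automatically, because the kernel is bounded and compactly supported.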

Theorem 2.

In addition to conditions (a), (b), and (c), suppose that $X_i \in (x_0 - h, x_0 + h)$ is a stationary $\rho$-mixing Markov process with $\rho$-mixing coefficients satisfying $\rho(l) = \rho^l$ for $0 < \rho < 1$ and has a common probability density $p(x)$ satisfying $p(x_0) > 0$, $Z^2(x_0) > 0$. Furthermore, the functions $p(x)$ and $Z(x)$ have two continuous derivatives in a neighborhood of $x_0$. As $n \to \infty$, if $nh \to \infty$, $nh^5 \to 0$, and $nh\Delta^2 \to 0$, then
$$\sqrt{(n-1)h}\,\bigl(\hat{Z}^2(x_0) - Z^2(x_0)\bigr) \xrightarrow{\mathcal{D}} N\left(0, \frac{Z^4(x_0)J_K}{p(x_0)}\right). \tag{30}$$

The proof is presented in Section 6. The asymptotic result in the theorem is standard for nonparametric kernel estimator, and here undersmoothing is used to eliminate asymptotic bias.

Theorem 3.

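In addition to the conditions of Theorem 2, if condition (d) holds, then as $n \to \infty$,
$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} N\bigl(0, \sigma^2\Sigma^{-1} + \Sigma^{-1}\Omega\Sigma^{-1}\bigr), \tag{31}$$
where $\sigma^2 = \mathrm{Var}(\xi/T)$.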

The proof is also presented in Section 6. The result is standard in the sense of asymptotic normality with convergence rate of order $\sqrt{n}$. As noted in Remark 1, even though the conditional mean of the model error is nonzero, the newly proposed estimator is consistent because of the mixing dependence; for details see the proof of Theorem 3. Furthermore, because of the terminal condition, the asymptotic variance is larger than it would be without the use of the terminal condition.

4. Supplement and Extension

4.1. Supplement

As mentioned in Section 2.1, when the last observation time $t_n$ is far from the terminal time $T$, the new model (9) becomes inaccurate. In this case an adjustment is needed to obtain a relatively accurate model. The adjustment consists of three main steps: first, I ignore the terminal condition to obtain an accurate model and parameter estimates restricted to $(0, t_n)$; next, I estimate the unobservable variables in the interval $(t_n, T)$ using the model estimated in the first step; finally, I substitute these estimates for the unobservable variables in $(t_n, T)$ and build a relatively accurate model that is defined on the whole interval $[0, T]$ and involves the terminal condition.

For arbitrary $t \le t_n$,
$$Y_t = Y_n + \int_t^{t_n}g(s, Y_s, Z_s)\,ds - \int_t^{t_n}Z_s\,dB_s. \tag{32}$$
This equation is exact, and thus I can obtain the estimators $\hat{g}$ and $\hat{Z}_t$ of $g$ and $Z_t$ for $t \le t_n$ by the methods given in Section 2. When $t > t_n$, however, this method is unsuitable for estimating $Z_t$ because it cannot be extrapolated to the interval $(t_n, T)$, so I attempt to complete the data within this interval.

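Set $t_n < t_{n+1} < \cdots < t_{n+l} < T$. Discretize model (32) and write its forward linear version as
$$Y(t_{i+1}) = -\Bigl(\Bigl(b - \frac{1}{\Delta_i}\Bigr)Y(t_i) + cZ(t_i)\Bigr)\Delta_i + Z(t_i)(B_{i+1} - B_i). \tag{33}$$
Similar to formula (9), the expectation-variance structure is
$$E\bigl(Y(t_{i+1}) \mid Y(t_i), Z(t_i)\bigr) = -\Bigl(\Bigl(b - \frac{1}{\Delta_i}\Bigr)Y(t_i) + cZ(t_i)\Bigr)\Delta_i, \qquad \mathrm{Var}\bigl(Y(t_{i+1}) \mid Y(t_i), Z(t_i)\bigr) = \Delta_i Z^2(t_i). \tag{34}$$
To estimate the unobservable data $(Y(t_i), Z(t_i))$ for $i = n+1, \ldots, n+l$, I treat $Z_t$ as being parameterizable. It is known from, for example, Morris [23] that the variance can be expressed as a quadratic function of the mean for several common distributions, such as the normal, gamma, binomial, negative binomial, and Poisson. For the mean-variance structure in (34), I therefore suppose the following parametric structure:
$$\Delta_i Z^2(t_i) = \gamma_1 + \gamma_2\Bigl(\Bigl(b - \frac{1}{\Delta_i}\Bigr)Y(t_i) + cZ(t_i)\Bigr)\Delta_i + \gamma_3\Bigl(\Bigl(b - \frac{1}{\Delta_i}\Bigr)Y(t_i) + cZ(t_i)\Bigr)^2\Delta_i^2, \tag{35}$$
for some parameters $\gamma_k$. By a simple transformation and neglecting $O(\Delta^2)$ terms, I obtain
$$Z(t_i) = \omega_1 + \omega_2\sqrt{\omega_3 + \omega_4 Y(t_i)}, \tag{36}$$
where $\omega_2 = 1$ or $-1$; denote $\omega = (\omega_1, \omega_2, \omega_3, \omega_4)$.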

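Let $\bar{Z}(t_i) = \omega_1 + \omega_2\sqrt{\omega_3 + \omega_4 Y(t_i)}$ $(1 \le i \le n)$ and plug $\bar{Z}(t_i)$ into (32). I can then obtain the estimators through the methods in Section 2; denote by $\tilde{b}_S$, $\tilde{c}_S$, and $\tilde{\omega}$ the estimators of $b$, $c$, and $\omega$, respectively. Finally, I can extend the original path and estimate $Z(t_{n+1})$ step by step; more precisely,
$$\tilde{Y}(t_{n+1}) = -\Bigl(\Bigl(\tilde{b}_S - \frac{1}{\Delta_n}\Bigr)Y(t_n) + \tilde{c}_S\tilde{Z}(t_n)\Bigr)\Delta_n, \qquad \tilde{Z}(t_{n+1}) = \tilde{\omega}_1 + \tilde{\omega}_2\sqrt{\tilde{\omega}_3 + \tilde{\omega}_4\tilde{Y}(t_{n+1})}. \tag{37}$$
Iterating the above procedure, I obtain the complete data on $[0, T]$. Consequently, the same approach as in Section 2 can be applied again, and a refined estimator of $g$ can be constructed.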

4.2. Extension

The semiparametric models in Section 2 have a linear structure in the sense that $g$ is linear in the parameters $a$, $b$, and $c$. However, some generators are nonlinear in the parameters, and the resulting model (9) is then nonlinear. For example, Constantinides [24] presented a model with the specification
$$dX_t = \bigl(\alpha_0 + \alpha_1X_t + \alpha_2\sqrt{X_t - \alpha_3}\bigr)\,dt + \sigma_0(X_t - \alpha_3)\,dB_t. \tag{38}$$
For other examples see Fan [25], Fan and Zhang [26], Chan et al. [27], and Aït-Sahalia [28].

To model such cases flexibly, a nonlinear semiparametric model can be defined as
$$Y_t = \xi + \int_t^T g(\theta, s, Y_s, Z_s)\,ds + \epsilon_t, \qquad \mathrm{Var}(\epsilon_t) = E\left(\int_t^T Z_s^2\,ds\right), \tag{39}$$
where $\epsilon_t = -\int_t^T Z_s\,dB_s$ satisfies $E(\epsilon_t) = 0$, $g$ is a given function, and $\theta$ is an unknown $p$-dimensional parameter vector.

Before estimating the nonlinear model (39), $Z_t$ can be estimated as before by (19) or (20), because its estimator does not depend on the structure of $g$. Furthermore, the resulting estimator has the same asymptotic properties as in Theorem 2. Thus I focus only on the estimation of the parameter vector $\theta$ here.

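After plugging the estimator $\hat{Z}_t$ of $Z_t$ into the first formula of (39), I can adopt a common method to obtain an estimator of $\theta$, for example, by minimizing
$$\hat{Q}(\theta) = \sum_{i=1}^{n}\Bigl(Y_i - \hat{\xi} - \sum_{j=i}^{n}\hat{g}_j(\theta)\Delta_j\Bigr)^2, \tag{40}$$
where $\hat{g}_j(\theta) = g(\theta, t_j, Y_j, \hat{Z}_j)$. Under regularity conditions, I can also obtain $\hat{\theta}$ by solving the following equation:
$$\hat{L}(\theta) = \hat{Q}'(\theta) = -2\sum_{i=1}^{n}\Bigl[\Bigl(Y_i - \hat{\xi} - \sum_{j=i}^{n}\hat{g}_j(\theta)\Delta_j\Bigr)\sum_{j=i}^{n}\hat{g}_j'(\theta)\Delta_j\Bigr] = 0, \tag{41}$$
where $\hat{Q}'(\theta)$ denotes the derivative of $\hat{Q}(\theta)$ with respect to $\theta$. By arguments similar to those used in the previous section, the resulting estimator is asymptotically normal; the details are omitted here.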

5. Simulations

In this section I investigate the finite-sample behavior of the estimators by simulation. Although Theorems 2 and 3 rely on the stationarity of $X_t$, I also apply the method to a nonstationary process, namely Geometric Brownian Motion. I use the mean and the standard deviation (STD) or mean square error (MSE) to evaluate the estimators, based on 300 repetitions. As expected, the method works better when the stationarity condition holds.

Example 4.

Consider the Cox-Ingersoll-Ross (CIR) process:
$$dX_t = k(\theta - X_t)\,dt + \sigma X_t^{1/2}\,dB_t; \quad X_0 = x_0. \tag{42}$$
This model describes interest rate dynamics and is stationary when $2k\theta \ge \sigma^2$. On the other hand, the riskless asset with price per unit $P_t$ evolves as
$$dP_t = rP_0\,dt, \tag{43}$$
with $r$ being the constant short rate. Let $n_0(t)$ and $n_1(t)$ denote the quantities invested in the bond $P_t$ and the asset $X_t$, respectively. The total wealth process $Y_t$ then satisfies $Y_t = n_0(t)P_t + n_1(t)X_t$. Similar to the classic self-financing FBSDE model in El Karoui et al. [29], the resulting model is
$$dY_t = \Bigl(n_1k\theta + rY_t - \frac{k+r}{n_1\sigma^2}Z_t^2\Bigr)dt + Z_t\,dB_t; \quad Y_T = \xi(X_T), \qquad dX_t = k(\theta - X_t)\,dt + \sigma X_t^{1/2}\,dB_t; \quad X_0 = x_0. \tag{44}$$

Denote the parameters $a = n_1k\theta$, $b = r$, and $c = -(k+r)/(n_1\sigma^2)$. I set the sampling interval to $\Delta = 0.4$ and choose the sample size $n = 300$, so the observation interval is $[0, 120]$. The terminal time is chosen as $T = 122$, which is quite near the last observation time. Let $k = 0.2$, $\theta = 0.06$, $\sigma = 0.08$, $r = 0.05$, and $n_0 = n_1 = 10$. In the estimation procedure, I use the Gaussian kernel $K(t) = (1/\sqrt{2\pi})\exp(-t^2/2)$. The optimal bandwidth is theoretically of order $O(n^{-1/5})$, and popular data-driven methods such as CV, GCV, or the plug-in approach can also be used. In the simulation, I set $h = \mathrm{std}(x)\,n^{-1/5}$ for simplicity. The simulation results with other choices are similar.
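The forward part of this design can be reproduced with a short script. The following is a minimal sketch under stated assumptions, not the code used for the reported results: the Euler scheme with truncation at zero, the starting value $x_0 = \theta$ (the paper does not report $x_0$), the seed, and the function names are all illustrative choices.

```python
import math
import random
import statistics


def simulate_cir(k, theta, sigma, x0, delta, n, seed=0):
    """Euler scheme for dX = k(theta - X) dt + sigma * sqrt(X) dB.

    Negative values are truncated to zero inside the drift and the square
    root (full truncation), so the square root is always defined.
    """
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n - 1):
        x = path[-1]
        x_pos = max(x, 0.0)
        d_b = rng.gauss(0.0, math.sqrt(delta))  # Brownian increment over delta
        path.append(x + k * (theta - x_pos) * delta + sigma * math.sqrt(x_pos) * d_b)
    return path


def rule_of_thumb_bandwidth(x):
    """Bandwidth h = std(x) * n^(-1/5), with the sample standard deviation."""
    return statistics.stdev(x) * len(x) ** (-1.0 / 5.0)


# Design of Example 4: k = 0.2, theta = 0.06, sigma = 0.08, Delta = 0.4, n = 300.
path = simulate_cir(k=0.2, theta=0.06, sigma=0.08, x0=0.06, delta=0.4, n=300)
h = rule_of_thumb_bandwidth(path)
print(len(path), h)
```

With these parameter values $2k\theta = 0.024$ exceeds $\sigma^2 = 0.0064$, so the stationarity condition stated for (42) is satisfied.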

I present the true curves and the N-W nonparametric estimated curves for $Z_t$ and the generator $g$, and I report the mean and MSE of the estimator $\hat{\beta}$ of $\beta = (a, b, c)$ in Table 1. These results show that the estimators of $a$ and $b$ work well. However, because of the plug-in estimator $\hat{Z}_t$, the estimator of the coefficient $c$ has fairly large bias and MSE. On the other hand, Figure 1 shows that the estimated curves of the drift and diffusion are close to the true ones.

Table 1: Mean and MSE of the parameter estimators in Example 4.

Parameter   True value   Mean      MSE
a           0.12         0.1404    0.0089
b           0.05         0.0503    1.2694e-07
c           −3.9062      −4.5826   4.6295

Figure 1: The solid lines are the true curves of $Z_t$ and the function $g$, respectively, and the dashed lines are the corresponding estimated curves in Example 4.

Example 5.

In this example I consider the case in which the terminal time is far from the last observation time, as discussed in Section 4.1. The distance between $T = 124$ and $t_n = 120$ is larger than that in Example 4. I add 10 estimated points by the method given in Section 4 and employ the same model and parameters as before. Table 2 reports the simulation results, which show that the parameter estimators do not perform as well as before but are still usable. In addition, Figure 2 presents the estimated curves for $Z_t$ and $g$, which also perform well, although not as well as the estimates in Example 4.

Table 2: Mean and MSE of the parameter estimators in Example 5.

Parameter   True value   Mean      MSE
a           0.12         0.1470    0.0319
b           0.05         0.0490    1.6873e-06
c           −3.9062      −3.8360   9.2929

Figure 2: The solid lines are the true curves of $Z_t$ and the function $g(t)$, respectively, and the dashed lines are the corresponding estimated curves in Example 5.

Example 6.

I now turn to the nonstationary case. When the forward process $X_t$ does not satisfy the stationarity condition, the cumulative effect induced by the backward summation becomes more pronounced, which makes the statistical inference quite challenging. In this situation, I choose a model and parameters that keep the process relatively close to stationary.

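I consider a simple FBSDE of the form
$$dY_t = \Bigl(\frac{\mu - r}{\sigma}Z_t + rY_t\Bigr)dt + Z_t\,dB_t \triangleq (bY_t + cZ_t)\,dt + Z_t\,dB_t; \quad Y_T = \xi(X_T), \tag{45}$$
where $X_t$ is a Geometric Brownian Motion modeling the stock price and satisfying
$$dX_t = \mu X_t\,dt + \sigma X_t\,dB_t; \quad X_0 = x, \tag{46}$$
while the riskless asset is the same as in formula (43), $dP_t = rP_0\,dt$.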

First, let $\mu = 0.1$, $\sigma = 0.01$, $\Delta = 0.12$, $n = 300$, $T = 36.6$, and $n_0 = n_1 = 10$. Clearly $Z_t = n_1\sigma X_t$. I choose the same kernel function and the bandwidth $h = \mathrm{std}(x)\,n^{-1/5}$. Table 3 reports the simulation results. The results show that the estimator of $b$ works well, but the estimator of $c$ has larger bias and STD because of the plug-in estimator $\hat{Z}_t$. Nevertheless, the curves can still be fitted well; that is, the estimated curves of the drift and diffusion are close to the true ones. Figure 3 presents the estimated curves for the diffusion $Z_t$ and the drift $g$ from one simulation.

Finally, I choose relatively large values of $\sigma$, namely 0.05 and 0.12, which represent different levels of volatility. Tables 4 and 5 and Figures 4 and 5 show that the performance remains acceptable, which suggests that the approach can be applied more widely.
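The true values of $c$ listed for Example 6 follow from the coefficient of $Z_t$ in (45), $c = (\mu - r)/\sigma$. The short check below is only a worked arithmetic example; it assumes $r = 0.05$, which is inferred from the reported true value $b = r = 0.05$ rather than stated explicitly for this example, and the function name is hypothetical.

```python
def generator_coefficient_c(mu, r, sigma):
    """Coefficient of Z_t in the generator of (45): c = (mu - r) / sigma."""
    return (mu - r) / sigma


# mu = 0.1 as in Example 6; r = 0.05 is an assumption read off the tables.
for sigma in (0.01, 0.05, 0.12):
    print(sigma, round(generator_coefficient_c(0.1, 0.05, sigma), 4))
```

The three values obtained this way, 5, 1, and 0.4167, agree with the true values of $c$ reported for $\sigma = 0.01$, $0.05$, and $0.12$.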

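Table 3: Mean and STD of the parameter estimators in Example 6 with σ = 0.01.

Parameter   True value   Mean      STD
b           0.05         0.0473    0.0348
c           5            5.7788    7.3388

Table 4: Mean and STD of the parameter estimators in Example 6 with σ = 0.05.

Parameter   True value   Mean      STD
b           0.05         0.0580    0.0217
c           1            0.9959    7.6034

Table 5: Mean and STD of the parameter estimators in Example 6 with σ = 0.12.

Parameter   True value   Mean      STD
b           0.05         0.0557    0.0421
c           0.4167       0.4227    6.5970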

Figure 3: The solid lines are the true curves of $Z_t$ and the function $g(t)$, respectively, and the dashed lines are the corresponding estimated curves in Example 6.

Figure 4: The solid lines are the true curves of $Z_t$ and the function $g(t)$, respectively, and the dashed lines are the corresponding estimated curves with $\sigma = 0.05$.

Figure 5: The solid lines are the true curves of $Z_t$ and the function $g(t)$, respectively, and the dashed lines are the corresponding estimated curves with $\sigma = 0.12$.

6. Proofs

Proof of Theorem 2.

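Denote $\mathcal{C} = \{X_1, \ldots, X_n\}$. By the Taylor expansion and formula (19), I have
$$\begin{aligned} E(\hat{Z}^2(x_0) \mid \mathcal{C}) &= \frac{\sum_{i=1}^{n-1}\Delta^{-1}K_h(X_i - x_0)E((Y_{i+1} - Y_i)^2 \mid \mathcal{C})}{\sum_{i=1}^{n-1}K_h(X_i - x_0)} = \frac{\sum_{i=1}^{n-1}K_h(X_i - x_0)(Z_i^2 + O(\Delta))}{\sum_{i=1}^{n-1}K_h(X_i - x_0)} \\ &= \frac{\int K_h(x - x_0)(Z^2(x) + O(\Delta))p(x)\,dx \times (1 + O_p((nh)^{-1/2}))}{\int K_h(x - x_0)p(x)\,dx \times (1 + O_p((nh)^{-1/2}))} \\ &= \frac{(Z^2(x_0) + O(\Delta))\bigl(p(x_0) + \frac{1}{2}h^2p^{(2)}(x_0)\sigma_K^2 + o(h^2)\bigr) \times (1 + O_p((nh)^{-1/2}))}{\bigl(p(x_0) + \frac{1}{2}h^2p^{(2)}(x_0)\sigma_K^2 + o(h^2)\bigr) \times (1 + O_p((nh)^{-1/2}))} \\ &= Z^2(x_0) + \frac{p^{(2)}(x_0)}{2p(x_0)}h^2Z^2(x_0)\sigma_K^2 + o(h^2) + O(\Delta). \end{aligned} \tag{47}$$
Furthermore,
$$\begin{aligned} \mathrm{Var}(\hat{Z}^2(x_0) \mid \mathcal{C}) = \frac{1}{\sum_{i=1}^{n-1}K_h^2(X_i - x_0)} \times \Bigl\{ &\sum_{i=1}^{n-1}\Delta^{-2}K_h^2(X_i - x_0)\mathrm{Var}((Y_{i+1} - Y_i)^2 \mid \mathcal{C}) \\ &+ \sum_{i=1}^{n-1}\sum_{k=1}^{n-i}\Delta^{-2}\mathrm{cov}\bigl(K_h(X_i - x_0)(Y_{i+1} - Y_i), K_h(X_{i+k} - x_0)(Y_{i+k+1} - Y_{i+k}) \mid \mathcal{C}\bigr) \Bigr\}. \end{aligned} \tag{48}$$
From the conditions on the Markov process and the $\rho$-mixing coefficient,
$$\begin{aligned} &\Bigl|\sum_{i=1}^{n-1}\sum_{k=1}^{n-i}\Delta^{-2}\mathrm{cov}\bigl(K_h(X_i - x_0)(Y_{i+1} - Y_i), K_h(X_{i+k} - x_0)(Y_{i+k+1} - Y_{i+k})\bigr)\Bigr| \\ &\quad = \frac{1}{(n-1)^2}\sum_{i=1}^{n-1}\sum_{k=1}^{n-i}\Bigl|E\Bigl(\Delta^{-2}(Y_{i+1} - Y_i)^2(Y_{i+k+1} - Y_{i+k})^2\bigl(K_h(X_i - x_0) - E(K_h(X_i - x_0))\bigr)\bigl(K_h(X_{i+k} - x_0) - E(K_h(X_{i+k} - x_0))\bigr)\Bigr)\Bigr| \\ &\quad = \frac{1}{(n-1)^2}\sum_{i=1}^{n-1}\sum_{k=1}^{n-i}\Bigl|E\Bigl(Z_i^2Z_{i+k}^2\bigl(K_h(X_i - x_0) - E(K_h(X_i - x_0))\bigr)\bigl(K_h(X_{i+k} - x_0) - E(K_h(X_{i+k} - x_0))\bigr)\Bigr)\Bigr| + O(\Delta) \\ &\quad \le \frac{C}{(n-1)^2h}\sum_{i=1}^{n-1}\sum_{k=1}^{n-i}\rho^k = O\Bigl(\frac{1}{nh}\Bigr) = o(1). \end{aligned} \tag{49}$$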

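Note that $(Y_{i+1} - Y_i)/\sqrt{\Delta} = g(t_i, Y_i, Z_i)\sqrt{\Delta} + Z_i\eta_i$, where $E(\eta_i) = 0$ and $\mathrm{Var}(\eta_i) = 1$. Thus $\mathrm{Var}((Y_{i+1} - Y_i)^2/\Delta) = Z_i^4 + O(\Delta)$ and furthermore
$$\begin{aligned} \mathrm{Var}(\hat{Z}^2(x_0) \mid \mathcal{C}) &= \frac{\sum_{i=1}^{n-1}\Delta^{-2}K_h^2(X_i - x_0)\mathrm{Var}((Y_{i+1} - Y_i)^2 \mid \mathcal{C})}{\sum_{i=1}^{n-1}K_h^2(X_i - x_0)} + O_p(1) \\ &= \frac{\sum_{i=1}^{n-1}K_h^2(X_i - x_0)(Z^4(x_0) + O(\Delta))}{\sum_{i=1}^{n-1}K_h^2(X_i - x_0)} + O_p(1) = \frac{Z^4(x_0)J_K + O(\Delta)}{nhp(x_0)}\bigl(1 + O_p((nh)^{-1/2})\bigr). \end{aligned} \tag{50}$$
Notably, the leading terms of both the conditional expectation and the conditional variance do not depend on $\mathcal{C}$, so the conditioning can be dropped.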

From Lemma 1 of Politis and Romano [30] and the relation between the $\alpha$-mixing condition and the $\rho$-mixing condition (e.g., Theorem 1.1.1 of [22]), I can ensure that $\{(Y_{i+1} - Y_i)^2, i = 1, \ldots, n-1\}$ is a $\rho$-mixing-dependent process whose mixing coefficient, denoted by $\rho_Y(l)$, satisfies
$$\sum_{k=1}^{\infty}\rho_Y(2^k) \le C\sum_{k=1}^{\infty}\rho(2^k) = C\sum_{k=1}^{\infty}\rho^{2^k} < \infty, \tag{51}$$
where $C$ is a positive constant. Finally, I use the central limit theorem for $\rho$-mixing-dependent processes (e.g., Theorem 4.0.1 of [22]) to complete the proof.

Proof of Theorem 3.

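I first present the basic results for $(1/n)(\hat{U} - U)^{\tau}U$, which lead to the rate of convergence and the asymptotic expansions. Similar to Cui et al. [31] or Su and Lin [14], I need the following decomposition:
$$\begin{aligned} \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1} - \Bigl(\frac{1}{n}\hat{U}^{\tau}\hat{U}\Bigr)^{-1} = &-\Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\Bigl\{\frac{1}{n}(\hat{U} - U)^{\tau}U\Bigr\}\Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1} - \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\Bigl\{\frac{1}{n}U^{\tau}(\hat{U} - U)\Bigr\}\Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1} \\ &- \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\Bigl\{\frac{1}{n}(\hat{U} - U)^{\tau}(\hat{U} - U)\Bigr\}\Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1} \\ &+ \Bigl\{\Bigl(\frac{1}{n}\hat{U}^{\tau}\hat{U}\Bigr)^{-1} - \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\Bigr\}\Bigl\{\frac{1}{n}\hat{U}^{\tau}\hat{U} - \frac{1}{n}U^{\tau}U\Bigr\}\Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1} \\ \triangleq\ &I_1 + I_2 + I_3 + I_4. \end{aligned} \tag{52}$$
By condition (d), $(1/n)U^{\tau}U \xrightarrow{p} \Sigma$ and all eigenvalues of $\Sigma$ are bounded. Furthermore, by the uniform weak consistency of the kernel estimator for mixing-dependent variables (see, e.g., [32, 33]), I have
$$\sup_x\Bigl|\frac{1}{n-1}\sum_{i=1}^{n-1}K_{h,i} - p(x)\Bigr| = O_p\Bigl(h^2 + \Bigl(\frac{\log(nh)}{nh}\Bigr)^{1/2}\Bigr). \tag{53}$$
By the proof of Theorem 2, I have
$$E\Bigl(\frac{1}{n}(\hat{U} - U)^{\tau}U\Bigr) = O(h^2), \qquad \mathrm{Var}\Bigl(\frac{1}{n}(\hat{U} - U)^{\tau}U\Bigr) = O\Bigl(\frac{1}{(n^2h)^{1/2}}\Bigr). \tag{54}$$
It is easy to deduce that $I_1 = o_p(1/\sqrt{n})$, $I_2 = o_p(1/\sqrt{n})$, $I_3 = o_p(1/\sqrt{n})$, and $I_4 = o_p(1/\sqrt{n})$. Then I obtain
$$\Bigl(\frac{1}{n}\hat{U}^{\tau}\hat{U}\Bigr)^{-1} = \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1} + o_p\Bigl(\frac{1}{\sqrt{n}}\Bigr). \tag{55}$$
Denote $I_5 = ((1/n)U^{\tau}U)^{-1}(1/\sqrt{n})(\hat{U} - U)^{\tau}V$. Clearly $I_5 = o_p(1)$. Combining the results above, I see that
$$\begin{aligned} \sqrt{n}\,\hat{\beta} &= \sqrt{n}\,(\hat{U}^{\tau}\hat{U})^{-1}\hat{U}^{\tau}V = \Bigl[\Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1} + o_p\Bigl(\frac{1}{\sqrt{n}}\Bigr)\Bigr]\frac{1}{\sqrt{n}}\hat{U}^{\tau}V = \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\frac{1}{\sqrt{n}}\hat{U}^{\tau}V + o_p(1) \\ &= \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\frac{1}{\sqrt{n}}U^{\tau}V + I_5 + o_p(1) = \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\frac{1}{\sqrt{n}}U^{\tau}V + o_p(1) \\ &= \sqrt{n}\,\beta + \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\frac{1}{T\sqrt{n}}\sum_{i=1}^{n}U_i(\epsilon_i + \nu_s) + o_p(1) \\ &= \sqrt{n}\,\beta + \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\frac{1}{T\sqrt{n}}\sum_{i=1}^{n}U_i\sum_{j=i}^{n}\Delta_j^{1/2}Z_j\varepsilon_j + \Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\frac{1}{T\sqrt{n}}\sum_{i=1}^{n}U_i\nu_s + o_p(1), \end{aligned} \tag{56}$$
where $\{\varepsilon_i\}$ is an unobservable sequence of independent and identically distributed random variables with mean zero and variance one.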

From $E(\nu_s) = 0$ and the central limit theorem it follows that
$$\frac{1}{T\sqrt{n}}\sum_{i=1}^{n}U_i\nu_s \xrightarrow{d} N(0, \sigma^2\Sigma). \tag{57}$$

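On the other hand, the expectation of $\frac{1}{T\sqrt{n}}\sum_{i=1}^{n}U_i\sum_{j=i}^{n}\Delta_j^{1/2}Z(t_j)\varepsilon_j$ does not converge to zero; that is,
$$\frac{1}{T\sqrt{n}}E\begin{pmatrix} \sum_{i=1}^{n}\bigl[\bigl(\sum_{j=i}^{n}\Delta_j\bigr)\bigl(\sum_{k=i-1}^{n}\Delta_k^{1/2}Z(t_k)\varepsilon_k\bigr)\bigr] \\ \sum_{i=1}^{n}\bigl[\bigl(\sum_{j=i}^{n}\Delta_jY(t_j)\bigr)\bigl(\sum_{k=i-1}^{n}\Delta_k^{1/2}Z(t_k)\varepsilon_k\bigr)\bigr] \\ \sum_{i=1}^{n}\bigl[\bigl(\sum_{j=i}^{n}\Delta_jZ(t_j)\bigr)\bigl(\sum_{k=i-1}^{n}\Delta_k^{1/2}Z(t_k)\varepsilon_k\bigr)\bigr] \end{pmatrix} = \sqrt{\frac{T}{n^4}}\begin{pmatrix} 0 \\ \sum_{i=1}^{n}\sum_{j=i-1}^{n}\sum_{k=1}^{n-j}E(Y(t_j)Z(t_{j+k})\varepsilon_j) \\ \sum_{i=1}^{n}\sum_{j=i-1}^{n}\sum_{k=1}^{n-j}E(Z(t_j)Z(t_{j+k})\varepsilon_j) \end{pmatrix} = \upsilon, \tag{58}$$
where $\upsilon \ne 0$. For simplicity, take the third component as an example to evaluate its magnitude. From Lemma 1 of Politis and Romano [30], $Y_t$ and $Z_t$ are both $\rho$-mixing-dependent processes, with mixing coefficient again denoted by $\rho(l)$, and
$$\begin{aligned} \frac{1}{T\sqrt{n}}E\Bigl(\sum_{i=1}^{n}\Bigl[\Bigl(\sum_{j=i}^{n}\Delta_jZ(t_j)\Bigr)\Bigl(\sum_{k=i-1}^{n}\Delta_k^{1/2}Z(t_k)\varepsilon_k\Bigr)\Bigr]\Bigr) &= \sqrt{\frac{T}{n^4}}E\Bigl(\sum_{i=1}^{n}\sum_{j=i-1}^{n}\sum_{k=1}^{n-j}Z(t_{j+k})Z(t_j)\varepsilon_j\Bigr) \\ &= \sqrt{\frac{T}{n^4}}\sum_{i=1}^{n}\sum_{j=i-1}^{n}\sum_{k=1}^{n-j}E(Z(t_j)Z(t_{j+k})\varepsilon_j) = O(1)\sqrt{\frac{T}{n^4}}\sum_{i=1}^{n}\sum_{j=i-1}^{n}\sum_{k=1}^{n-j}\rho^k \triangleq (*). \end{aligned} \tag{59}$$
It is easy to verify that, as $n \to \infty$, $0 < c_1\rho \le (*) \le c_2(\rho/(1-\rho))$.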

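By condition (d), the variance is bounded uniformly for $i = 1, \ldots, n$. Then
$$\Bigl(\frac{1}{n}U^{\tau}U\Bigr)^{-1}\Bigl\{\frac{1}{T\sqrt{n}}\sum_{i=1}^{n}U_i\sum_{j=i}^{n}\Delta_j^{1/2}Z(t_j)\varepsilon(t_j) - \upsilon\Bigr\} \xrightarrow{d} N(0, \Sigma^{-1}\Omega\Sigma^{-1}). \tag{60}$$
Combining the above results with the independence between $\varepsilon_j$ and $\nu_s$, I get
$$\sqrt{n}\,\bigl(\hat{\beta} - \beta - \sqrt{n}\,(U^{\tau}U)^{-1}\upsilon\bigr) \xrightarrow{d} N\bigl(0, \sigma^2\Sigma^{-1} + \Sigma^{-1}\Omega\Sigma^{-1}\bigr). \tag{61}$$
Since the asymptotic bias satisfies $\sqrt{n}\,(U^{\tau}U)^{-1}\upsilon = o(\Sigma^{-1}\upsilon) \to 0$, it follows that
$$\sqrt{n}\,(\hat{\beta} - \beta) \xrightarrow{d} N\bigl(0, \sigma^2\Sigma^{-1} + \Sigma^{-1}\Omega\Sigma^{-1}\bigr). \tag{62}$$
This completes the proof.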

Acknowledgments

This paper was supported by NBRP (973 Program 2007CB814901) of China, NNSF Project (10771123) of China, RFDP (20070422034) of China, and NSF Projects (Y2006A13 and Q2007A05) of Shandong Province of China.

References

[1] J. M. Bismut, "Conjugate convex functions in optimal stochastic control," Journal of Mathematical Analysis and Applications, vol. 44, pp. 384-404, 1973. doi:10.1016/0022-247X(73)90066-8
[2] É. Pardoux and S. G. Peng, "Adapted solution of a backward stochastic differential equation," Systems & Control Letters, vol. 14, no. 1, pp. 55-61, 1990. doi:10.1016/0167-6911(90)90082-6
[3] S. G. Peng, "Probabilistic interpretation for systems of quasilinear parabolic partial differential equations," Stochastics and Stochastics Reports, vol. 37, no. 1-2, pp. 61-74, 1991. doi:10.1080/17442509108833727
[4] É. Pardoux and S. Peng, "Backward stochastic differential equations and quasilinear parabolic partial differential equations," in Stochastic Partial Differential Equations and Their Applications, vol. 176, pp. 200-217, Springer, 1992. doi:10.1007/BFb0007334
[5] É. Pardoux and S. Tang, "Forward-backward stochastic differential equations and quasilinear parabolic PDEs," Probability Theory and Related Fields, vol. 114, no. 2, pp. 123-150, 1999. doi:10.1007/s004409970001
[6] S. Peng and Z. Wu, "Fully coupled forward-backward stochastic differential equations and applications to optimal control," SIAM Journal on Control and Optimization, vol. 37, no. 3, pp. 825-843, 1999. doi:10.1137/S0363012996313549
[7] J. Ma and J. Yong, Forward-Backward Stochastic Differential Equations and Their Applications, vol. 1702 of Lecture Notes in Mathematics, Springer, Berlin, Germany, 1999.
[8] D. Nualart and W. Schoutens, "Backward stochastic differential equations and Feynman-Kac formula for Lévy processes, with applications in finance," Bernoulli, vol. 7, no. 5, pp. 761-776, 2001. doi:10.2307/3318541
[9] D. Duffie and L. G. Epstein, "Stochastic differential utility," Econometrica, vol. 60, no. 2, pp. 353-394, 1992. doi:10.2307/2951600
[10] N. El Karoui and M. C. Quenez, "Dynamic programming and pricing of contingent claims in an incomplete market," SIAM Journal on Control and Optimization, vol. 33, no. 1, pp. 29-66, 1995. doi:10.1137/S0363012992232579
[11] L. Lin, F. Li, L. Zhu, and W. K. Härdle, "Mean volatility regressions," SFB 649 Economic Risk, Berlin, Germany, 2010.
[12] M. C. Quenez, Méthodes de contrôle stochastique en finance [Thèse de doctorat], Université Pierre et Marie Curie, 1993.
[13] Ł. Delong, Backward Stochastic Differential Equations With Jumps and Their Actuarial and Financial Applications, EAA, Springer, New York, NY, USA, 2013. doi:10.1007/978-1-4471-5331-3
[14] Y. Su and L. Lin, "Semi-parametric estimation for forward-backward stochastic differential equations," Communications in Statistics, vol. 38, no. 11, pp. 1759-1775, 2009. doi:10.1080/03610920802531330
[15] X. Chen and L. Lin, "Nonparametric estimation for FBSDEs models with applications in finance," Communications in Statistics, vol. 39, no. 14, pp. 2492-2514, 2010. doi:10.1080/03610920903046816
[16] R. Stanton, "A nonparametric model of term structure dynamics and the market price of interest rate risk," The Journal of Finance, vol. 52, no. 5, pp. 1973-2002, 1997. doi:10.1111/j.1540-6261.1997.tb02748.x
[17] B. Øksendal, Stochastic Differential Equations, Universitext, Springer, New York, NY, USA, 6th edition, 2003.
[18] M. Rosenblatt, "A central limit theorem and a strong mixing condition," Proceedings of the National Academy of Sciences of the United States of America, vol. 42, no. 1, pp. 43-47, 1956.
[19] M. Rosenblatt, "Density estimates and Markov sequences," in Selected Works of Murray Rosenblatt, p. 240, Springer, New York, NY, USA, 2011.
[20] A. N. Kolmogorov and Y. A. Rozanov, "On strong mixing conditions for stationary Gaussian processes," Theory of Probability & Its Applications, vol. 5, no. 2, pp. 204-208, 1960.
[21] R. C. Bradley and W. Bryc, "Multilinear forms and measures of dependence between random variables," Journal of Multivariate Analysis, vol. 16, no. 3, pp. 335-367, 1985. doi:10.1016/0047-259X(85)90025-9
[22] Z. Lin and C. Lu, Limit Theory for Mixing Dependent Random Variables, vol. 378 of Mathematics and its Applications, Springer, New York, NY, USA, 1996.
[23] C. N. Morris, "Natural exponential families with quadratic variance functions," The Annals of Statistics, vol. 10, no. 1, pp. 65-80, 1982.
[24] G. M. Constantinides, "A theory of the nominal term structure of interest rates," The Review of Financial Studies, vol. 5, no. 4, pp. 531-552, 1992.
[25] J. Fan, "A selective overview of nonparametric methods in financial econometrics," Statistical Science, vol. 20, no. 4, pp. 317-357, 2005. doi:10.1214/088342305000000412
[26] J. Fan and C. A. Zhang, "A reexamination of diffusion estimators with applications to financial model validation," Journal of the American Statistical Association, vol. 98, no. 461, pp. 118-134, 2003. doi:10.1198/016214503388619157
[27] K. C. Chan, G. A. Karolyi, F. A. Longstaff, and A. B. Sanders, "An empirical comparison of alternative models of the short-term interest rate," The Journal of Finance, vol. 47, no. 3, pp. 1209-1227, 1992. doi:10.1111/j.1540-6261.1992.tb04011.x
[28] Y. Aït-Sahalia, "Testing continuous-time models of the spot interest rate," The Review of Financial Studies, vol. 9, no. 2, pp. 385-426, 1996. doi:10.1093/rfs/9.2.385
[29] N. El Karoui, S. Peng, and M. C. Quenez, "Backward stochastic differential equations in finance," Mathematical Finance, vol. 7, no. 1, pp. 1-71, 1997. doi:10.1111/1467-9965.00022
[30] D. N. Politis and J. P. Romano, "A general resampling scheme for triangular arrays of α-mixing random variables with application to the problem of spectral density estimation," The Annals of Statistics, vol. 20, no. 4, pp. 1985-2007, 1992. doi:10.1214/aos/1176348899
[31] X. Cui, W. Guo, L. Lin, and L. Zhu, "Covariate-adjusted nonlinear regression," The Annals of Statistics, vol. 37, no. 4, pp. 1839-1870, 2009. doi:10.1214/08-AOS627
[32] M. Peligrad, "Properties of uniform consistency of the kernel estimators of density and of regression functions under dependence assumptions," Stochastics and Stochastics Reports, vol. 40, no. 3-4, pp. 147-168, 1992. doi:10.1080/17442509208833786
[33] T. Y. Kim and D. D. Cox, "Uniform strong consistency of kernel density estimators under dependence," Statistics & Probability Letters, vol. 26, no. 2, pp. 179-185, 1996. doi:10.1016/0167-7152(95)00008-9