A BSDE Approach to Stochastic Differential Games with Regime Switching

In this paper, we study a two-player zero-sum stochastic differential game with regime switching in the framework of forward–backward stochastic differential equations on a finite time horizon. By means of backward stochastic differential equation methods, in particular the notion of stochastic backward semigroups, we prove a dynamic programming principle for both the upper and the lower value functions of the game. Based on the dynamic programming principle, the upper and the lower value functions are shown to be the unique viscosity solutions of the associated upper and lower Hamilton–Jacobi–Bellman–Isaacs equations.


Introduction
A differential game concerns multiple players making decisions in a dynamic system, each acting in its own interest while trading off against the other players. Stochastic differential games (SDGs) have been well studied. Recently, Lv [1] studied two-player zero-sum SDGs in a regime switching model with an infinite horizon. Compared with the traditional diffusion model, the regime switching model has two obvious advantages. First, the underlying Markov chain can be used to model discrete events with a large long-term impact on the system. For instance, in financial markets, a finite-state Markov chain readily captures market trends, a dynamic which is difficult to incorporate into a pure diffusion model. Second, when conducting numerical experiments, regime switching models require very limited data input. In recent years, due to their capacity for characterizing all kinds of random events and their tractability, regime switching models have attracted extensive attention [1][2][3]. In this paper, we introduce a new method, different from the one in [1]: we investigate two-player zero-sum SDGs with regime switching on a finite time horizon by using backward stochastic differential equation (BSDE) methods.
Pardoux and Peng [4] first introduced nonlinear BSDEs in 1990. The theory of BSDEs was originally developed by Peng [5] for stochastic control theory, and later Hamadène and Lepeltier [6] and Hamadène et al. [7] brought it to SDGs. Buckdahn and Li [8] studied a recursive SDG problem and interpreted the relationship between the controlled system and the Hamilton–Jacobi–Bellman–Isaacs (HJBI) equation. The theory of BSDEs has been well studied and applied to many fields, such as stochastic control, SDGs, mathematical finance, and partial differential equation theory (see [5][6][7][9][10][11] for details). Readers interested in other topics in game theory are referred to [12][13][14][15].
In this paper, let (Ω, F, P) be a fixed probability space on which a d-dimensional Brownian motion (B_s)_{s∈[0,T]} and a Markov chain (θ_s)_{s∈[0,T]} are defined, where F is the completed Borel σ-algebra over Ω and P is the Wiener measure. Here, we assume that F = (F_s)_{s∈[0,T]} is the filtration generated by B and θ, and (F_s^θ)_{s∈[0,T]} denotes the filtration generated by the Markov chain (θ_s)_{s∈[0,T]}. Assume that B_· and θ_· are independent. The Markov chain (θ_s)_{s∈[0,T]} takes values in a finite state space M = {1, . . . , m} and is observable, and its generator Q = (q_{ik})_{i,k∈M} ∈ R^{m×m} is given by

P(θ_{s+Δ} = k | θ_s = i) = q_{ik} Δ + o(Δ), k ≠ i, Δ ↓ 0,

where q_{ik} is the transition rate from market regime i to k, q_{ii} = −∑_{k≠i} q_{ik} < 0, and q_{ik} ≥ 0 for every (i, k) ∈ M × M with i ≠ k. We will investigate a two-player zero-sum SDG with regime switching in the framework of BSDEs on a finite time horizon. The dynamics of the SDG are described by the following functional stochastic differential equation:

dX_s^{t,x,i;u,v} = b(s, X_s^{t,x,i;u,v}, θ_s^{t,i}, u_s, v_s) ds + σ(s, X_s^{t,x,i;u,v}, θ_s^{t,i}, u_s, v_s) dB_s, s ∈ [t, T], X_t^{t,x,i;u,v} = x, θ_t^{t,i} = i,   (2)

where u = (u_s)_{s∈[t,T]} and v = (v_s)_{s∈[t,T]} are F-adapted processes taking their values in compact metric spaces U and V and are called admissible controls of the two players I and II, respectively. Precise assumptions on the coefficients b and σ are given in the next section.
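To make the role of the generator concrete, here is a minimal simulation sketch, not part of the paper's model: the two-regime generator below is a hypothetical example. A chain with generator Q = (q_{ik}) holds in state i for an exponential time with rate −q_{ii} and then jumps to k ≠ i with probability q_{ik}/(−q_{ii}).

```python
import numpy as np

def simulate_chain(Q, i0, T, rng):
    """Simulate a continuous-time Markov chain with generator Q = (q_ik)
    on [0, T], starting from state i0; returns jump times and states."""
    times, states = [0.0], [i0]
    t, i = 0.0, i0
    while True:
        rate = -Q[i, i]                    # q_ii = -sum_{k != i} q_ik
        if rate <= 0:                      # absorbing state: no more jumps
            break
        t += rng.exponential(1.0 / rate)   # holding time ~ Exp(rate)
        if t >= T:
            break
        probs = Q[i].copy()                # jump to k != i w.p. q_ik / rate
        probs[i] = 0.0
        probs /= rate
        i = int(rng.choice(len(Q), p=probs))
        times.append(t)
        states.append(i)
    return times, states

# hypothetical two-regime generator: nonnegative off-diagonal rates, zero row sums
Q = np.array([[-0.5, 0.5],
              [ 1.0, -1.0]])
rng = np.random.default_rng(0)
times, states = simulate_chain(Q, 0, 10.0, rng)
```

With two states the chain necessarily alternates regimes at each jump; the row-sum-zero property of Q is exactly the condition q_{ii} = −∑_{k≠i} q_{ik} above.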
The cost functional is introduced by the BSDE

Y_s^{t,x,i;u,v} = Φ(X_T^{t,x,i;u,v}, θ_T^{t,i}) + ∫_s^T f(r, X_r^{t,x,i;u,v}, θ_r^{t,i}, Y_r^{t,x,i;u,v}, Z_r^{t,x,i;u,v}, u_r, v_r) dr − ∫_s^T Z_r^{t,x,i;u,v} dB_r, s ∈ [t, T],   (3)

where X^{t,x,i;u,v} and θ^{t,i} are introduced in (2). The above BSDE has a unique solution (Y_s^{t,x,i;u,v}, Z_s^{t,x,i;u,v})_{s∈[t,T]}. For given control processes u ∈ U_{t,T} and v ∈ V_{t,T}, we introduce the associated cost functional

J(t, x, i; u, v) ≔ Y_t^{t,x,i;u,v},   (4)

where Y^{t,x,i;u,v} is defined by BSDE (3). In the game, player I aims to maximize (4) and, contrarily, player II aims to minimize (4). We define the lower and the upper value functions W and U, respectively, by

W(t, x, i) ≔ essinf_{β∈B_{t,T}} esssup_{u∈U_{t,T}} J(t, x, i; u, β(u)),   (5)
U(t, x, i) ≔ esssup_{α∈A_{t,T}} essinf_{v∈V_{t,T}} J(t, x, i; α(v), v).   (6)

Precise definitions of the strategies α and β are given in the next section. In the case W = U we say that the game admits a value. The main objective of this paper is to show that W and U are, respectively, the unique viscosity solutions of the following lower and upper HJBI equations, both systems of m coupled equations:

∂W/∂t(t, x, i) + H⁻(t, x, W(t, x, i), DW(t, x, i), D²W(t, x, i), i) + ∑_{k≠i} q_{ik}[W(t, x, k) − W(t, x, i)] = 0, (t, x, i) ∈ [0, T) × R^n × M, W(T, x, i) = Φ(x, i),   (7)

∂U/∂t(t, x, i) + H⁺(t, x, U(t, x, i), DU(t, x, i), D²U(t, x, i), i) + ∑_{k≠i} q_{ik}[U(t, x, k) − U(t, x, i)] = 0, (t, x, i) ∈ [0, T) × R^n × M, U(T, x, i) = Φ(x, i),   (8)

where the Hamiltonians are defined as

H⁻(t, x, y, p, A, i) = sup_{u∈U} inf_{v∈V} {(1/2) tr(σσᵀ(t, x, i, u, v)A) + p · b(t, x, i, u, v) + f(t, x, i, y, p · σ(t, x, i, u, v), u, v)},
H⁺(t, x, y, p, A, i) = inf_{v∈V} sup_{u∈U} {(1/2) tr(σσᵀ(t, x, i, u, v)A) + p · b(t, x, i, u, v) + f(t, x, i, y, p · σ(t, x, i, u, v), u, v)}.

If Isaacs' condition H⁻ = H⁺ holds, then (7) and (8) coincide, and the uniqueness of the viscosity solution implies W = U, that is, the game admits a value. The paper is organized as follows. In Section 2, we introduce some notation and preliminaries needed in what follows. In Section 3, we establish the dynamic programming principle. In Section 4, based on the dynamic programming principle, we prove that the upper and the lower value functions are the unique viscosity solutions of the associated upper and lower HJBI equations.
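Before proceeding, the forward dynamics (2) can be illustrated numerically. The sketch below is a plain Euler–Maruyama discretization under simplifying assumptions: the coefficients b, σ and the regime path are hypothetical placeholders, and the regime path is simulated beforehand, which is valid here because B and θ are independent.

```python
import numpy as np

def euler_regime_switching(b, sigma, x0, theta, T, n, rng):
    """Euler-Maruyama scheme for dX_s = b(s, X_s, theta(s)) ds
    + sigma(s, X_s, theta(s)) dB_s on [0, T], with a given
    piecewise-constant regime path theta (independent of B)."""
    dt = T / n
    X = np.empty(n + 1)
    X[0] = x0
    for j in range(n):
        s = j * dt
        i = theta(s)
        dB = rng.normal(0.0, np.sqrt(dt))  # Brownian increment over [s, s+dt]
        X[j + 1] = X[j] + b(s, X[j], i) * dt + sigma(s, X[j], i) * dB
    return X

# hypothetical regime-dependent coefficients (placeholders, not from the paper)
b = lambda s, x, i: (-0.5, 0.2)[i] * x       # mean reversion vs. growth regime
sigma = lambda s, x, i: (0.1, 0.3)[i]        # low vs. high volatility regime
theta = lambda s: 0 if s < 0.5 else 1        # a single regime switch at s = 0.5
rng = np.random.default_rng(1)
X = euler_regime_switching(b, sigma, 1.0, theta, 1.0, 200, rng)
```

The switch in θ changes drift and volatility mid-path, which is precisely the modeling feature of regime switching that a pure diffusion cannot reproduce.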

Preliminaries
Let us introduce the following spaces, which will be needed in what follows.

Mathematical Problems in Engineering
For the proof of the above two lemmas, the readers can refer to [11,16].
We now state the assumptions on the coefficients b and σ. The mappings b: [0, T] × R^n × M × U × V ⟶ R^n and σ: [0, T] × R^n × M × U × V ⟶ R^{n×d} satisfy the following conditions:

(A3) (i) For every fixed x ∈ R^n and i ∈ M, b(·, x, i, ·, ·) and σ(·, x, i, ·, ·) are continuous with respect to (t, u, v).
(ii) For any x, x′ ∈ R^n, i ∈ M, u ∈ U, and v ∈ V, there exists a positive constant C such that
|b(t, x, i, u, v) − b(t, x′, i, u, v)| + |σ(t, x, i, u, v) − σ(t, x′, i, u, v)| ≤ C|x − x′|.

From (A3), we get the global linear growth conditions on b and σ, i.e., the existence of some C > 0 such that, for all (t, x, i, u, v),
|b(t, x, i, u, v)| + |σ(t, x, i, u, v)| ≤ C(1 + |x|).

Under the above assumptions, for any u ∈ U and v ∈ V, control system (2) has a unique solution (X_s^{t,x,i;u,v})_{s∈[t,T]}. Suppose that the two functions f: [0, T] × R^n × M × R × R^d × U × V ⟶ R and the terminal cost Φ: R^n × M ⟶ R satisfy the following conditions:

(A4) (i) For every fixed (x, i, y, z) ∈ R^n × M × R × R^d, f(·, x, i, y, z, ·, ·) is continuous with respect to (t, u, v).
(ii) There exists a positive constant C such that, for any t, i, u, v and x, x′ ∈ R^n, y, y′ ∈ R, z, z′ ∈ R^d,
|f(t, x, i, y, z, u, v) − f(t, x′, i, y′, z′, u, v)| + |Φ(x, i) − Φ(x′, i)| ≤ C(|x − x′| + |y − y′| + |z − z′|).

Under the above conditions, (3) has a unique solution (Y_s^{t,x,i;u,v}, Z_s^{t,x,i;u,v})_{s∈[t,T]}, and we have the following estimates.
For the proof of this lemma, the readers can refer to [17]. Now, we introduce the admissible controls and admissible strategies. Let t_1, t_2 ∈ [0, T] be two deterministic times with t_1 < t_2.

Definition 1. An admissible control u (resp., v) on [t_1, t_2] is a process taking values in U (resp., V), progressively measurable with respect to the filtration F = (F_r)_{r∈[t_1,t_2]} generated by B and θ.
The set of all admissible controls for player I (resp., player II) on the time interval [t, T] is denoted by U_{t,T} (resp., V_{t,T}).

Definition 2. A nonanticipative strategy for player I on [t, T] is a mapping α: V_{t,T} ⟶ U_{t,T} such that, for any F-stopping time S: Ω ⟶ [t, T] and any v_1, v_2 ∈ V_{t,T} with v_1 = v_2 on [t, S], it holds that α(v_1) = α(v_2) on [t, S].
In the same way, we define a nonanticipative strategy β: U_{t,T} ⟶ V_{t,T} for player II. The set of all nonanticipative strategies for player I (resp., player II) on [t, T] is denoted by A_{t,T} (resp., B_{t,T}). Now we give some properties of the lower and the upper value functions W and U. The following lemma was established in [8] in a slightly different situation; for its proof, the readers can refer to [8].

Lemma 6. Under the assumptions (A3) and (A4), for all
From (4), (5), and (22), we get the following properties of the lower value function W.

Lemma 7. Under the assumptions (A3) and (A4), for all
The same properties hold true for the function U.

Dynamic Programming Principles
The dynamic programming principle is one of the principal and most commonly used tools for solving optimal control problems. In this section, we present the dynamic programming principle for a two-player zero-sum SDG with regime switching in the framework of BSDEs on a finite time horizon; it will be used in the next section.
We first introduce the backward stochastic semigroup. For a given initial state (t, x, i), a positive number δ ≤ T − t, admissible control processes u ∈ U_{t,t+δ} and v ∈ V_{t,t+δ}, and a real-valued random variable η ∈ L²(Ω, F_{t+δ}, P; R), we define

G_{s,t+δ}^{t,x,i;u,v}[η] ≔ Ỹ_s, s ∈ [t, t + δ],

where (Ỹ_s, Z̃_s)_{s∈[t,t+δ]} is the solution of the following BSDE with terminal time t + δ:

Ỹ_s = η + ∫_s^{t+δ} f(r, X_r^{t,x,i;u,v}, θ_r^{t,i}, Ỹ_r, Z̃_r, u_r, v_r) dr − ∫_s^{t+δ} Z̃_r dB_r, s ∈ [t, t + δ],

and X^{t,x,i;u,v} is the solution of SDE (2). According to the uniqueness of the solution of the BSDE, we observe that, for the solution Y^{t,x,i;u,v} of BSDE (3),

G_{t,T}^{t,x,i;u,v}[Φ(X_T^{t,x,i;u,v}, θ_T^{t,i})] = G_{t,t+δ}^{t,x,i;u,v}[Y_{t+δ}^{t,x,i;u,v}].

We now introduce the dynamic programming principle for the value functions of SDGs with regime switching.
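The semigroup (flow) property behind this construction can be checked numerically in the simplest degenerate case: when the terminal value is deterministic and the driver does not depend on z, the BSDE reduces to a backward ODE with Z ≡ 0. The sketch below, with a hypothetical Lipschitz driver, solves this reduction by backward Euler and verifies that composing the solution operators over [t + δ, T] and then [t, t + δ] matches solving over [t, T] directly.

```python
def backward_euler(f, y_terminal, t0, t1, n):
    """Backward Euler for the degenerate BSDE reduction Y'(s) = -f(s, Y(s)),
    Y(t1) = y_terminal (driver free of z, so Z = 0); returns Y(t0)."""
    dt = (t1 - t0) / n
    y = y_terminal
    for j in range(n, 0, -1):
        s = t0 + (j - 1) * dt
        y_new = y
        for _ in range(50):                # fixed-point iteration solves the
            y_new = y + dt * f(s, y_new)   # implicit step (dt * Lip(f) << 1)
        y = y_new
    return y

f = lambda s, y: 0.3 * y + 1.0             # hypothetical Lipschitz driver
# flow property: solve on [0.5, 1], feed the result in as the terminal
# value on [0, 0.5], and compare with solving on [0, 1] in one pass
y_mid = backward_euler(f, 2.0, 0.5, 1.0, 400)
y_composed = backward_euler(f, y_mid, 0.0, 0.5, 400)
y_direct = backward_euler(f, 2.0, 0.0, 1.0, 800)
```

With matching step sizes the two computations agree, which is the deterministic shadow of the semigroup identity for G above.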

Proposition 1. Under the assumptions (A3) and (A4), the following dynamic programming principle holds: for all (t, x, i) ∈ [0, T] × R^n × M and 0 < δ ≤ T − t,

W(t, x, i) = W_δ(t, x, i) ≔ essinf_{β∈B_{t,t+δ}} esssup_{u∈U_{t,t+δ}} G_{t,t+δ}^{t,x,i;u,β(u)}[W(t + δ, X_{t+δ}^{t,x,i;u,β(u)}, θ_{t+δ}^{t,i})].
Proof. We prove that W_δ(t, x, i) coincides with W(t, x, i) in the following steps.
Step 1. Let β ∈ B_{t,T} be arbitrarily fixed. Then, given u_2 ∈ U_{t+δ,T}, we define as follows the restriction β_1 of β to U_{t,t+δ}:

β_1(u_1) ≔ β(u_1 ⊕ u_2)|_{[t,t+δ]}, u_1 ∈ U_{t,t+δ},

where u_1 ⊕ u_2 ≔ u_1 χ_{[t,t+δ]} + u_2 χ_{(t+δ,T]} extends u_1 to an element of U_{t,T}. Obviously, β_1 ∈ B_{t,t+δ}, and from the nonanticipative property of β we deduce that β_1 is independent of the special choice of u_2 ∈ U_{t+δ,T}. Thus, we can work from the definition of W_δ(t, x, i) with the corresponding notation. Let ε > 0, choose controls u_1^l ∈ U_{t,t+δ} and an (Ω, F_t)-partition (Γ_l)_{l≥1}, and set u_1^ε ≔ ∑_{l≥1} χ_{Γ_l} u_1^l ∈ U_{t,t+δ}; from the nonanticipativity of β_1, we have β_1(u_1^ε) = ∑_{l≥1} χ_{Γ_l} β_1(u_1^l). By the existence and uniqueness of solutions of the BSDEs, the corresponding estimate follows for β_1 ∈ B_{t,t+δ}.

We now focus on the interval [t + δ, T]. Because β_1(·) ≔ β(· ⊕ u_2) ∈ B_{t,t+δ} does not depend on u_2 ∈ U_{t+δ,T}, we can define β_2(u_2) ≔ β(u_1^ε ⊕ u_2)|_{[t+δ,T]} for any u_2 ∈ U_{t+δ,T}. From β ∈ B_{t,T}, we know that β_2: U_{t+δ,T} ⟶ V_{t+δ,T} belongs to B_{t+δ,T}. Thus, the definition of W(t + δ, y, j) applies for any (y, j) ∈ R^n × M. From Lemmas 5 and 7, there exists a constant C ∈ R such that the corresponding Lipschitz estimate holds for any u_2 ∈ U_{t+δ,T} and y, y′ ∈ R^n; this can be shown by approximating X. To estimate the right-hand side of the latter inequality, we note that there exists some sequence (u_2^k)_{k≥1} in U_{t+δ,T}. Let ε > 0 and choose (△_k)_{k≥1}, which forms an (Ω, F_{t+δ})-partition; moreover, u_2^ε ≔ ∑_{k≥1} χ_{△_k} u_2^k ∈ U_{t+δ,T}. Therefore, from the nonanticipativity of β_2, we have β_2(u_2^ε) = ∑_{k≥1} χ_{△_k} β_2(u_2^k), and from the definitions of β_1 and β_2, we know that β(u_1^ε ⊕ u_2^ε) = β_1(u_1^ε) ⊕ β_2(u_2^ε). According to the existence and uniqueness of our BSDE, the two estimates can be concatenated, which yields an estimate in terms of u^ε ≔ u_1^ε ⊕ u_2^ε ∈ U_{t,T}. From (35) and (41), and since β ∈ B_{t,T} has been arbitrarily chosen, (42) holds for all β ∈ B_{t,T}. Thus, the first of the two desired inequalities follows.

Step 2. We now deal with the other case.

Viscosity Solution of Isaacs' Equation: Existence and Uniqueness Theorem
In this section, based on the dynamic programming principle, we prove that the lower value function W(t, x, i) introduced by (5) is the viscosity solution of (7), while the upper value function U(t, x, i) defined by (6) is the viscosity solution of (8). Moreover, if Isaacs' condition holds, i.e., H⁻ = H⁺, then (7) and (8) coincide, and the uniqueness of the viscosity solution implies that W = U, that is, the game admits a value. We first recall the definition of a viscosity solution of (7); the one for (8) can be defined in a similar way.

Definition 3. A continuous function W ∈ (C([0, T] × R^n))^m is said to be a viscosity subsolution (resp., supersolution) of (7) if W(T, x, i) ≤ Φ(x, i) (resp., W(T, x, i) ≥ Φ(x, i)) for all (x, i) ∈ R^n × M, and if, for all functions φ ∈ C³_{l,b}([0, T] × R^n) and (t, x, i) ∈ [0, T) × R^n × M such that W(·, ·, i) − φ(·, ·) attains a local maximum (resp., minimum) value zero at (t, x), the corresponding differential inequality holds.

W is called a viscosity solution of (7) if it is both a viscosity subsolution and a viscosity supersolution.
Here, C³_{l,b}([0, T] × R^n) denotes the set of real-valued functions that are continuously differentiable up to the third order and whose derivatives of orders 1 to 3 are bounded.
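For orientation, the sub- and supersolution inequalities in Definition 3 take the following standard form for coupled systems of this type (a hedged reconstruction: the displayed inequalities did not survive extraction, and the exact form of the coupling term is an assumption). At a local maximum point (t, x) of W(·, ·, i) − φ, the subsolution property requires

```latex
\frac{\partial \varphi}{\partial t}(t,x)
+ H^{-}\!\bigl(t, x, W(t,x,i), D\varphi(t,x), D^{2}\varphi(t,x), i\bigr)
+ \sum_{k \neq i} q_{ik}\,\bigl[\,W(t,x,k) - W(t,x,i)\,\bigr] \;\ge\; 0,
```

with the reverse inequality (≤ 0) at a local minimum point for the supersolution property. Note that the coupling term keeps W(t, x, k) for k ≠ i, since the test function replaces W only in the i-th component.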
In the following, we prove that the lower value function W is a viscosity solution of (7). We focus on W only; the corresponding results for the upper value function U follow by a similar procedure.

Theorem 1. Under the assumptions (A3) and (A4), the lower value function W(t, x, i) is a viscosity solution of (7).
First we prove some auxiliary lemmas. To abbreviate notation, for an arbitrarily chosen but fixed φ ∈ C³_{l,b}([0, T] × R^n), (t, x, i) ∈ [0, T) × R^n × M, and δ > 0 with t + δ ≤ T and δ ∑_{k≠i} q_{ik} < 1, we consider the following BSDE defined on the interval [t, t + δ], where the process X_s^{t,x,i;u,v} has been introduced by (2), (t, x, i) ∈ [0, T] × R^n × M is regarded as the initial state, X_s is the value of X at time s, and u ∈ U_{t,t+δ}, v ∈ V_{t,t+δ}.
We can characterize the solution process Y 1,u,v as follows.

Lemma 8. For any s ∈ [t, t + δ], with φ(·, ·, k) = W(·, ·, k) for k ≠ i, we have the following relationship:

Y_s^{1,u,v} = G_{s,t+δ}^{t,x,i;u,v}[φ(t + δ, X_{t+δ}^{t,x,i;u,v}, θ_{t+δ}^{t,i})].

Proof. G_{s,t+δ}^{t,x,i;u,v}[φ(t + δ, X_{t+δ}^{t,x,i;u,v}, θ_{t+δ}^{t,i})] is defined with the help of the solution of the corresponding BSDE. Thus, we only need to prove the stated identity, which follows by applying Dynkin's formula to φ(s, X_s^{t,x,i;u,v}, θ_s^{t,i}); for s = t + δ, the identity is immediate. Therefore, for any s ∈ [t, t + δ], we get the desired result.

□
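For reference, the version of Dynkin's formula applied in the proof above reads as follows for a smooth φ and the regime-switching diffusion (a standard statement, written here under the assumption that the generator of X given (u, v) has the usual second-order form):

```latex
\varphi(s, X_s, \theta_s) = \varphi(t, x, i)
+ \int_t^s \Bigl( \partial_r \varphi
+ \mathcal{L}^{u_r, v_r} \varphi
+ \sum_{k \neq \theta_r} q_{\theta_r k}
  \bigl[ \varphi(r, X_r, k) - \varphi(r, X_r, \theta_r) \bigr] \Bigr)\, dr + M_s,
\qquad
\mathcal{L}^{u,v} \varphi(r, x, i)
= \tfrac{1}{2} \operatorname{tr}\!\bigl( \sigma\sigma^{\top}(r, x, i, u, v)\, D^{2}\varphi \bigr)
+ b(r, x, i, u, v) \cdot D\varphi,
```

where M is a martingale with M_t = 0. The sum over k ≠ θ_r is the contribution of the Markov chain's generator Q and produces the coupling terms in the HJBI systems.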
We consider the following simple BSDE, in which X_s^{t,x,i;u,v} is replaced by its deterministic initial value x. Then, we have the following lemma.

Lemma 9. There is a constant C > 0, independent of the control processes u, v and of δ > 0, such that, for every u ∈ U_{t,t+δ} and v ∈ V_{t,t+δ}, the corresponding estimate holds.

Proof. From Lemma 4, we have the existence of some constant C > 0 such that the corresponding bound holds. From (56) and (62), using Lemma 2, we set g accordingly. It is easy to see that g is Lipschitz with respect to (y, z) and that |φ_2(s)| is controlled by ρ_0, where ρ_0(r) = (1 + |x|²)(r + r³), r ≥ 0. Therefore, the claimed estimate follows. The proof is complete. □

Lemma 10. Let Y_0(·) be the solution of the following ordinary differential equation, where the coefficients are given below. Then, P-a.s., the stated comparison holds.

Proof. We first introduce the function F_1(s, x, y, z, i, u), where (s, x, y, z, i, u) ∈ [t, t + δ] × R^n × R × R^d × M × U, and we consider the following equation, for u ∈ U_{t,t+δ}. Since F_1(s, x, y, z, i, u_s) is Lipschitz in (y, z), for every u ∈ U_{t,t+δ} there exists a unique solution (Y^{3,u}, Z^{3,u}) to (74).
Then, for every u ∈ U_{t,t+δ}, the corresponding comparison holds; in fact, this follows from Lemma 2 and the definition of F_1. Moreover, there exists a measurable function v⁴ such that F_1(s, x, y, z, i, u) = F(s, x, y, z, i, u, v⁴(s, x, y, z, i, u)) for any s, x, y, z, u. (77) Then, since F_0(s, x, y, z, i) = sup_{u∈U_{t,t+δ}} F_1(s, x, y, z, i, u), by a similar proof, we obtain the analogous statement for F_0. This uses the fact that (70) can be considered as a BSDE with the solution (Y_s, Z_s) = (Y_0(s), 0). So, the proof is complete.
where the constant C is independent of the control processes u, v and of δ > 0.
Proof. Since F(s, x, ·, ·, i, u, v) has linear growth in (y, z), uniformly in (u, v), we get the desired bound from Lemma 2 for some constant C independent of δ and the controls u, v. Moreover, from equation (62), and since 0 ≤ δ ≤ T − t, W ≥ φ, the monotonicity property of G and Lemma 10 imply Y_0(t) ≤ Cδ^{3/2}, P-a.s., where Y_0 is the unique solution of (70). Thus, from the definition of F, we see that W is a viscosity supersolution of (7).

(ii) Now we prove that W is a viscosity subsolution. For fixed i ∈ M, without loss of generality, we suppose that φ(t, x, i) = W(t, x, i). We must prove that

sup_{u∈U} inf_{v∈V} F(t, x, 0, 0, i, u, v) = F_0(t, x, 0, 0, i) ≥ 0.
We suppose that this is not true. Then, there exists some R > 0 such that F_0(t, x, 0, 0, i) ≤ −R, and we can find a measurable function ψ: U ⟶ V such that F(t, x, 0, 0, i, u, ψ(u)) ≤ −(3/4)R for all u ∈ U.