A Stochastic Differential Game of Transboundary Pollution under Knightian Uncertainty of Stock Dynamics

Using the robust control framework of Hansen and Sargent (2001), this paper investigates a stochastic differential game of transboundary pollution between two regions under Knightian uncertainty of stock dynamics. The two regions are assumed to play both a noncooperative and a cooperative game, and the worst-case pollution accumulation processes for discrete values of the robustness parameter are characterized. Our objective is to identify both regions' optimal output and emission levels and to analyze the effects of Knightian uncertainty about pollution stock dynamics on the regions' optimizing behavior. We illustrate the results with numerical examples.


Introduction
Since the last century, pollution has become ever more serious as industrialization has accelerated. In particular, many kinds of pollutants can spread over great distances, so that they are not contained within the boundaries of any single region or nation, and "transboundary pollution" has become a problem across regions and even national borders. In each region, those who suffer from pollution wish that the polluters in neighboring regions would either reduce their emissions or compensate for the damage, and there is generally a game between sufferers and polluters over pollution abatement and compensation [1].
In recent years, transboundary pollution problems have attracted increasing interest among academics and policy makers. In a dynamic framework, Breton et al. (2010) build a model to analyze how countries join international environmental agreements (IEAs). With a dynamic control model, Li [2] studies the outcome of a pollution game between two neighboring countries. Yeung [3] and Jørgensen et al. (2010) analyze transboundary pollution problems that involve regional governments and industrial firms simultaneously. Huang et al. [4] develop a model in which industrial firms and their local government play a Stackelberg game while governments can cooperate in transboundary pollution control. Yeung and Petrosyan [1] and Yi et al. [5] develop cooperative stochastic differential game models of transboundary industrial pollution control in which the uncertain dynamics of pollution are taken into account.
In this paper, we extend the models of Yeung and Petrosyan [1] and Yi et al. [5] to a more general setting. In their models, the uncertain stock dynamics are actually a risk: the probabilistic structure of the evolving pollution stock can be fully captured by a single Bayesian prior. In our model, by contrast, the "uncertainty" of the evolving stock is understood in a broad sense, as an inability to posit a precise probabilistic structure for the stock dynamics. This stems from the concept of uncertainty introduced by Knight [6] to represent a situation in which a decision-maker lacks adequate information to assign probabilities to events. Knight [6] contends that this deeper kind of uncertainty is quite common in economic decision-making and thus deserves systematic study. Following Knight [6], Gilboa and Schmeidler [7] establish an axiomatic foundation of maxmin expected utility. They argue that, when the underlying uncertainty of an economic system is not well understood, it is sensible, and axiomatically compelling, to optimize the worst-case outcome. Klibanoff et al. (2009) build a "smooth ambiguity" model in which different degrees of aversion to uncertainty are explicitly parameterized in agents' preferences. Hansen and Sargent [8] and Hansen et al. [9] extend Gilboa and Schmeidler's insight to continuous-time dynamic optimization problems, introducing the concept of robust control to economic environments. They show how standard dynamic programming techniques can be modified to yield robust solutions to problems in which the underlying stochastic nature of the model is not perfectly known. Using the framework of Hansen and Sargent [8], Weitzman (2009) examines the effects of global warming, and Athanassoglou and Xepapadeas [10] investigate pollution control with uncertain stock dynamics. In this paper, the framework of Hansen and Sargent [8] is applied to analyze the noncooperative and cooperative optimal emission levels of two neighboring regions and to characterize and compare the worst-case pollution accumulation processes for discrete values of the robustness parameter. We also consider the allocation of the gains from cooperation.
The paper is organized as follows. Section 2 presents the basic model. Section 3 characterizes the noncooperative outcomes. Section 4 analyzes cooperative arrangements and individual rationality. Section 5 illustrates the results with a numerical example. Section 6 concludes.

The Basic Model
Consider a multinational economy comprising two regions. Following Saltari et al. [11], the output of region i (i = 1, 2) at time t ∈ [0, +∞) is a function of its emissions, where the technology of region i is given and is a concave function.
Utility is defined over output net of pollution damage, where the damage is a function of the pollution stock. Region i's utility parameter and its damage parameter of the pollution stock are both positive. The pollution stock evolves according to a linear differential equation in which the environment's self-cleaning capacity lies strictly between 0 and 1.
Risk is introduced into the standard model so that the stock of the pollutant accumulates according to the diffusion process (5), which is driven by a Brownian motion on an underlying probability space and scaled by the instantaneous standard deviation. In a world without Knightian uncertainty, region i's objective is to maximize welfare as in problem (6), where the discount rate is positive and assumed to be the same for the two regions. Optimization problem (6) is referred to as the benchmark model. If one does not worry about the effects of model misspecification, solving the benchmark problem (6) is sufficient. Under Knightian uncertainty, however, the probabilistic structure implied by the stochastic differential equation (5) is distorted, and the reference probability measure is replaced by another one. The perturbed model is obtained by performing a change of measure and replacing the Brownian motion in (5) by an expression that combines a Brownian motion under the new measure with a measurable drift distortion {V(t) : t ≥ 0}. Thus, changes to the distribution of the original Brownian motion are parameterized as drift distortions to a fixed Brownian motion. The measurable process V(t) could correspond to any number of misspecified or omitted dynamic effects, such as (i) a miscalculation of exogenous sources of emissions, (ii) a miscalculation of the natural pollution decay rate, and (iii) ignorance of a more complex dynamic structure involving irreversibility, feedback, or hysteresis effects. The distortion vanishes when V ≡ 0, in which case the two measures coincide. Pollution dynamics under model misspecification follow by applying this replacement to (5). The discrepancy between the two measures is measured by their relative entropy, which is bounded by an entropy constraint. The decision-maker can control the degree of model misspecification by adjusting this constraint; region i's robust control problem over continuous time t ∈ [0, ∞) is then given by (10), where the robustness parameter is the Lagrangian multiplier associated with the entropy constraint. The multiplier is chosen from an interval that is unbounded above and bounded below by a breakdown point, beyond which it is fruitless to seek more robustness. When the multiplier tends to infinity or, equivalently, the entropy bound equals 0, there are no concerns about model misspecification.
The Bellman-Isaacs condition of the robust control problem (10) is given by (11). With this condition in hand, we next apply robust control methods to investigate the noncooperative and cooperative strategies of the two regions.

Problem Solution.
To obtain the noncooperative strategies of the two regions in the transboundary industrial pollution game, we first minimize the Bellman-Isaacs condition (11) with respect to the drift distortion V_i, which yields (12). Substituting (12) into (11) gives the differential equations (13a) and (13b). These are the HJB equations of the benchmark problem with an additional negative term. As the robustness parameter tends to infinity, this negative term tends to zero, and the robust control problems reduce to the benchmark control problems.
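The role of the inner minimization can be illustrated with a minimal sketch. It assumes the standard Hansen-Sargent form of the penalized term, `sigma * v * value_slope + 0.5 * theta * v**2`, where `theta` stands for the robustness parameter, `sigma` for the instantaneous standard deviation, and `value_slope` for the derivative of a region's value function with respect to the pollution stock. These names, the numerical values, and the exact form of the penalty are assumptions made for illustration; the expressions in (11) and (12) may differ by constants.

```python
def worst_case_distortion(theta, sigma, value_slope):
    """Minimizer over v of sigma*v*value_slope + 0.5*theta*v**2 (assumed form)."""
    return -sigma * value_slope / theta


def robustness_term(theta, sigma, value_slope):
    """Value of the assumed penalized term at the minimizing distortion.

    Algebraically this equals -(sigma*value_slope)**2 / (2*theta), so it is
    nonpositive and shrinks to zero as theta grows.
    """
    v = worst_case_distortion(theta, sigma, value_slope)
    return sigma * v * value_slope + 0.5 * theta * v ** 2


if __name__ == "__main__":
    sigma, value_slope = 0.5, -2.0  # hypothetical values, not taken from Table 1
    for theta in (0.9, 1.9, 2.9, 3.9, 1e6):
        v_star = worst_case_distortion(theta, sigma, value_slope)
        term = robustness_term(theta, sigma, value_slope)
        print(f"theta={theta}: v*={v_star:.6f}, extra term={term:.6f}")
```

Under these assumptions, a negative value slope produces a positive worst-case distortion, and the extra term vanishes for large `theta`, which is the sense in which the robust problem collapses to the benchmark problem.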
Next, let us use Proposition 1 to investigate the noncooperative strategy and identify the game equilibrium results with subscript "".
Proof. Taking the first-order partial derivatives of (13a) and (13b) with respect to the emissions of Region 1 and Region 2, respectively, and setting them equal to zero, we obtain (15a) and (15b). Taking the second-order partial derivatives of (13a) and (13b) with respect to the same variables gives (15c) and (15d). From (15a) and (15c), any small deviation from the candidate emission level lowers Region 1's value, so this emission level is optimal for Region 1. Similarly, from (15b) and (15d), Region 2 maximizes its value by choosing its candidate emission level. Furthermore, (13a) and (13b) imply that if a game equilibrium exists, it must satisfy both regions' response functions, that is, the response of Region 1 to Region 2 and of Region 2 to Region 1 when both regions play noncooperatively. In other words, the equilibrium must be a Nash equilibrium.
Carrying out the maximization in (17a) and (17b), respectively, we obtain (18a)-(18d). Substituting (18a)-(18d) into (16a) and (16b) gives the values of the undetermined parameters in (19a)-(19f); one of them, which is nonpositive, is determined by the implicit function (19d). Inspecting (19a)-(19f), we find that the value function is well-defined when the robustness parameter exceeds a threshold and diverges at the threshold. Hence this threshold is the breakdown point, and from now on we consider only values of the robustness parameter above it.
This ends the proof.
Investigating the relationship between the value functions and optimal emissions of both regions, we obtain Lemma 2.
Lemma 2. The value functions of both regions are decreasing in the pollutant stock; that is, the derivative of region i's value function with respect to the pollution stock is negative for i = 1, 2.
Proof. The result follows directly from (18a) and (18c). This ends the proof.

Characterizing the Worst-Case Pollution Accumulation Process.
Substituting (18a) and (18c) into (12), we get (21a) and (21b). Substituting (21a) and (21b) into (9), the pollution dynamics under model misspecification take the form (22). Model misspecification has two adverse effects in (22). The first is an additional positive constant drift term, which suggests the presence of exogenous sources of pollution beyond those responsible for the preindustrial pollution stock. The second is a negative term, which says that the environment's self-cleaning capacity has been reduced by a corresponding amount.
Region i (i = 1, 2) reacts to this worst-case scenario by adopting the emission strategy given by Proposition 1. Therefore, at optimality, the worst-case pollution process is described by the stochastic differential equation (23). Substituting (14a), (14b), (21a), and (21b) into (23) reduces it to a simpler form whose two coefficient functions are given explicitly. Next, Proposition 3 characterizes the solution of the stochastic differential equation (23).

Proposition 3.
(i) The worst-case pollution process has an expectation and a variance available in closed form.
(ii) The worst-case pollution process has a stationary distribution.
From Proposition 3, the expected value and variance of the worst-case pollution level are decreasing in the robustness parameter, and their limits as the robustness parameter tends to infinity can be obtained explicitly. Using Proposition 3, the entropy of the worst-case model misspecification can also be computed.
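The kind of statement made in Proposition 3 can be illustrated with a generic linear stochastic differential equation. The sketch below assumes the worst-case process has the mean-reverting form dP = (a - b P) dt + sigma dB with b > 0; the coefficients `a`, `b`, `sigma`, and the initial stock `p0` are hypothetical placeholders, not the coefficient functions of the paper. For this assumed form, the mean is a/b + (p0 - a/b) exp(-b t), the variance is sigma^2 (1 - exp(-2 b t)) / (2 b), and the stationary distribution is normal with mean a/b and variance sigma^2 / (2 b). An Euler-Maruyama simulation is included as a rough check.

```python
import math
import random


def linear_sde_moments(a, b, sigma, p0, t):
    """Mean and variance at time t of dP = (a - b*P) dt + sigma dB, P(0) = p0."""
    mean = a / b + (p0 - a / b) * math.exp(-b * t)
    var = sigma ** 2 / (2.0 * b) * (1.0 - math.exp(-2.0 * b * t))
    return mean, var


def simulate_terminal_values(a, b, sigma, p0, t, n_steps=250, n_paths=1000, seed=0):
    """Euler-Maruyama draws of P(t) for the same linear SDE."""
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    values = []
    for _ in range(n_paths):
        p = p0
        for _ in range(n_steps):
            # Drift step plus a Gaussian increment with variance sigma**2 * dt.
            p += (a - b * p) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        values.append(p)
    return values


if __name__ == "__main__":
    a, b, sigma, p0, t = 1.0, 1.0, 0.5, 3.0, 5.0  # hypothetical coefficients
    mean, var = linear_sde_moments(a, b, sigma, p0, t)
    draws = simulate_terminal_values(a, b, sigma, p0, t)
    sample_mean = sum(draws) / len(draws)
    sample_var = sum((x - sample_mean) ** 2 for x in draws) / (len(draws) - 1)
    print(f"analytic  mean={mean:.4f} var={var:.4f}")
    print(f"simulated mean={sample_mean:.4f} var={sample_var:.4f}")
```

In this assumed form, a larger constant drift `a` or a smaller decay rate `b` raises both the long-run mean and the long-run variance, which is the channel through which a drift distortion would worsen the pollution path.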

Cooperative Arrangements
Now consider the case in which both regions cooperate in pollution control. To uphold the cooperative scheme, both group rationality and individual rationality must be satisfied at all times.

Group Optimality and Cooperative State Trajectory.
To secure group optimality, the two participating regions seek to maximize their joint expected payoff by solving the stochastic control problem (32). The Bellman-Isaacs condition of the robust control problem (32) is given by (33). Minimizing the Bellman-Isaacs condition (33) with respect to V_1 and V_2, respectively, yields (34). Substituting (34) into (33) gives (35). Next, Proposition 4 characterizes the cooperative strategies of the two regions; the cooperative equilibrium results are marked with their own subscript.

Proposition 4. If both regions adopt the cooperative strategy, their optimal instantaneous emissions are given by (36a) and (36b), which involve three undetermined coefficient functions.
Proof. Taking the first-order partial derivatives of (35) with respect to the emissions of Region 1 and Region 2 and setting them equal to zero, we get (37a) and (37b). The corresponding second-order partial derivatives confirm that these emission levels maximize the cooperative value; in other words, they are optimal for the partners.
Substituting (37a) and (37b) into (35) gives (38). To solve for the value function in (38), we assume the functional form (39), whose undetermined parameters are given by (40a)-(40c). Inspecting (40a)-(40c), we find that the value function is well-defined when the robustness parameter exceeds a threshold and diverges at the threshold. Hence this threshold is the breakdown point, and from now on we consider only values of the robustness parameter above it.
This ends the proof.
Investigating the relationship between the value functions and optimal emissions of both regions under cooperative strategy, we obtain the following Lemma.

Lemma 5. (i) The difference between the two regions' optimal emissions under the cooperative strategy equals the difference between their utility parameters.
(ii) The cooperative total value function is decreasing in the pollutant stock.

Characterizing the Cooperative Worst-Case Pollution Accumulation Process.
Substituting (41) into (34) yields (43). Substituting (43) into (9), the pollution dynamics under model misspecification take the form (44). Model misspecification has two adverse effects in (44). The first is an additional positive constant drift term, which suggests the presence of exogenous sources of pollution beyond those responsible for the preindustrial pollution stock. The second is a negative term, which shows that the environment's self-cleaning capacity has been reduced by a corresponding amount.
Both regions react to this worst-case scenario by adopting the emission strategies given by (36a) and (36b), respectively. Therefore, at optimality, the worst-case pollution process is described by (45). Substituting (36a), (36b), and (43) into (45) gives the stochastic differential equation (46), whose two coefficient functions are given explicitly. Next, Proposition 6 characterizes the solution of the stochastic differential equation (46).
Proposition 6. The expectation, variance, and stationary distribution of the cooperative worst-case pollution process are available in closed form.
From Proposition 6, the expected value and variance of the worst-case pollution level are decreasing in the robustness parameter, and their limits as the robustness parameter tends to infinity can be obtained explicitly. Using Proposition 6, the entropy of the worst-case model misspecification under the cooperative strategy can also be computed.

Individually Rational and Time-Consistent Imputation and Payment Distribution Mechanism.
An agreed-upon optimality principle must be sought to allocate the cooperative payoff. In a dynamic framework, individual rationality has to be maintained at every instant of time within the cooperative duration [t_0, ∞) along the cooperative trajectory (46). Consider the set of realizable values of the cooperative pollution stock at time t generated by (46). For t ∈ [t_0, ∞), the solution imputation is a vector whose i-th component is the payoff under cooperation to Region i ∈ {1,2} over the period [t, ∞), given the realized state at time t. Individual rationality along the cooperative trajectory requires that each region's imputation be no less than its payoff under noncooperation over the same period. The instantaneous payoff of the cooperative game at time t ∈ [t_0, ∞) is likewise a vector with one component per region. We apply Proposition 7 to investigate the time-consistent imputation.
Proof. The argument proceeds along the cooperative trajectory and leads to expression (57). Expression (57) means that the extension of the solution policy to a situation with a later starting time along the optimal trajectory remains optimal, so the condition in (57) guarantees time consistency of the solution imputations.
Next, we consider time-consistent solutions under a specific optimality principle. Following Yeung and Petrosyan [1], we use "the principle of equality" to build a payment distribution mechanism under which the regions' expected gain from cooperation is shared in proportion to the relative sizes of their expected noncooperative payoffs. In accordance with this principle, the imputation scheme has to satisfy the following.
In the cooperative game, the imputations at the initial time t_0 and at any time t ∈ [t_0, ∞) are given by (64) and (65), respectively, for i ∈ {1,2}. The specific payment imputation in the game can then be obtained by using the distribution mechanism (66) and the value functions (17a), (17b), and (39) evaluated at time t_0. Having obtained the cooperative and noncooperative strategies and a distribution mechanism for the cooperative surplus that satisfies the individual rationality condition, we next use numerical examples to examine the results of the model analysis.
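A proportional sharing rule of this kind can be sketched in a few lines. The sketch assumes one common reading of the rule, in which each region receives the share of the cooperative payoff equal to its share of the total noncooperative payoff. The function names and the numbers are hypothetical, and the sketch further assumes that both noncooperative payoffs are positive; it is not a reproduction of (64)-(66).

```python
def proportional_imputation(v1_noncoop, v2_noncoop, w_coop):
    """Split the cooperative payoff in proportion to noncooperative payoffs."""
    if v1_noncoop <= 0 or v2_noncoop <= 0:
        raise ValueError("this sketch assumes positive noncooperative payoffs")
    total = v1_noncoop + v2_noncoop
    return (v1_noncoop / total * w_coop, v2_noncoop / total * w_coop)


def is_individually_rational(v1_noncoop, v2_noncoop, w_coop):
    """True if neither region is worse off than under noncooperation."""
    xi1, xi2 = proportional_imputation(v1_noncoop, v2_noncoop, w_coop)
    return xi1 >= v1_noncoop and xi2 >= v2_noncoop


if __name__ == "__main__":
    v1, v2, w = 40.0, 60.0, 120.0  # hypothetical payoffs
    print(proportional_imputation(v1, v2, w))
    print(is_individually_rational(v1, v2, w))
```

Under these assumptions, individual rationality holds exactly when the cooperative payoff is at least the sum of the noncooperative payoffs, because each region's imputation is its noncooperative payoff scaled by the same factor.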

Numerical Examples
In this section, we apply the solutions derived in Sections 3 and 4 to a numerical exercise that provides some context for the theoretical results. The parameters used in the numerical examples are presented in Table 1; in particular, the robust control parameter is set to 0.9, 1.9, 2.9, and 3.9 in turn. In theory, the greater the value of the robust control parameter, the more cautious the region. We use version 7.0 of the Wolfram Mathematica Matlab to obtain the numerical solutions.
First, Figures 1 and 2 show the effects of the robust control parameter on V*_1 and V*_2. Both V*_1 and V*_2 are convex and decreasing in time, and the robust control parameter has a significant influence on them. The smaller the value of the parameter, the greater the change in V*_1 and V*_2 over time, which reflects that the more cautious the region, the smaller the worst-case misspecification.
Next, the effects of the robust control parameter on the optimal emission strategies of both regions are shown in Figures 3 and 4.
From Figures 3 and 4, we find that the optimal emissions of both regions are concave and decreasing in time and that the robust control parameter has a significant influence on them. The smaller the value of the parameter, the greater the change in the optimal emissions over time, which reflects that the more cautious the region, the smaller its optimal emissions. Next, Figure 5 shows the optimal time paths of the pollution stock under different values of the robust control parameter.
Figure 5 shows that the optimal pollution stock is concave and decreasing in time and that the robust control parameter has a significant influence on it. The smaller the value of the parameter, the greater the change in the optimal pollution stock over time, which reflects that the more cautious the region, the smaller the optimal pollution stock; that is, a cautious participant tends to reduce pollution emissions. The noncooperative emission strategies and game outcomes have been shown in Figures 3-5. Next, we use Figures 6-8 to carry out a numerical analysis of the cooperative emission strategies and game outcomes.
Figures 6-7 show that, under cooperation, both regions' optimal emission levels are concave and decreasing in time and tend to fall as the robust control parameter increases. Figure 8 shows that the optimal pollution stock is convex and decreasing in time.
From Figures 1-8, one can see the full effects of the robust control parameter on the behavior of Region i, i ∈ {1,2}, under cooperation and noncooperation. To contrast the cooperative and noncooperative strategies, we perform several analyses, shown in Figures 9-12, in which the robust control parameter is fixed at 0.9. Figures 9-10 show that, compared with the noncooperative strategy, both regions' optimal emission levels are lower when they play a cooperative game. Figures 11-12 show that, at any point in time, both regions earn higher net benefits under cooperation than under noncooperation. This means that the individual rationality condition and the group rationality condition are satisfied simultaneously. Therefore, the distribution mechanism for the cooperative surplus given in Section 4 ensures a subgame-consistent solution.

Conclusion
In this paper, we apply the robust control framework of Hansen and Sargent [8] to investigate a stochastic differential game of transboundary pollution between two regions under Knightian uncertainty of stock dynamics. Both regions are presumed to play a Markov Nash equilibrium strategy and a cooperative strategy, and the worst-case pollution accumulation processes for discrete values of the robustness parameter are characterized. The results show the following.
(i) Under Knightian uncertainty of stock dynamics, as the regions become more cautious, the worst-case model misspecification shrinks and the time path of each region's optimal emission level falls significantly.
(ii) Compared with the noncooperative strategy, both regions' optimal emission levels are lower when they play a cooperative game.
(iii) Compared with the noncooperative strategy, both regions earn higher net benefits when they play a cooperative game.
(iv) Under the mechanism in which the expected gain from cooperation is shared in proportion to the relative sizes of the regions' expected noncooperative payoffs, the individual rationality condition and the group rationality condition can be satisfied simultaneously, which ensures a subgame-consistent solution.

Figure 1: The optimal paths of the time evolution of V*_1.

Figure 2: The optimal paths of the time evolution of V*_2.

Figure 11: The optimal paths of the time evolution of  * 1 and N * 1

Figure 12: The optimal paths of the time evolution of  * 2 and N * 2

Table 1: The parameters used in the numerical example.