Maximum process problems in optimal control theory

Given a standard Brownian motion $(B_t)_{t\ge 0}$ and the equation of motion $dX_t = v_t\,dt + 2\,dB_t$, we set $S_t = \max_{0\le s\le t} X_s$ and consider the optimal control problem $\sup_v \mathsf{E}(S_\tau - c\tau)$, where $c > 0$ and the supremum is taken over all admissible controls $v$ satisfying $v_t \in [\mu_0,\mu_1]$ for all $t$ up to $\tau = \inf\{t > 0 \mid X_t \notin (\ell_0,\ell_1)\}$, with $\mu_0 < 0 < \mu_1$ and $\ell_0 < 0 < \ell_1$ given and fixed. The following control $v_*$ is proved to be optimal: "pull as hard as possible," that is, $v^*_t = \mu_0$ if $X_t < g_*(S_t)$, and "push as hard as possible," that is, $v^*_t = \mu_1$ if $X_t > g_*(S_t)$, where $s \mapsto g_*(s)$ is a switching curve that is determined explicitly (as the unique solution to a nonlinear differential equation). The solution found demonstrates that problem formulations based on a maximum functional can be successfully included in optimal control theory (calculus of variations) in addition to the classical problem formulations due to Lagrange, Mayer, and Bolza.


Introduction
Stochastic control theory deals with three basic problem formulations inherited from the classical calculus of variations (cf. [4, pages 25-26]). Given the equation of motion
$$dX_t = \mu(X_t,u_t)\,dt + \sigma(X_t,u_t)\,dB_t, \tag{1.1}$$
where $(B_t)_{t\ge 0}$ is standard Brownian motion, consider the optimal control problem
$$\inf_u \mathsf{E}_x\Bigl( \int_0^{\tau_D} L(X_t,u_t)\,dt + M(X_{\tau_D}) \Bigr), \tag{1.2}$$
where the infimum is taken over all admissible controls $u = (u_t)_{t\ge 0}$ applied before the exit time $\tau_D = \inf\{t > 0 \mid X_t \notin C\}$ for some open set $C = D^c$, and the process $(X_t)_{t\ge 0}$ starts at $x$ under $\mathsf{P}_x$. If $M \equiv 0$ and $L \not\equiv 0$, problem (1.2) is said to be Lagrange formulated. If $L \equiv 0$ and $M \not\equiv 0$, problem (1.2) is said to be Mayer formulated. If both $L \not\equiv 0$ and $M \not\equiv 0$, problem (1.2) is said to be Bolza formulated.
The Lagrange formulation goes back to the 18th century, the Mayer formulation originated in the 19th century, and the Bolza formulation [2] was introduced in 1913. We refer to [1, pages 187-189] and the references therein for a historical account of the Lagrange, Mayer, and Bolza problems. Although the three problem formulations are formally known to be equivalent (see, e.g., [1, pages 189-193], [4, pages 25-26]), this fact is rarely proved to be essential when solving a concrete problem.
Setting $Z_t = L(X_t,u_t)$ or $Z_t = M(X_t)$, and focusing upon the sample path $t \mapsto Z_t$ for $t \in [0,\tau_D]$, we see that the three problem formulations measure the performance associated with a control $u$ by means of the following two functionals:
$$\int_0^{\tau_D} Z_t\,dt \qquad\text{and}\qquad Z_{\tau_D},$$
where the first one represents the surface area below (or above) the sample path, and the second one represents the terminal value of the sample path. In addition to these two functionals, elementary geometric considerations suggest that the maximal value of the sample path provides yet another performance measure which, to a certain extent, is more sensitive than the previous two. Clearly, a sample path can have a small integral but still a large maximum, while a large maximum cannot be detected by the terminal value either.
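The geometric point can be made concrete with a toy, purely illustrative computation (not from the paper): a hypothetical path that is flat except for a brief triangular spike has a small integral and a zero terminal value, yet the maximum functional registers the spike.

```python
# Toy illustration: neither the integral (Lagrange-type) nor the terminal
# value (Mayer-type) detects a brief spike, but the maximum does.

def path(t):
    # Hypothetical sample path on [0, 1]: flat at 0 except a triangular
    # spike of height 10 on [0.4, 0.6].
    if 0.4 <= t <= 0.5:
        return 100.0 * (t - 0.4)
    if 0.5 < t <= 0.6:
        return 100.0 * (0.6 - t)
    return 0.0

N = 10_000
ts = [i / N for i in range(N + 1)]
zs = [path(t) for t in ts]

integral = sum(zs) / (N + 1)   # approximates the area functional (about 1.0)
terminal = zs[-1]              # terminal value functional (0.0)
maximum = max(zs)              # maximum functional (about 10.0)

print(integral, terminal, maximum)
```

The spike contributes area only 1.0 to the integral and nothing to the terminal value, while the maximum functional reports its full height.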
The main purpose of the present paper is to show that problem formulations based on a maximum functional can be successfully added to optimal control theory (calculus of variations). This is done by formulating a specific problem of this type (Section 2) and solving it in closed form (Section 3). The result suggests a number of new avenues for further research upon extending the Bolza formulation (1.2) by a maximum functional $K$, where some of the maps $K$, $L$, and $M$ may also be identically zero. Optimal stopping problems for the maximum process were studied by a number of authors in the 1990s (see, e.g., [6, 3, 10, 8, 5]) and the subject seems to be well understood now.
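The extended formulation alluded to above can plausibly be written out as follows. This display is a reconstruction (the original equation did not survive extraction), assuming the extra map $K$ acts on the running maximum of the state:

```latex
% Hedged reconstruction of the extended Bolza formulation with a maximum
% functional K added to the running cost L and terminal cost M:
\inf_u \, \mathsf{E}_x \Bigl(
    K\bigl( \max_{0 \le t \le \tau_D} X_t \bigr)
    + \int_0^{\tau_D} L(X_t, u_t)\, dt
    + M\bigl( X_{\tau_D} \bigr)
\Bigr)
```

where, as stated in the text, some of the maps $K$, $L$, and $M$ may be identically zero.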

Formulation of the problem
Consider a process $X = (X_t)_{t\ge 0}$ solving the stochastic differential equation (s.d.e.)
$$dX_t = v_t\,dt + 2\,dB_t, \tag{2.1}$$
where $B = (B_t)_{t\ge 0}$ is standard Brownian motion, and associate with $X$ the maximum process
$$S_t = s \vee \max_{0\le r\le t} X_r, \tag{2.2}$$
so that $X_0 = x$ and $S_0 = s$ under $\mathsf{P}_{x,s}$, where $x \le s$. Introduce the exit time
$$\tau = \inf\{t > 0 \mid X_t \notin (\ell_0,\ell_1)\}, \tag{2.3}$$
where $\ell_0 < 0 < \ell_1$ are given and fixed, and let $c > 0$ be given and fixed in the sequel.
The optimal control problem to be examined in this paper is formulated as follows:
$$\sup_v \mathsf{E}_{x,s}\bigl( S_\tau - c\tau \bigr), \tag{2.4}$$
where the supremum is taken over all "admissible" controls $v$ satisfying $v_t \in [\mu_0,\mu_1]$ for all $0 \le t \le \tau$, with some $\mu_0 < 0 < \mu_1$ given and fixed. By "admissible" we mean that the s.d.e. (2.1) can be solved in Itô's sense (either strongly or weakly). Since $v_t$ is required to be uniformly bounded, it is well known that a weak solution (unique in law) always exists under a measurability condition (see, e.g., [9, page 155]), where $v_t$ may depend on the entire sample path $r \mapsto X_r$ up to time $t$. Moreover, if $v_t = v(X_t)$ for some (bounded) measurable function $v$, then a strong solution (pathwise unique) also exists (see, e.g., [9, pages 179-180]).
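As an illustration (not part of the paper), the objective in (2.4) can be estimated by Monte Carlo with an Euler scheme for (2.1). The sketch below uses the parameters of Figure 3.1 ($\ell_0 = \mu_0 = -1$, $\ell_1 = \mu_1 = 1$, $c = 2$), reads the diffusion coefficient of (2.1) as 2 as it appears in the extracted text, and, as a crude stand-in for the switching curve, switches at the constant level $x_* = 0$ (the paper notes $x_* = 0$ in this symmetric case); the true optimal control switches along $g_*(S_t)$, not at a constant level.

```python
# Monte Carlo sketch of the objective E(S_tau - c*tau) from (2.4) under a
# bang-bang feedback control, using Euler steps for (2.1).
import random

L0, L1 = -1.0, 1.0     # exit levels l0 < 0 < l1 (values from Figure 3.1)
MU0, MU1 = -1.0, 1.0   # control bounds mu0 < 0 < mu1
C = 2.0                # cost of time, c > 0
SIGMA = 2.0            # diffusion coefficient of dX_t = v_t dt + 2 dB_t
DT = 1e-3              # Euler step size

def run_path(x, s, threshold=0.0, rng=random):
    """Simulate (X, S) until tau, the first exit of X from (L0, L1)."""
    t = 0.0
    while L0 < x < L1:
        v = MU0 if x < threshold else MU1   # bang-bang feedback control
        x += v * DT + SIGMA * rng.gauss(0.0, DT ** 0.5)
        s = max(s, x)                       # maximum process S
        t += DT
    return s, t

def estimate(n=1000, x=0.0, s=0.0, seed=1):
    """Average S_tau - c*tau over n simulated paths started at (x, s)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        S, tau = run_path(x, s, rng=rng)
        total += S - C * tau
    return total / n

print(estimate())
```

The estimate is only a sketch of the performance of one admissible feedback control; it does not compute the value function of (2.4).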
The optimal control problem (2.4) has some interesting interpretations. Equation (2.1) may be viewed as describing the motion of a Brownian particle (subject to a fluctuating force $\sim \dot B_t$) that is under the influence of a (slowly varying) external force $\sim v_t$ (see [7, pages 53-78]). The objective in (2.4) is therefore to determine the external force one needs to exert upon the particle so as to make its maximal height at the time of exit as large as possible, while keeping the time needed for that exit as short as possible. Clearly, the interpretation and objective just described carry over to many other problems where (2.1) plays a role.
It appears intuitively clear that the optimal control should be of the following bang-bang type: at each time either "push" or "pull" as hard as possible so as to reach either $\ell_1$ or $\ell_0$ as soon as possible. The solution of the problem presented in the next section confirms this guess and makes the statement precise in analytic terms. It is also apparent that at each time $t$ we need to keep track of both $X_t$ and $S_t$, so that the problem is inherently two-dimensional.

Solution of the problem
(1) In the setting of the previous section, consider the optimal control problem (2.4). Note that $\tilde X_t = (X_t,S_t)$ is a two-dimensional process with state space $S = \{(x,s) \in \mathbb{R}^2 : x \le s\}$ that changes (increases) in the second coordinate only after hitting the diagonal $x = s$ in $\mathbb{R}^2$. Off the diagonal, the process $\tilde X = (\tilde X_t)_{t\ge 0}$ changes only in the first coordinate and thus may be identified with $X$. Moreover, if $v_t = v(X_t)$ for some (bounded) measurable function $v$ in (2.1), then $\tilde X$ is a Markov process. The latter "feedback" controls are sufficient to be considered under fairly general hypotheses (see, e.g., [4, pages 162-163]), and this fact will also be proved below. The infinitesimal generator of $\tilde X$ may therefore be formally described as follows: off the diagonal it coincides with the infinitesimal generator $L_X$ of $X$ (3.1), and it acts on a space of $C^2$ functions $f$ on $S$ satisfying $\lim_{x\uparrow s} (\partial f/\partial s)(x,s) = 0$ (3.2). The formal description (3.1)+(3.2) appears in [8, pages 1618-1619], where the latter fact is also verified. The condition of normal reflection (3.2) was used for the first time in [3] in the case of Bessel processes (it was also noted in [6, page 1810] in the case of a Bessel process of dimension one).
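In symbols, the formal description just given can be sketched as follows; this is a reconstruction from the surrounding wording (the original displays did not survive extraction):

```latex
% (3.1): off the diagonal the generator of \tilde X = (X, S) coincides with
% that of X (the second coordinate S is constant there):
(L_{\tilde X} f)(x,s) = \bigl( L_X f(\,\cdot\,,s) \bigr)(x)
    \qquad \text{for } x < s ,
% (3.2): the condition of normal reflection at the diagonal:
\lim_{x \uparrow s} \frac{\partial f}{\partial s}(x,s) = 0 .
```

The domain restriction (3.2) is what encodes the fact that $S$ increases only on the diagonal $x = s$.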
(2) Assuming for a moment that the supremum in (2.4) is attained at some feedback control, and making use of the formal description of the infinitesimal generator (3.1)+(3.2), we are naturally led to formulate the HJB system (3.3)-(3.6): the HJB equation (3.3) for $\ell_0 < x < s$, the condition of normal reflection (3.4) at the diagonal, the condition of instantaneous stopping
$$J(x,s)\big|_{x=\ell_0+} = s \quad \text{(instantaneous stopping)} \tag{3.5}$$
for $\ell_0 < s < \ell_1$, and a boundary condition (3.6) at $x = s = \ell_1$, where the infinitesimal generator $L_v$ of $X$ given $v$ is expressed by (3.7). Our main effort in the sequel will be directed to solving the system (3.3)-(3.6) in closed form. More explicitly, the form of the HJB equation (3.3) with $J = J(x,s)$ leads us to expect a bang-bang solution $v_*$ depending on the sign of $J_x$: if $J_x < 0$, then $v^*_t = \mu_0$, and if $J_x > 0$, then $v^*_t = \mu_1$. The equation $J_x = 0$ defines an optimal "switching" curve $s \mapsto g_*(s)$, and the main task in the sequel will be to determine it explicitly.
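Assuming the generator of $X$ given $v$ in (3.7) takes the form $L_v = v\,\partial/\partial x + 2\,\partial^2/\partial x^2$ (the drift from (2.1), with the diffusion coefficient 2 as it appears in the extracted text and to be checked against the original), the HJB equation and its bang-bang reduction can be sketched as:

```latex
% Sketch of the HJB equation (3.3) under the assumed generator,
% for \ell_0 < x < s:
\sup_{v \in [\mu_0,\,\mu_1]}
  \Bigl( v\, \frac{\partial J}{\partial x}(x,s)
       + 2\, \frac{\partial^2 J}{\partial x^2}(x,s) \Bigr) = c .
% The supremum of v J_x over the interval [\mu_0, \mu_1] is attained at an
% endpoint, which yields the bang-bang candidate:
v_*(x,s) =
  \begin{cases}
    \mu_0 & \text{if } J_x(x,s) < 0, \\
    \mu_1 & \text{if } J_x(x,s) > 0.
  \end{cases}
```

The linearity of $v \mapsto v\,J_x$ is what forces the optimizer to an endpoint of $[\mu_0,\mu_1]$, which is the analytic content of the bang-bang principle stated in the text.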
Further heuristic considerations based on the bang-bang principle just stated (when close to $\ell_0$ apply $\mu_0$ so as to exit at $\ell_0$, and when close to $\ell_1$ apply $\mu_1$ so as to exit at $\ell_1$) suggest partitioning $C$ into three subsets $C_1$, $C_2$, and $C_3$ (modulo two curves, $x = g_*(s)$ and a second curve, to be found), where $s_*$ is a unique point in $(\ell_0,\ell_1)$ satisfying (3.11). In addition to (3.11), we also set another point in $(\ell_0,\ell_1)$ that will play a role. We refer to Figure 3.1 to obtain a better geometric understanding of (3.9)-(3.13). [In Figure 3.1 the state space (triangle) of the process $(X_t,S_t)$ from (2.1)+(2.2) splits into three regions. In the region $C_1$, the optimal control $v^*_t$ equals $\mu_0$, and in the region $C_2 \cup C_3$ the optimal control $v^*_t$ equals $\mu_1$. The switching curve $s \mapsto g_*(s)$ is determined as the unique solution of the differential equation (3.26) satisfying $g_*(\ell_1) = x_*$, where $x_* \in (\ell_0,\ell_1)$ is the unique solution of the transcendental equation (3.25). In this specific case, we took $\ell_0 = \mu_0 = -1$, $\ell_1 = \mu_1 = 1$, and $c = 2$. It turns out that $x_* = 0$, $s_* = -0.574108$, and $s_\infty = -0.718666$. (The point $s_\infty$ is a singularity point at which $dg_*/ds = +\infty$ and $g_*$ takes a finite value.)]

(3) We construct a solution to the system (3.3)-(3.6) in three steps. In the first two steps, we treat $C_1 \cup C_2$ together with a boundary curve $s \mapsto g_*(s)$ separating $C_1$ from $C_2$.
Step 1. Consider the HJB equation (3.14) in the region $C_1$ (to be found). The general solution of (3.14) is given by (3.15), where $a_0(s)$ and $b_0(s)$ are undetermined functions of $s$. Using (3.5), we can eliminate $b_0(s)$ from (3.15), and this yields (3.16). Solving $J_x(x,s) = 0$ for $x$ gives $x = g_*(s)$ as a candidate for the switching curve, and shows that $a_0(s)$ can be expressed in terms of $g_*(s)$ as in (3.17). Inserting this back into (3.16) gives (3.18) as a candidate for the value function (2.4) when $(x,s) \in C_1$.
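The general solution referred to in Step 1 can be written out under the same assumption on the generator (drift $\mu_0$ in $C_1$, diffusion coefficient 2 read off (2.1) as extracted); this is a sketch, not the paper's display:

```latex
% In C_1 the control is mu_0, so the HJB equation becomes an ODE in x
% (s enters only as a parameter):
\mu_0\, J_x + 2\, J_{xx} = c .
% Its general solution:
J(x,s) = \frac{c}{\mu_0}\, x + a_0(s)\, e^{-\mu_0 x / 2} + b_0(s) .
% Check:  J_x  = c/\mu_0 - (\mu_0/2)\, a_0(s)\, e^{-\mu_0 x/2}
%         2 J_{xx} = (\mu_0^2/2)\, a_0(s)\, e^{-\mu_0 x/2} ,
% so that \mu_0 J_x + 2 J_{xx} = c.  Setting J_x = 0 then gives
e^{-\mu_0 g_*(s)/2} = \frac{2c}{\mu_0^2\, a_0(s)} ,
\qquad\text{i.e.}\qquad
a_0(s) = \frac{2c}{\mu_0^2}\, e^{\mu_0 g_*(s)/2} ,
```

which expresses $a_0(s)$ in terms of the candidate switching curve $g_*(s)$, as described in the text. The same computation with $\mu_1$ in place of $\mu_0$ gives the general solution used in Step 2.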
Step 2. Consider the HJB equation (3.19) in the region $C_2$ (to be found). The general solution of (3.19) is given by (3.20), where $a_1(s)$ and $b_1(s)$ are undetermined functions of $s$. Solving $J_x(x,s) = 0$ for $x$ gives $x = g_*(s)$ as a candidate for the switching curve, and shows that $a_1(s)$ can be expressed in terms of $g_*(s)$ as in (3.21). Inserting this back into (3.20) gives (3.22).

The two functions (3.18) and (3.22) must coincide at the switching curve, that is, (3.23), giving a closed expression for $b_1(s)$ which, after being inserted back into (3.22), yields (3.24) as a candidate for the value function (2.4) when $(x,s) \in C_2$.
The condition (3.6) with $x = s = \ell_1$ can now be used to determine a unique $x_* \in (\ell_0,\ell_1)$ satisfying (3.13). It follows from the expression (3.24) above that $x_*$ solves (and is uniquely determined by) the transcendental equation (3.25). It may be noted that this equation, and therefore $x_*$ as well, does not depend on $c$. Finally, applying the condition (3.4) to the expression (3.24), we obtain a differential equation (3.26) for the switching curve $s \mapsto g_*(s)$, valid for all $s \in (s_\infty,\ell_1)$, where $s_\infty < \ell_1$ happens to be a singularity point at which $dg_*/ds = +\infty$. Equation (3.26) is solved backwards under the initial condition (3.13), where $x_* \in (\ell_0,\ell_1)$ is found by solving (3.25). The switching curve $s \mapsto g_*(s)$ is the unique solution of (3.26) obtained in this way. It can also be verified that (3.12) holds for some $s_* \in (s_\infty,\ell_1)$.

(4) It turns out that the solution of (3.26) satisfying (3.13) can hit the diagonal in $\mathbb{R}^2$ at a point $s^* \in (\ell_0,\ell_1)$ taken to be closest to $x_*$, if $c \ge c_*$ for some large $c_*$ to be determined below. When this happens, the construction of the solution becomes more complicated, and the solution of (3.26) for $s \in (s_\infty,s^*)$ is of no use. We thus first treat the simpler case when the solution of (3.26) stays below the diagonal (the case of "small" $c$), followed by the more complicated case when the solution of (3.26) hits the diagonal (the case of "large" $c$).
Step 3 (the case of "small" $c$). Having characterized the curve $s \mapsto g_*(s)$ for $s \in [s_*,\ell_1]$, we have obtained a candidate (3.18)+(3.24) for the value function (2.4) when $(x,s) \in C_1 \cup C_2$. It remains to determine $J(x,s)$ for $(x,s) \in C_3$, and this is what we do in the final step.
As clearly the control $\mu_1$ should be applied in $C_3$, consider the HJB equation (3.27) in $C_3$ given and fixed. The general solution of (3.27) is given by (3.28), where $a_2(s)$ and $b_2(s)$ are undetermined functions of $s$. Using the condition (3.5), we can eliminate $b_2(s)$ from (3.28), and this yields (3.29). Applying the condition (3.4) to the expression (3.29), we find that $a_2(s)$ should solve (3.30) for $\ell_0 < s < s_*$. This equation can be solved explicitly, giving (3.31), where $d$ is an undetermined constant. To determine $d$, we may use the fact that the maps (3.24) and (3.29) must coincide at $(s_*,s_*)$, that is, (3.32), the latter being known explicitly due to (3.12). With this $d$ we can then rewrite (3.31) as (3.33). Inserting this expression back into (3.29), we obtain a candidate for the value function (2.4) when $(x,s) \in C_3$, thus completing the construction of a solution to the system (3.3)-(3.6).
Steps 3-5 (the case of "large" $c$). In this case, the solution $s \mapsto g_*(s)$ of (3.26) satisfying (3.13) hits the diagonal in $\mathbb{R}^2$ at some $s^* \in (\ell_0,\ell_1)$. The set $C_1$ from (3.9) naturally splits into three subsets $C_{1,1}$, $C_{1,2}$, and $C_{1,3}$ (modulo two curves), where $s^*$, the map $s \mapsto h_*(s)$, and $s_*$ in this context will soon be defined. Similarly, the set $C_2$ from (3.10) naturally splits into two subsets $C_{2,1}$ and $C_{2,2}$ (modulo one curve). It is clear from the construction above that in $C_{1,1}$ the value function (2.4) is given by (3.18), and that in $C_{2,1}$ the value function (2.4) is given by (3.24), where $s \mapsto g_*(s)$ is the solution of (3.26) satisfying (3.13). For the points $s < s^*$ (close to $s^*$) we can no longer make use of the solution $g_*(s)$ and the expression (3.18), as clearly the condition (3.4) fails to hold. Moreover, we need to apply the control $\mu_0$ instead of $\mu_1$, and thus Step 3 presented above must be modified. The HJB equation (3.27) is considered with $\mu_0$ in place of $\mu_1$, and using (3.5) this again leads to the expression (3.29) with $\mu_0$ in place of $\mu_1$, as well as to (3.30) and its solution (3.31). To determine $d$ in (3.31), we may use the coincidence with $J^{(i,1)}$ for $i$ either 1 or 2, where we set $J^{(1,1)} = J^{(1)}$ and $J^{(2,1)} = J^{(2)}$ with $J^{(1)}$ from (3.18) and $J^{(2)}$ from (3.24). Note that (3.44) is reminiscent of (3.12), and so is $h_*(s)$ of $g_*(s)$. However, the two functions are different for $s < s^*$. It is clear that the value function (2.4) is also given by the formula (3.39) for $(x,s) \in C_{1,3}$. To determine the value function (2.4) in $C_{2,2}$, where clearly the control $\mu_1$ is to be applied, we can use the result of Step 2 above, which leads to an analogue of (3.24). Finally, in $C_3$ we should also apply the control $\mu_1$, and thus the result of Step 3 (the case of "small" $c$) above can be used. Due to (3.44), the expressions (3.29)+(3.33) carry over unchanged to the present case. It must be kept in mind, however, that $s_*$ now satisfies (3.44) and not (3.12). This completes the construction of a solution to the system (3.3)-(3.6).
(5) In this way, we have obtained a candidate for the value function (2.4) when $(x,s) \in C$ in both cases of "small" and "large" $c$. The preceding considerations can now be summarized as follows.