Optimal Control of Stochastic Dynamic Systems of a Random Structure with Poisson Switches and Markov Switching

Institute of Physical Engineering and Computer Science, Yuriy Fedkovych Chernivtsi National University, 14 Rivnenska Street, Chernivtsi 58000, Ukraine; Department of Pedagogy, Psychology and Education Management, Institute of Postgraduate Pedagogical Education of Chernivtsi Region, Chernivtsi 58000, Ukraine; Department of Mathematics and Informatics, Yuriy Fedkovych Chernivtsi National University, 28 Universytetska Street, Chernivtsi 58012, Ukraine


Introduction
Systems with Markov parameters form an important class of models for processes that undergo rapid changes of state; such processes arise, for example, in industry, in queuing systems [1], in ecological systems [2], in economics and finance, and in the modeling of microgrids [3]. These systems are mathematical models of hybrid dynamic systems, in which one part of the variables changes continuously and the other part changes discretely. The most widely used hybrid models are those described by differential equations, and this article focuses on that area of research.
Let us note [4] that such abrupt changes in the system are discrete events and are usually modelled by a Markov chain taking values in a finite set. Practical motivations, as well as many theoretical results for Markovian jump systems, can be found, for example, in [5][6][7].
The following linear system is considered in [4]:

$$\dot{x}(t) = \big(A(\xi(t)) + \Delta A(\xi(t))\big)x(t) + \big(B(\xi(t)) + \Delta B(\xi(t))\big)u(t), \quad t \ge 0.$$

Here, x ∈ R^n is the state vector, ξ is a Markov chain with a finite number of states, A and B are constant matrices defined for each fixed value of ξ, ΔA and ΔB are uncertain matrices satisfying the conditions described below, and u ∈ R^m is the control. For such systems, the problem of synthesizing an optimal control with constant feedback is solved.
In [2], the optimal control problem is solved for the Ito linear stochastic system with uncertainty. Here and in what follows, w(t), t ≥ 0, denotes a Wiener process.
The nonlinear case of the Ito stochastic system is considered in [4], where sufficient optimality conditions are obtained on an infinite horizon for the corresponding system. In [7], optimality conditions were obtained on an infinite horizon for a linear stochastic system with finite aftereffect and Markov parameters.

A great contribution to the development of systems with Markov parameters was made by Katz [8]. In that work, the Markov process can be either a Markov chain or a continuous Markov process. In addition, it is advisable to consider perturbations of the impulse type, that is, the case in which the moments of time at which discontinuities of the phase trajectories of the process may occur are known in advance. For deterministic and difference systems, this situation was studied in detail in [9].
Stochastic systems of random structure in the sense of Katz [8] with impulse Markov perturbations in the sense of Tsarkov were considered in [10][11][12][13]. Stability in various probabilistic senses has been investigated there, and the problem of optimal stabilization, whose solution is a control rendering the system stochastically stable, is solved. In those papers, the absence of perturbation points was assumed, but in [14][15][16], the existence and uniqueness of the solution of a system of differential-difference equations with Markov parameters and switching in the presence of perturbation points were proved. Therefore, the problems of stability and optimal control can be considered for such systems.
Furthermore, among the classical problems of control theory there is the problem of constructing a control that minimizes a given quality functional. This problem is wider than the problem of optimal stabilization because it does not require stabilization of the solution x(t). The monograph [17] is one of the main works presenting the general theory of controllable processes and the theory of controlled stochastic differential equations. There, the classical problems of optimal stopping of random processes are considered, a rigorous derivation of Bellman's equations [18] is presented, and applications of Bellman's equations to the construction of optimal controls are discussed.
Controllable systems with finite aftereffect of deterministic and stochastic type are considered in [19]. The paper [20] is devoted to the synthesis of optimal control for linear stochastic dynamic systems with finite aftereffect and Poisson perturbations. The general form of the Bellman equation and of the Bellman functional for a stochastic dynamic system of random structure with Markov switching, which gives a sufficient condition for the existence of an optimal control, was obtained in [21], and the problem of synthesizing the optimal control for such systems was solved in [22]. The switching times of the Markov processes in [21,22] are assumed to be known; in many cases, however, abrupt changes in the trajectories of the process at random moments are also possible. Such situations are adequately described by including in the equation of motion a term given by an integral with respect to a Poisson measure. Stochastic systems with semi-Markov switching were introduced in [23], where conditions for the weak convergence of semi-Markov random evolutions to a diffusion process are considered and sufficient conditions for the stability of the prelimit processes are discussed. In [15,16], Tsarkov et al. analyzed the influence of Markov perturbations on solutions and applied stochastic differential equations to applied problems.
This paper is devoted to the synthesis of optimal control for stochastic dynamic systems of random structure that are subject to impulse Markov switching at known moments of time, with allowance for Poisson perturbations, which make it possible to describe discontinuities of the trajectories at random moments.

Main Result
Consider the stochastic system of random structure given by the Ito stochastic differential equation with Markov switching and initial conditions [24,25], where the processes w, ν, ξ, and η are independent [24,25] and defined on the probability basis (Ω, F, {F_t, t ≥ 0}, P). The trajectories of the process x(t), t ≥ 0, belong to the Skorokhod space D [17], and the controls are measurable functions satisfying the conditions for the existence and uniqueness of a solution of problems (6)-(8). An example of such existence and uniqueness conditions is the Lipschitz condition together with the linear growth condition [17].
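The displayed system (6)-(8) did not survive extraction in this copy. For orientation only, a system of this class (cf. [21,22,24,25]) can be written in the following form, where the drift a, diffusion b, jump kernel c, and impulse map g are our placeholder names and the exact notation of the original may differ:

$$
\begin{aligned}
dx(t) &= a(t, \xi(t), x(t), u(t))\,dt + b(t, \xi(t), x(t), u(t))\,dw(t) + \int_{\mathbb{R}^m} c(t, \xi(t), x(t-), z)\,\tilde{\nu}(dz, dt), \quad t \ne t_k,\\
x(t_k) - x(t_k-) &= g\big(t_k-, \xi(t_k-), \eta_k, x(t_k-)\big), \quad k \ge 0,\\
x(t_0) &= y, \quad \xi(t_0) = \xi_0, \quad \eta(t_0) = \eta_0 .
\end{aligned}
$$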
We also define the class V of functions on which the weak infinitesimal operator (WIO) L is defined; the functionals v_k(t, x) ∈ V below play the role of Bellman functions. The optimal control problem is to find a control u_k^0, k ≥ 0, from the class U of admissible controls that minimizes the quality functional [19] for some fixed initial data. To obtain sufficient conditions of optimality, several auxiliary properties must be proved first; let us consider the main statements about the properties of the infinitesimal operator.
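The displayed definition of the quality functional is also missing from this copy. In works of this line (cf. [19,21]), it typically has the Bolza form below; W (running cost) and Φ (terminal cost) are our hedged notation, not confirmed by the surviving text:

$$ I_{u}(t, x) = E\left[ \int_{t}^{T} W\big(s, x(s), u(s, x(s))\big)\,ds + \Phi(x(T)) \;\middle|\; x(t) = x \right]. $$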
hold, where Lv_k(t, x) is the WIO defined by (10); the statement of Lemma 2 expresses the corresponding increment of v_k through an integral of the form ∫ W(s, x(s), u(s, x))ds, t ∈ [t_k, T].
Proof. Consider the solution x(t) ∈ R^m of (6)-(8) for t ∈ [t_k, T], constructed according to the initial condition.

Let us integrate (15) with respect to t from t_k to T and take the expectation; we obtain (16)-(18). According to Lemma 1, the first term in (18) exists and equals the increment (12). Substituting this equality into (18), we obtain the statement of Lemma 2.
Then the control u_k^0 is optimal for all t ∈ [t_0, T]. The sequence of functions v_k(t, x) is called the cost, or the Bellman, function, and equation (21) can be written as the Bellman equation.

Proof. The optimal control u_k^0 ∈ U is an admissible control, so there is a solution x(t, t_0, y, h, u_k^0) for which (21) takes the form (26); here, u_k^0 is chosen at the point (t, x(t, t_0, y, h, u_k^0)), t ∈ [t_k, t_{k+1}). Integrating (26) from t to T, taking the expectation, and using the boundary condition (22), we obtain (27). Now, let u_k = u_k(t) be any other admissible control from U. Then, by condition (3), inequality (28) holds. Integrating (28) with respect to τ ∈ [t_k, T], taking the expectation E with fixed τ and initial value x, and taking into account Lemmas 1 and 2 and boundary condition (22), we obtain that v_k(t, x(t, t_0, y, h, u^0)) does not exceed the value of the quality functional for any other admissible control. This is precisely the definition of the optimal control u_k^0(t, x) in the sense of minimization of the functional I_u(t, x). The theorem is proved.
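For orientation, the Bellman equation (21) with boundary condition (22) referred to in Theorem 1 typically takes the following form in this setting; this is a sketch in the hedged notation introduced above (W the running cost, Φ the terminal cost), not a reproduction of the paper's displayed equations:

$$ \min_{u \in U}\big[\, L v_k(t, x) + W(t, x, u)\,\big] = 0, \quad t \in [t_k, t_{k+1}), \qquad v_k(T, x) = \Phi(x). $$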

General Solution of the Problem of Optimal Control.
According to [10,11,26], the WIO (10) has the form (30), i, j = 1, …, N, k ≥ 0, where «T» is the transposition sign, Sp is the trace of a matrix, and p_ij(t, z/x) is the conditional density of the distribution under the assumption that ξ(τ − 0) = y_i and ξ(τ) = y_j. The first equation, for v_k^0(t, x), k ≥ 0, is obtained by substituting (30) into (21); the required equation takes the form (32). The second equation, for the optimal control u_k^0(t, x), is obtained from (32) by differentiating with respect to u, because u = u_k^0, k ≥ 0, gives a minimum of the left-hand side of (32). Here, (∂a/∂u) is the m × m Jacobian matrix with elements (∂a_n/∂u_s), n = 1, …, m, s = 1, …, m ((∂b/∂u) is defined similarly), and (∂G/∂u) ≡ ((∂G/∂u_1), …, (∂G/∂u_r)), k ≥ 0.
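The displayed form (30) of the WIO is likewise not preserved here. For a controlled diffusion with Markov switching, a generator of this type usually has the following shape; this is a sketch only, in which q_ij denotes hypothetical transition intensities of ξ (not named in the surviving text) and the Poisson-jump and impulse terms of the original are omitted:

$$ Lv_k(t, x) = \frac{\partial v_k}{\partial t} + \left(\frac{\partial v_k}{\partial x}\right)^{T} a + \frac{1}{2}\,\mathrm{Sp}\left(b^{T}\,\frac{\partial^{2} v_k}{\partial x^{2}}\,b\right) + \sum_{j \ne i} q_{ij}\left[\int_{\mathbb{R}^m} v_k(t, z)\,p_{ij}(t, z/x)\,dz - v_k(t, x)\right]. $$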
Solving systems (32) and (34), even with modern computing technology, is quite difficult. It is therefore advisable to consider a simplified version of problems (6)-(8) and (10), namely, a system with a quadratic quality functional. The following sections focus on such problems.

Optimal Control of Linear Stochastic Random Structure Dynamic Systems with Markov Switching. Let us consider the problem of optimal control for the stochastic dynamic system given by the following stochastic differential equation:

$$ dx(t) = [A(t, \xi(t))x(t) + B(t, \xi(t))u(t)]\,dt + \sigma(t, \xi(t))x(t)\,dw(t) + \int_{\mathbb{R}} C(t, \xi(t), z)\,x(t)\,\tilde{\nu}(dz, dt) $$
with Markov switching and initial conditions (36) and (37). Here, A, B, σ, and C are piecewise continuous, integrable matrix functions of appropriate dimensions. The optimal control problem for systems (35)-(37) is to find the control u_ik^0, i ∈ {1, …, N}, k ≥ 0, from the set U of admissible controls that minimizes the quality functional (38). To simplify the notation, introduce (39).

Theorem 2. The optimal control for problems (35)-(38) has the form (40), where the nonnegative definite m × m matrix P_ik(t) ≔ P(t, ξ(t), η_k) determines the Bellman functional (41).

Proof. The Bellman equation for (35)-(37) has the form (42). Substituting (43) into (42) yields (44). The expression for the optimal control is then obtained by differentiating (44), because u_ik(t, x) = u_ik^0(t, x) minimizes its left-hand side. The theorem is proved.
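The displayed formulas (40)-(41) are not reproduced in this copy. For a linear system with a quadratic quality functional containing a control-weight matrix Q (our assumption on the notation of (38)), differentiating the Bellman equation with v_k(t, x) = x^T P_ik(t) x gives the standard linear feedback form

$$ u_{ik}^{0}(t, x) = -\,Q^{-1}(t)\,B^{T}(t, y_i)\,P_{ik}(t)\,x, \qquad v_k(t, x) = x^{T} P_{ik}(t)\,x , $$

which is consistent with, but not a verbatim copy of, the theorem's statement.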

Construction of the Bellman Equation.
Substituting (40) and (41) into (42), we obtain equation (48) for all t ∈ [t_k, t_{k+1}). Equating to zero the quadratic form in x and the terms that do not depend on x, and taking into account the matrix identity 2x^T P_ik A_i x = x^T (P_ik A_i + A_i^T P_ik)x, we obtain the system of differential equations (49) for the matrices P_ik(t), t ∈ [t_k, t_{k+1}), i ∈ {1, …, N}, k ≥ 0, with boundary conditions (50)-(51). Thus, the following theorem holds.
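As an illustration of how a boundary value problem of the type (49)-(51) can be treated numerically, the following minimal Python sketch integrates a matrix Riccati equation of the classical LQ type (with a multiplicative-noise correction) backward from its terminal condition. The right-hand side below, and the cost weights M and Q, are our assumptions; the actual equation (49) may contain additional jump and switching terms.

```python
import numpy as np

def riccati_backward(A, B, M, Q, sigma, P_T, t_grid):
    """Backward explicit-Euler integration of
        -dP/dt = P A + A^T P - P B Q^{-1} B^T P + sigma^T P sigma + M,
    with terminal condition P(T) = P_T.  A hedged LQ-type sketch,
    not the exact right-hand side of (49)."""
    Qinv = np.linalg.inv(Q)
    P = P_T.copy()
    out = [P.copy()]
    for k in range(len(t_grid) - 1, 0, -1):
        dt = t_grid[k] - t_grid[k - 1]
        rhs = P @ A + A.T @ P - P @ B @ Qinv @ B.T @ P + sigma.T @ P @ sigma + M
        P = P + rhs * dt          # stepping backward: P(t - dt) = P(t) + rhs*dt
        out.append(P.copy())
    return list(reversed(out))    # P(t) sampled on t_grid, earliest time first

# Scalar illustration matching state i = 1 of the example below:
A1 = np.array([[-1.0]]); B1 = np.array([[1.0]]); sig1 = np.array([[0.1]])
M1 = np.eye(1); Q1 = np.eye(1)                      # hypothetical cost weights
Ppath = riccati_backward(A1, B1, M1, Q1, sig1, np.zeros((1, 1)),
                         np.linspace(0.0, 1.0, 1001))
K0 = np.linalg.inv(Q1) @ B1.T @ Ppath[0]            # feedback gain at t = 0
```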
Furthermore, the problem of the existence of solutions of problems (49)-(51) must be solved. Let us use the Bellman method of successive iterations [18]. For simplicity, consider the interval [t_k, t_{k+1}), where ξ(t) = y_i, and omit the indices «ik» in u, v, and P. First, we define the zero approximation (52), where P_0(t) ≥ 0 is a bounded, piecewise continuous matrix. Substituting (52) into (34), we calculate the value of v_1(t, x) from the resulting equation, which corresponds to the control (42). Then, substituting v_1(t, x) into the Bellman equation (42), we find the control u_1(t, x) that minimizes (42). Continuing this process, one can construct a sequence of controls u_n(t, x) and functionals v_n(t, x) of the form (53), where P_n(t), t ∈ [t_k, t_{k+1}), is the solution of boundary value problems (49)-(51) for T ≔ t_{k+1}. The estimate (54) is valid for all n ≥ 1. The convergence of the functionals v_n(t, x) to v_0(t, x), of the controls u_n(t, x) to u_0(t, x), and of the sequence of matrices P_n(t) to P(t) can be proved using (54) [19,20], and the estimate (55) holds. Thus, the following theorem holds.

Consider, as an example, system (35)-(37) with two states of the Markov chain ξ and the following coefficients (a simulation sketch follows the list):
(1) ξ = 1: A_1(t) = −1, B_1(t) = 1, σ_1(t) = 0.1, and C_1(t, z) = z^2;
(2) ξ = 2: A_2(t) = −2.25, B_2(t) = 1, σ_2(t) = 0.2, and C_2(t, z) = z^3.
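To make the example reproducible, here is a minimal Python sketch that simulates one trajectory of the closed-loop system. All numerical choices beyond the coefficients above are our assumptions, not values from the paper: a two-state Markov chain with a hypothetical switching rate q, uncompensated Poisson jumps with a hypothetical intensity lam and mark distribution, and a constant stabilizing gain K in place of the time-varying optimal gain (40).

```python
import numpy as np

rng = np.random.default_rng(0)

# Coefficients of the example, for the two states of the Markov chain xi
A = {1: -1.0, 2: -2.25}
B = {1: 1.0, 2: 1.0}
sigma = {1: 0.1, 2: 0.2}
C = {1: lambda z: z**2, 2: lambda z: z**3}

q = 2.0        # hypothetical switching rate of the Markov chain
lam = 5.0      # hypothetical intensity of the Poisson jump measure
K = 2.0        # hypothetical constant feedback gain: u = -K * x

def simulate(x0, T=5.0, dt=1e-3):
    """Euler-Maruyama scheme with Markov switching and Poisson jumps."""
    n = int(T / dt)
    x, xi = x0, 1
    path = np.empty(n + 1); path[0] = x0
    for k in range(n):
        if rng.random() < q * dt:              # Markov switching 1 <-> 2
            xi = 3 - xi
        u = -K * x                             # feedback control
        dw = rng.normal(0.0, np.sqrt(dt))      # Wiener increment
        jump = 0.0
        if rng.random() < lam * dt:            # Poisson jump with mark z
            z = rng.normal(0.0, 0.1)
            jump = C[xi](z) * x
        x += (A[xi] * x + B[xi] * u) * dt + sigma[xi] * x * dw + jump
        path[k + 1] = x
    return path

traj_pos = simulate(+1.0)   # trajectory with a positive initial condition
traj_neg = simulate(-1.0)   # trajectory with a negative initial condition
```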
The results of the simulation of two trajectories of the random process x(t) are shown in Figure 1.
This figure shows two trajectories of the solution, with a positive (blue line) and a negative (red line) initial condition x_0. As can be seen from the figure, the solutions are stabilized because the optimal control satisfies the conditions of Theorem 1 with the quadratic functional.