Dynamic System Evolution and Markov Chain Approximation

In this paper computational aspects of the mathematical modelling of dynamic system evolution are considered as a problem in information theory. The construction of mathematical models is treated as a decision making process with limited available information. The solution of the problem is associated with a computational model based on heuristics of a Markov Chain in a discrete space-time of events. A stable approximation of the chain is derived and the limiting cases are discussed. An intrinsic interconnection of constructive, sequential, and evolutionary approaches in related optimization problems provides new challenges for future work.


INTRODUCTION
Many mathematical problems in information theory and optimal control related to dynamic system studies can be formulated in the following generic form. A decision maker (DM, i.e. problem solver, modeller or observer) receives information about a system from observations, measurements, or computations in the form of a data stream that can be formalized mathematically as a sequence

$$(x_0, x_1, \ldots). \quad (1.1)$$

We assume that such a sequence has at least two elements and that each element of the sequence is labelled by its own time t. Hence, referring to the element $x_t$ of the sequence, we assume that the total amount of information about the system that corresponds to the time interval $(0, t)$ of its behaviour has been received, or at least can be received in principle. Under the above assumptions we can introduce a set $T_t$ of permissible strategies for each time t. Then, observing the sequence $(x_0, \ldots, x_t)$, the decision maker can choose a strategy that is defined by the inclusion

$$s_t \in \bigcup_{\tau=0}^{t} T_\tau. \quad (1.2)$$

Typically we reduce the problem of constructing a map between elements $x_t$ and $s_t$ defined by (1.1), (1.2) to a simpler problem, allowing the set of permissible strategies for all times of consideration to be fixed and given a priori. Namely, we can idealize the actions of the decision maker as follows. We assume that the DM can select a strategy $s_t$ at each time from a given set $U_T$. Of course, the validity of such a simplification ultimately depends on the Axiom of Choice, excluding the logically possible case of incomparability of two arbitrary sets that correspond to two different times [51,32]. On the other hand, such a simplification permits the development of a set-theoretic approach to dynamic system evolution, and simplifies the mathematical formalization of complex optimization problems. In fact, we can introduce a loss function $l(\cdot,\cdot)$ as a function of two variables, states $x_t$ and strategies $s_t$, which are both characterized by the same time t. A desire to minimize time-averaging characteristics of this function can be formalized through the optimization problem

$$F(l) \to \min, \quad s_t \in U_T. \quad (1.3)$$

Here, the objective functional F may be, for example, the Cesaro-type sum

$$F(l) = \frac{1}{n+1}\sum_{k=0}^{n} l(x_{\tau_k}, s_{\tau_k}), \quad (1.4)$$

where $\tau_k \in (0, T)$ $\forall k \in \{0, 1, \ldots, n\}$ and T is assumed to be given. The limiting problem in the spirit of classic ergodic theorems arises when we investigate the limit behaviour $\lim_{T \to \infty} F(l)$ with $F(l)$ given by (1.4). Objective criteria may also be formulated in an integral form. For example, for the Bolza problem in optimal control theory the functional in (1.3) takes the form

$$F(l) = g(x_T) + \int_0^T f_0(\tau, x_\tau, s_\tau)\, d\tau, \quad x_T \in K, \quad (1.5)$$

where $T \in (0, \infty)$ is assumed to be given, and K is a given target set. (The functions $f_0$ and $g$ are called running and terminal costs respectively. If $f_0 \equiv 0$ we have Mayer's problem, whereas for $g \equiv 0$ the problem is referred to as the Lagrange problem.) We can also consider a class of problems with infinite time horizon using discounting cost procedures. All these examples provide important partial cases of the general problem (1.1), (1.3).
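As a toy numerical illustration of the criterion (1.4), the following sketch averages a loss $l(x_k, s_k)$ along a uniformly partitioned trajectory and compares a few constant strategies from a finite set; the dynamic rule, the loss function, and all parameter values are illustrative assumptions, not part of the model above.

```python
import numpy as np

# Toy illustration of the Cesaro-type criterion (1.4): average the loss
# l(x_k, s_k) over a uniform time partition for each constant strategy
# drawn from a finite set U_T, and pick the best one. The dynamic rule
# and the loss function are invented for illustration only.

def loss(x, s):
    # quadratic loss in the state plus a small cost of control effort
    return x ** 2 + 0.1 * s ** 2

def simulate(x0, s, n=100, T=10.0):
    # toy dynamic rule: linear drift steered by the constant strategy s
    dt = T / n
    xs = [x0]
    for _ in range(n):
        xs.append(xs[-1] + dt * (-xs[-1] + s))
    return xs

U_T = [-1.0, -0.5, 0.0, 0.5, 1.0]      # fixed, a priori given strategy set
x0 = 1.0
F = {s: np.mean([loss(x, s) for x in simulate(x0, s)]) for s in U_T}
best = min(F, key=F.get)
print(f"best constant strategy: {best}, F(l) = {F[best]:.4f}")
```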
Of course, to complete the formulation of the problem (1.1), (1.3) mathematically, we have to specify in what sense the sequence $\{x_t\}$ in (1.1) should be understood. One possible specification is provided by the assumption that $x_t$ may be appropriately described by a given stationary ergodic distribution. Then a typical assumption imposed on functions $s_t$ from $U_T$ is Lebesgue-measurability on the interval $(0, t)$. Under the above mentioned assumptions, associated theoretical issues are often addressed using the theory of Markov processes [19]. Starting from the work of Bellman [5,6], the theory has been extensively developed, and a number of efficient algorithms have been proposed. Discrete dynamic programming ideas have been essentially generalized to the continuous case during the past decades [18,19], and many new results that have appeared recently indicate the continued research interest in these topics [19,35]. It should be noted, however, that many results in this area rely (explicitly or implicitly) on the assumption that a measurable function of strategies $s_t \in U_T$ may be effectively approximated using past states $x_{t'}$, $0 < t' < t$. If such an assumption is made, the attainability of the minimum in (1.3) becomes the subject of a corresponding smoothness assumption on the loss function [42]. On the other hand, regularity of this function is strongly dependent on complete information about the past states, and eventually on model data and parameters. Since the initial data for the model can only be known approximately, the whole stream of information available to the decision maker at time t can be interpreted, at best, as an approximation of system dynamics. The quality of such an approximation at time t is defined by the "informational completeness" of
the data stream

$$(s_0, x_0; s_1, x_1; \ldots; s_{t'}, x_{t'}; \ldots) \quad (1.6)$$

when $t' \to t$. To complete the step corresponding to time t in this process, one can assume that the strategy $s_t$ may be chosen from the same set $U_T$.
Then, the next stream element $x_t$ may be received with a given accuracy, at least in principle, if we also assume that element $x_0$ in (1.1) may be given with infinite precision. Of course, in the reality of mathematical modelling the latter assumption cannot be rigorously justified [45]. However, if strategies are chosen at each step to satisfy a certain subgoal, the described process provides the possibility of evaluating the quality of satisfaction of the subgoal that corresponds to time t. If the process is finite then we can refer to the last subgoal as a top-level goal [33]. The latter can be satisfied by satisfying subgoals at each step, appealing to multicriteria analysis of the underlying problem.
The main problems in such analysis stem from the coupling of the sequence of subgoals to the definition of the top-level goal in the form of a functional of the loss function $l(\cdot,\cdot)$. Mathematically speaking, we should be able to define a mapping between fixed-time subgoal functions and an averaged-time goal functional. Such a definition is closely connected with the definition of optimal strategies, which we do not know a priori.
However, if it is known that $s_t \in U_T$, then it is reasonable to choose strategies based on knowledge not only of time t, but also of states $x_t$. If we assume further that $x_t$ "accumulates" all past information about the system, then the concept of a Markov Chain suggests itself. Because of uncertainty in the knowledge base (1.1), such an accumulation cannot be understood in a purely deterministic way [8]. The origin of such uncertainty is induced by the strategy $s_0$ in the data stream (1.6). However, mathematically such uncertainty can be formalized if, instead of the constraints (1.2), we consider the "relaxed" constraints

$$s_t \in U_T, \quad (1.7)$$

assuming that the set $U_T$ is given a priori for the whole time-set of interest. Then, instead of the data stream (1.6), we can consider an informationally reduced stream:

$$(x_0, x_1, \ldots, x_t, \ldots), \quad (1.8)$$

where all strategies satisfy the constraints (1.7). An additional assumption of continuity of the sequence (1.1) in time allows a convenient mathematical framework for the justification of models based on an approximation of (1.6) by (1.8). Such a classical idealization of temporal evolution by continuous trajectories of phase points, induced by classical mechanics, can be applied only within certain limited contexts, and involves serious difficulties in many areas of mathematical modelling. The main problems are caused by the fact that there are many dynamic systems for which arbitrarily close initial conditions can give rise to qualitatively distinct (including exponentially diverging) types of trajectories [45]. Such strong trajectory instability requires other approaches to the description of dynamic system evolution. Under a probabilistic approach, deterministic invariance of phase points along trajectories is replaced by the invariance of the density along trajectories. Physically, such a "conservation of extension in phase" (due to J.
Gibbs [37]) eventually requires a construction of Gibbs distribution functions using a probabilistic description of states. Mathematically speaking, this problem can be seen as a problem of a "closure" of the reduced informational stream (1.8) with respect to all possible states. Such a closure can be performed if we assume Lebesgue integrability of the function

$$\eta(\omega) = -\omega \log \omega, \quad \omega > 0; \qquad \eta(\omega) = 0, \quad \omega = 0, \quad (1.9)$$

over the set of all possible states, where $\omega = f(t, x_t)$ is the density function. From an information theory perspective, this logical step, which in the end requires answering the question of system stability, is equivalent to a transformation from the classic Shannon entropy [53,49] to the Boltzmann-Gibbs entropy [37]. Under such a transformation we formally identify a (thermo)dynamic system with a measure space [37]. If n is fixed and the measure is defined as a Lebesgue measure, then for any time-set $(0, T)$ (including the possibility of $T \to \infty$) the validity of the above transformation requires an a priori assumption of lower semi-continuity [55] of the recursive function

$$\phi_n = f_n(f_{n-1}(\cdots(f_1(\omega))\cdots)) \quad (1.10)$$

as a function of density, where a theoretical possibility of $n \to \infty$ is permitted. If we assume that such a function exists, then in principle the only possible uncertainty in the model (1.3), (1.8) for any T is induced by the definitions of $x_0$ and $\phi(f(T, x_T))$. Such is indeed the case in optimal control theory, where the recursive function plays the role of the value function. In fact, if we know a priori that the top-level goal can be described appropriately by a continuous function $F(l)$, then the associated optimal control problems can be studied through a nonlinear backward evolution PDE known as the Hamilton-Jacobi-Bellman equation with Cauchy-type terminal conditions ([11,19] and references therein). If an algorithm for the numerical solution of the latter problem exists, it can in principle be represented in the form of the informational stream

$$((x_T, s_{x_T}); (x_{T-\Delta t}, s_{x_{T-\Delta t}}); \ldots; (x_t, s_{x_t}); \ldots), \quad (1.11)$$

when $t \to 0^+$ and $\Delta t > 0$. The main theoretical difficulty in the rigorous justification of algorithmic rules constructed according to (1.11) is the existence of the limit of $s_{x_t}$ when $t \to 0^+$. If we assume that such a limit exists, then we should be able to evaluate the quantity

$$s_0 = \lim_{t \to 0^+} s_{x_t} \quad (1.12)$$

on the basis of $x_T$ (which is assumed to be given) and some logical rules. In reality, the recursive function of density (1.10) at a fixed moment of time may be given only approximately. Such an approximation defines a degree n of the underlying recursion (1.10), and in turn defines a basic structure of a finite lattice on which the system dynamics can be approximated [14].
Hence, in general, information on an approximation of the same dynamic system can be provided in two possible ways: using the sequence (1.1), and using a subsequence of (1.11). Due to intrinsic uncertainty in the definitions of $x_0$ and $\phi_{x_T}$, neither of these approximations considered separately from the other can guarantee the adequacy of the approximation to the real system.
However, we can draw certain conclusions on the system dynamics by analysing both of the sequences simultaneously. The complexity of such analysis is due to the necessity of a coupled investigation of the same system on two different scales. Mathematically, such scales are induced by the two limiting types of system behaviour with respect to the time-component: $t \to \infty$ and $\Delta t \to 0^+$. They are connected by the definition of the recursive degree for the system density, and ultimately by the definition of the top-level goal in (1.3). Splitting up such a goal into subgoals provides an efficient method for the analysis of the system dynamics. In turn, such analysis gives a way to derive a sequential approximation of the system Hamiltonian, ensuring a stable model of system dynamics. The remaining part of the paper is organized as follows. In Section 2 basic preliminaries are recalled for the formulation of optimal control problems as problems in information theory.
Section 3 is devoted to the consideration of deterministic and stochastic dynamic rules. Examples are given to show that if such rules are specified, then an informationally consistent formulation of control problems requires an analysis of system stability. Section 4 deals with deterministic and probabilistic algorithmic machines and analyses problems involved in their application. Section 5 gives a link between the questions discussed in the previous sections and discrete optimization problems using their common physical and informational basis. In Sections 6 and 7 mathematical models are constructed and computational models derived to analyse dynamic system evolution using Markov Chain approximations. A stable approximation for the hyperbolic model is obtained and the algorithm is given. Computational aspects of Discrete Markov Decision Processes (DMDP) are discussed in Section 8. The main conclusions are summarized in Section 9.

PRELIMINARIES
Let us define the state space of the system by E and the Borel σ-algebra induced by E (that is, the least σ-algebra that contains all open subsets of E) by $\mathcal{B}(E)$. Then, no matter what the time-partition in $(0, t)$ is, $0 \le \tau_1 < \tau_2 < \cdots < \tau_n < \tau$, $\tau \in (0, t)$, we assume that $\forall X \in \mathcal{B}$:

$$P(x_\tau \in X \mid x_{\tau_1}, \ldots, x_{\tau_n}) = P(x_\tau \in X \mid x_{\tau_n}) \quad (2.1)$$

almost surely (with respect to the corresponding σ-algebra [19]). That is, the data stream $x_t$ under the strategy of the time partition has the Markovian property. Of course, continuity of the data stream $x_t$ in t does not follow from the condition (2.1).
Furthermore, even if $x_t$ is a continuous function of time, it does not, on any account, mean that strategies form a continuous function of time as well. In general, we have a multicriteria optimization problem induced by the partition of time and the analysis of the sequence (1.6). However, the difficulty in evaluating the limit (1.12) prompts several ways to further simplify the problem. One of the direct ways is to assume a priori continuity of the sequence (1.1) in time. Then we can reformulate the multicriteria optimization problem arising in the analysis of (1.6) as an optimal control problem (1.3) with respect to a continuous function of time $F(l)$ and some dynamic rules that define the sequence (1.1). Alternatively, we can analyse the sequence (1.6) using DMDP. The theory of DMDP is well-developed under the assumption of the possibility of complete information in (1.6). During recent years new challenging problems have stimulated further development of the theory of DMDP [34,25,17]. In brief, one of the most interesting problems in this field is induced by the question of data perturbations in the informational stream (1.6). Indeed, when perturbations of a Markov Chain change its ergodic structure, the stationary distribution of the perturbed system may not be a continuous function [52,1]. Hence it is reasonable to assume that system dynamics depend on some parameters of the Markov Chain, and due to the imprecision of available information we can study system dynamics using, in general, Singularly Perturbed Markov Chains (SPMC). In this framework the evolution of a system is coupled to its Markov Chain parameters. An example of this type of DMDP was provided in [13], where non-diffusion stochastic models were studied. We assume that in general the parameters of the Markov Chains are allowed to jump, and the jumping rates may be dependent on the state function $x_t$. The corresponding systems described by $x_t$ at time t are called piecewise-deterministic stochastic systems. Such systems have been extensively studied in recent times by theoretical physicists [29], and indicate growing interest in hyperbolic dynamic rules of nature [46,30].
Mathematically speaking, we define a finite-state Markov Chain $\mu_t$ with the state space $\mathcal{M}$. The chain is regarded as a parametric process for the dynamics of the system, which is described by a state function $x_t$ and a parameter $\mu_t$. The parameter $\mu_t$ may undergo a jump on the interval $(0, t)$ at times $\tau_1 < \cdots < \tau_n$, and the jumping rate is a function of the time $\tau$, the state of the system $x_\tau$, the "before-jump" value $\mu_1$ and the "after-jump" value $\mu_2$ of the parameter of the Markov chain. Hence we define a function of jump rates as

$$j \stackrel{\mathrm{def}}{=} j(\tau, x_\tau, \mu_1, \mu_2). \quad (2.2)$$
This allows us to regard the process $(x_t, \mu_t)$ as a Markov process with the state space $E \otimes \mathcal{M}$.
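A minimal simulation sketch of such a piecewise-deterministic process $(x_t, \mu_t)$ may look as follows; the two-mode parameter set, the concrete jump-rate function of the form (2.2), and the deterministic flow are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sketch of a piecewise-deterministic process (x_t, mu_t): between jumps
# x_t follows a deterministic rule selected by the current parameter
# mu_t; mu_t jumps with a state-dependent rate in the spirit of (2.2).
# All concrete rates and dynamics below are illustrative.

MODES = {0: -1.0, 1: 0.5}              # state space M of the parametric chain

def jump_rate(tau, x, mu_from, mu_to):
    # j(tau, x_tau, mu_1, mu_2): switching accelerates when |x| is large
    return 0.5 + abs(x) if mu_from != mu_to else 0.0

def simulate(x0=0.0, mu0=0, T=5.0, dt=1e-3):
    x, mu, t = x0, mu0, 0.0
    while t < T:
        x += dt * (MODES[mu] * x + 1.0)          # deterministic flow in mode mu
        other = 1 - mu
        if rng.random() < jump_rate(t, x, mu, other) * dt:
            mu = other                           # parameter jump
        t += dt
    return x, mu

print("final (x_T, mu_T):", simulate())
```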
It should be emphasized that the system itself, $x_t$, may not have Markovian behaviour. Thus, difficulties arise in constructing a mapping that relates the function (2.2) to the states $x_t$ of the system.
Ultimately, such difficulties stem from the problem of mathematical formalization of the concept of perturbations, which are usually regarded as a small and external-to-the-system source. Of course, in real-world modelling, the statistics of the source is unknown a priori, which precludes assumptions based on an ε-additivity of perturbations. In general, such assumptions may not be adequate for the transition law of the Markov Chain, or for the Hamiltonian of the system as a whole.

DYNAMIC RULES AND CONTROL PROBLEMS
Eventually, due to the approximate character of available information about the informational stream (1.6), any mathematical model can provide at best a description of a perturbed rather than an unperturbed dynamic system. Hence, if the mathematical model of a dynamic system has been constructed, in the derivation of a computational algorithm we should adapt the choice of strategies $s_t$ in our approximation of (1.6) to the character of such perturbations. Another way of putting it is that the model and the algorithm should be informationally consistent, reproducing the informational stream (1.6) and giving an approximation with a reasonable degree of accuracy.

Differential Equations and Inclusions
To include the possibility of perturbations in models, let us start from the definition of a mapping

$$f(t, x_t, s_t): T \otimes E \otimes U_T \to E, \quad (3.1)$$

where T is a given set of times. When $x_t$ is assumed to be continuous, the dynamics of a deterministic system can be appropriately described, in an almost-everywhere sense, by the differential equation

$$\dot{x}_t = f(t, x_t, s_t), \quad x|_{t=0} = x_0, \quad s_t \in U_T, \quad (3.2)$$

where x is an element of a given set X defined as an ε-neighbourhood of an idealized point $x_0$. In general, the mathematical model (3.1), (3.2) can provide a description of a perturbed rather than an unperturbed dynamic system. This is the case even if we formally exclude $s_t$ from the right-hand part of the model or introduce some optimizing criteria.
The next example demonstrates the possibility of instability in the perturbed model under an arbitrarily small level of perturbations.
Example 3.1. Let us analyse the unperturbed and perturbed dynamics of a homogeneous linear system:

$$\text{(a)} \ \dot{x} = Ax, \qquad \text{(b)} \ \dot{x} = A_\epsilon x. \quad (3.3)$$

Here we assume that the matrix A is given and $A_\epsilon = A + \Delta$, where $\|\Delta\| \le \epsilon$ is the absolute error for perturbations of the matrix elements. If we assume that the initial conditions for the model (3.3) may be given precisely, then the problem of stability for the model is equivalent to the investigation of the ε-spectrum of the original matrix A. The ε-spectrum of a matrix is defined as the union of all spectra of perturbed matrices for a certain level of error [23]. In general, for any arbitrary matrix A there exists a special connection between its spectrum and its resolvent under ε-perturbations. The problem consists in the fact that, without restrictions on ε, an absence of practical dichotomy can be anticipated. More precisely, there might exist an $\epsilon = \epsilon(A)$ such that $A_\epsilon$ with $\|\Delta\| < \epsilon$ can have in the left half-plane a number of eigenvalues different from the number of points of the spectrum of A lying there. A matrix of this kind is provided, for example, by $A = (a_{ij})$ with $a_{ij} = -1$ for $j \ge i$, $i = 1, 2, \ldots, 20$, and $a_{ij} = 0$ otherwise: all its eigenvalues equal $-1$, yet its ε-spectrum spreads far from this point even for very small ε.

We note that Example 3.1 deals with the perturbation of the right-hand part of the model, but not with the initial condition. The latter was assumed to be fixed for both perturbed and unperturbed models. The idea of "frozen" initial conditions for a family of perturbed right-hand parts leads to mathematical models in which dynamic rules are defined by differential inclusions. In fact, on the basis of the point-valued map f, we can define a set-valued map [2,19]

$$\mathcal{F}(t, x_t) \stackrel{\mathrm{def}}{=} \{f(t, x_t, s_t)\},$$

where $s_t$ is assumed to be defined by another set-valued map. Of course, the set-valued map for the definition of $s_t$ is coupled to the definition of the optimizing functional $F(l)$ in (1.3). Hence, when describing dynamic rules by the differential inclusion

$$\dot{x}_t \in \mathcal{F}(t, x_t) \quad (3.4)$$

in an almost-everywhere sense, a family of perturbed mathematical models (1.3), (3.4) defines an optimal control problem. In models of this type we have a natural contradiction. On the one hand, the quality of the model has to be defined with respect to the stability of the system dynamics. On the other hand, such stability depends on the definition of $s_t$, which is an unknown function in the mathematical model. Hence, eventually the quality of the model depends on the definitions of the mapping (3.1) and the initial conditions. In the end such definitions depend on the problem of evaluating the limit (1.12). If the initial conditions of the model are fixed, then an example of instability for the mapping (3.1) may in principle be constructed for any specified sequence $s_t$. This type of instability is usually referred to as computational instability. Example 3.1 clearly shows that theoretical issues of stability should primarily be addressed if "precise" initial conditions are assumed. In optimal control theory we do not require the sequence $s_t$ to be specified explicitly, and therefore the problem of model stability can be formally circumvented by some appropriate regularity assumptions on the mappings $\mathcal{F}$ and F.
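To make the practical instability of Example 3.1 concrete, the following sketch perturbs a $20 \times 20$ triangular matrix of the type discussed above and reports how far the computed eigenvalues of $A_\epsilon$ spread from the unperturbed spectrum $\{-1\}$; the perturbation sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# eps-spectrum illustration for Example 3.1: the 20x20 triangular matrix
# with a_ij = -1 for j >= i has all eigenvalues equal to -1, yet tiny
# perturbations with ||Delta|| = eps scatter the computed spectrum widely.
n = 20
A = np.triu(-np.ones((n, n)))

print("unperturbed eigenvalues (all -1):", np.linalg.eigvals(A).round(6)[:3], "...")

for eps in (1e-12, 1e-8, 1e-4):
    Delta = rng.standard_normal((n, n))
    Delta *= eps / np.linalg.norm(Delta, 2)        # spectral norm ||Delta|| = eps
    lam = np.linalg.eigvals(A + Delta)
    print(f"eps = {eps:.0e}: max |lambda + 1| = {np.abs(lam + 1).max():.3f}, "
          f"max Re(lambda) = {lam.real.max():.3f}")
```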
The remaining theoretical problem is to prove that if the mapping (3.1) is well-defined then $s_0 \in U_T$, where $s_0$ is defined by the limit (1.12), whereas $x_0$ may not be given precisely. The complexity of this problem has led to the construction of mathematical models of optimal control using recursive functions of density (1.10). In theory such approaches require the analysis of a subsequence of (1.11) that consists of the values of the recursive function,

$$(\phi_{x_T}, \phi_{x_{T-\Delta t}}, \ldots, \phi_{x_t}, \ldots), \quad (3.5)$$

when $t \to 0^+$. Such analysis is typically performed for $\Delta t \to 0^+$, and essentially uses the assumption that $x_0$ and $x_T$ in (1.1), (3.5) may be given either precisely, or at least with equal probabilities.
In optimal control theory the backward-evolution model associated with (1.3), (1.5) is given by the performance measure, the value function, the Hamilton-Jacobi-Bellman equation, and the constraints on strategies:

$$J(t, x; s) = g(x_T) + \int_t^T f_0(\tau, x_\tau, s_\tau)\, d\tau, \quad (3.6)$$

$$V(t, x) = \inf_{s \in U_T} J(t, x; s), \quad (3.7)$$

$$-\frac{\partial V}{\partial t} + H\left(t, x, -\frac{\partial V}{\partial x}\right) = 0, \quad V(T, x) = g(x), \quad (3.8)$$

$$s_t \in U_T. \quad (3.9)$$

The rigour in mathematical justifications of the models (1.3), (1.5), (3.1), (3.4) and (3.6)-(3.9) is grounded in the following logical rule. Provided $x_0$ is given precisely, the forward-evolution model (1.3), (1.5), (3.1), (3.5) can be studied through the backward-evolution model (3.6)-(3.9) for any given function g from a specified topological space. The definition of a topology for such a space requires the definition of a set in which physical states of the system can be embedded. Mathematically, the problem is usually considered with respect to Euclidean spaces (either finite-dimensional [19] or infinite-dimensional [28]). This allows us to use the logical rule in the reverse order: provided g is specified in a topological space, the backward-evolution model can, in principle, recover the forward evolution of the system for any given initial condition $x_0$.
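The backward/forward logic of (3.6)-(3.9) can be illustrated by a discrete sketch: the value function is computed backward in time from the terminal cost g on a grid, and the forward trajectory is then recovered from a given $x_0$. The grid, the dynamic rule, and the costs below are illustrative assumptions.

```python
import numpy as np

# Discrete backward/forward sketch: compute V backward in time from the
# terminal cost g (the analogue of (3.6)-(3.9)), then recover the forward
# optimal trajectory from a given x0. Dynamics and costs are illustrative.

xs = np.linspace(-2.0, 2.0, 81)          # state grid
us = np.array([-1.0, 0.0, 1.0])          # strategy set U_T
T, dt = 1.0, 0.05
steps = int(T / dt)

g = lambda x: (x - 1.0) ** 2             # terminal cost
f0 = lambda x, u: 0.1 * u ** 2           # running cost
step = lambda x, u: x + dt * u           # toy dynamic rule

V = g(xs)                                 # V(T, .) = g
policy = []
for _ in range(steps):                    # backward evolution
    Q = np.empty((len(us), len(xs)))
    for i, u in enumerate(us):
        x_next = np.clip(step(xs, u), xs[0], xs[-1])
        Q[i] = dt * f0(xs, u) + np.interp(x_next, xs, V)
    policy.append(us[np.argmin(Q, axis=0)])
    V = Q.min(axis=0)
policy.reverse()                          # policy[k] now refers to time k*dt

x = -1.5                                  # forward recovery from x0
for k in range(steps):
    u = policy[k][np.argmin(np.abs(xs - x))]
    x = step(x, u)
print(f"x(T) = {x:.3f}, V(0, x0) = {np.interp(-1.5, xs, V):.3f}")
```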
We note that the definitions of $x_0$ and g are coupled to the definition of the system Hamiltonian by the specification of a topological space. An assumption that the topological space satisfies the Hausdorff separability axiom allows us to complete the chain of logical arguments in the mathematical justification of the original optimal control problem. The only problem remaining with such reasoning is that of system stability. This question is associated with the question of stability of measures defined with respect to the system's state-space, which is typically a priori assumed to be Hausdorff. Formally, this assumption corresponds to the choice of such a function in (1.10) for which $n \to \infty$. Therefore, eventually the quality of the backward-evolution model (3.6)-(3.9) depends on the definition of a set X from which we "puncture" a point $x_0$ when $\epsilon \to 0^+$. In the end, the question is reducible to the existence of an optimal strategy $s_0$ for such an operation, and to the evaluation of the limit (1.12). Since such a strategy is known neither with deterministic certainty nor with probability 1, it is reasonable to estimate the quality of backward-evolution models with respect to a set X, where ε may be small but is always assumed to be positive. Then the model (3.6)-(3.9) cannot be considered other than as a perturbed mathematical model. Since ε > 0, instability of the system can be anticipated unless the strategies from the set $U_T$ are chosen consistently with the states of the system from the set X. Such consistency is determined by the definition of the system Hamiltonian in a chosen topological space, which is eventually defined by the mapping (3.1). In this sense the Hamiltonian can be regarded as a higher degree recursion of this mapping. Since the function $f(t, x_t, s_t)$ may be discontinuous in general, so may the Hamiltonian function, unless it can be represented as an infinite degree recursion of f. The assumption of positiveness of ε precludes such a situation, which seems to correspond to all physically conceivable situations. However, it implies a hyperbolicity in the underlying mathematical model [46,30]. The hyperbolic nature of mathematical models in optimal control theory stems from the splitting of the informational string (1.6) into two: (1.1) and (3.5). A simultaneous consideration of these strings implies their approximation by the perturbed informational strings

$$(x_0^\epsilon, x_1^\epsilon, \ldots, x_t^\epsilon, \ldots), \quad (3.10)$$

$$(\phi_{x_T}^\epsilon, \phi_{x_{T-\Delta t}}^\epsilon, \ldots, \phi_{x_t}^\epsilon, \ldots). \quad (3.11)$$

After the approximation, neither of the two equalities

$$\lim_{\epsilon \to 0^+} \lim_{t \nearrow T} x_t^\epsilon = \lim_{t \nearrow T} \lim_{\epsilon \to 0^+} x_t^\epsilon, \quad (3.12)$$

$$\lim_{\epsilon \to 0^+} \lim_{t \nearrow T} \phi_{x_t}^\epsilon = \lim_{t \nearrow T} \lim_{\epsilon \to 0^+} \phi_{x_t}^\epsilon \quad (3.13)$$

can be guaranteed in general. The lack of the equalities (3.12), (3.13) is caused by possible singularities in the transformations from $s_0$ to $x_0^\epsilon$ and from $s_T$ to $\phi_{x_T}^\epsilon$. Nevertheless, for any arbitrary ε > 0, the informational string (1.6) can eventually be approximated as

$$(s_0^\epsilon, x_0^\epsilon; s_1^\epsilon, x_1^\epsilon; \ldots; s_{t'}^\epsilon, x_{t'}^\epsilon; \ldots) \quad (3.14)$$

when $t' \to t$, $\forall t \in (0, \infty)$. Hence, the quality of the approximation of (1.6) by (3.14) is defined by the sequential character of the approximation of the function $\phi$, which in optimal control theory plays the role of the value function and depends on an approximation of the system Hamiltonian (or Lagrangian).

Stochastic Rules
Let us consider a dynamic system described in terms of the stochastic differential equation

$$dx_t = f(t, x_t, s_t)\, dt + \sigma(t, x_t, s_t)\, dw_t, \quad (3.15)$$

where f and σ in (3.15) denote the drift and diffusion terms respectively, and w is a Wiener process. As the functional F in (1.3) we choose

$$F(l) = E_{tx}\left[\int_t^T f_0(\tau, x_\tau, s_\tau)\, d\tau + g(x(T))\right]. \quad (3.16)$$

Then the problem is to find

$$\inf_{s_t \in U_T} F(l), \quad (3.17)$$

where $F(l)$ is defined by (3.16) under the dynamic rules (3.15), and (3.17) provides a typical example of a stochastic optimal control problem. The use of Bellman's principle can formally reduce the problem to the dynamic programming equation

$$V(t, x) = \inf_{s \in U_T} E_{tx}\left[\int_t^T f_0(\tau, x_\tau, s_\tau)\, d\tau + g(x(T))\right]. \quad (3.18)$$

The definition of the value function in (3.18) is analogous to that in (3.7) when we consider the conditional expectation of the performance measure (3.6). Note also that in the equation

$$\frac{\partial V}{\partial t} + \inf_{s \in U_T}\{A V + f_0(t, x, s)\} = 0, \quad V(T, x) = g(x), \quad (3.19)$$

the linear operator of backward evolution A is well-defined for each $x \in E$ and $t \in [0, T]$, except for $t = T$ itself. In the end, the existence of the limit

$$V(0, x_0) = \lim_{t \to 0^+} V(t, x_t) \quad (3.20)$$

is subject to the definition of $V(0, x_0)$. As in the deterministic case, such a definition depends on the definition of a set X, and thus eventually requires the definition of $s_0$. To put it differently, for a justification of the limit in (3.20) we need the existence of two limits, induced by (1.10) and (3.11). The latter may be assumed a priori rather than justified rigorously. However, even under such an assumption the procedure of transformation from the model (3.15)-(3.17) to the model (3.19), (3.20) remains an essentially sequential heuristic procedure.
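A minimal Monte-Carlo sketch of the model (3.15)-(3.17) under a fixed feedback strategy may look as follows (Euler-Maruyama discretization; the drift, diffusion, costs, and feedback rule are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(2)

# Euler-Maruyama sketch for the controlled diffusion (3.15) with Monte
# Carlo estimation of the criterion (3.16). Drift, diffusion, costs, and
# the feedback strategy are illustrative.

f   = lambda t, x, s: -x + s              # drift
sig = lambda t, x, s: 0.3                 # diffusion coefficient
f0  = lambda t, x, s: x ** 2 + 0.1 * s ** 2   # running cost
g   = lambda x: x ** 2                    # terminal cost
strategy = lambda t, x: -0.5 * x          # a fixed feedback rule s_t in U_T

def estimate_cost(x0=1.0, T=1.0, dt=1e-2, n_paths=2000):
    steps = int(T / dt)
    x = np.full(n_paths, x0)
    J = np.zeros(n_paths)
    for k in range(steps):
        t = k * dt
        s = strategy(t, x)
        J += f0(t, x, s) * dt
        dw = rng.standard_normal(n_paths) * np.sqrt(dt)
        x += f(t, x, s) * dt + sig(t, x, s) * dw
    return (J + g(x)).mean()

print(f"estimated F(l) for the feedback rule: {estimate_cost():.4f}")
```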
The heuristic nature of the model (3.19), (3.20) can be circumvented by using the diffusion approximation method for the original optimal control problem (3.15)-(3.17). As a result, we arrive at an HJB equation of the form

$$\frac{\partial V}{\partial t} + H\left(t, x_t, \frac{\partial^2 V}{\partial x^2}\right) = 0, \quad (3.21)$$

where the Hamiltonian H is defined as

$$H(t, x_t, \Pi) \stackrel{\mathrm{def}}{=} \sup_{s_t \in U_T}\{\operatorname{tr}[\pi(t, x_t, s_t)\Pi] - f_0(t, x_t, s_t)\}. \quad (3.22)$$

Here $\pi = \sigma\sigma^{\mathrm{T}}$, and Π is a symmetric nonnegative definite matrix (for details, see [19]). Note that a reduction of the problem (3.15)-(3.17) to a partial differential equation by the rescaling of a Markov Chain is accompanied by a loss of information about the dynamic system itself. Indeed, the original dynamics $x_t$ intrinsic to the model may or may not be Markovian in general. Though the Markovian property has to be preserved for the process $(s_t, x_t)$, it may be violated after the rescaling procedure, which requires the Markovian structure to be inherited from $x_t$.

General Rationale for the Optimization of Singularly Perturbed Dynamics
For all described dynamic rules, the regularities of the mappings that define the Hamiltonian of the system and the value function are coupled by a specific mathematical model, and eventually depend on the topology of the space (in which the investigation of the model is being conducted) and the initial conditions of the model. In principle, a priori regularity assumptions on the Hamiltonian allow the recovery of information about the regularity of the sought-for solution. Results of this type provide a rigorous mathematical justification of the models for which the form of the Hamiltonian is specified. In recent years the theory has been extensively developed in this direction for deterministic and stochastic optimal control problems (see [11,47,28,19] and references therein).
Since the Hamiltonian of the system can be given only approximately, whereas regularity of the sought-for solution is not a priori knowledge but the subject of our assumptions, it seems reasonable to couple the model and the algorithm for its solution using an approximation of the informational string (1.6). Mathematically speaking, we do not assume a priori "smoothness" of the "transition" between $s_{t'}$ and $x_t$ in an approximation of the informational stream (1.6), even if $\epsilon \to 0^+$. This implies a consideration of singular stochastic problems in which the function $x_t$ is allowed to be discontinuous (the first problems of this type were studied in [3,4]). In general, since a "transition" between $s_\tau$ and $x_\tau$ ($\tau \in (0, \infty)$) may be discontinuous, we cannot use the principle of smooth fit (see [54] and references therein) to claim continuity of the recursive function of density $\phi_t$ when $t \to T$ (possibly $T \to \infty$). If our objective is the possibilistic attainability of the limits

$$\lim_{\epsilon \to 0^+} x_t^\epsilon = x_t, \qquad \lim_{\epsilon \to 0^+} \phi_t^\epsilon = \phi_t, \quad (3.23)$$

then the regularities of the limiting functions $x_t$ and $\phi_t$ become subject to our a priori assumptions, which in turn brings the possibility of singularities in such dynamic processes as "strategy-state" $(s_t, x_t)$ and "strategy-state-density" $((s_t, x_t); \phi_{x_t})$. This reduces the problem of the analysis of the sequences (1.1) and (3.5) to the analysis of the perturbed informational strings (3.10), (3.11), which formally allows us to include the parameter of perturbation ε in the model. We can assume, for example, that the dynamics of the system can be effectively described by "fast" and "slow" components [59]:

$$\dot{y}_t = f_1(z_t, y_t, t, \epsilon), \qquad \epsilon\, \dot{z}_t = f_2(z_t, y_t, t, \epsilon), \quad (3.24)$$

$$J_\epsilon \stackrel{\mathrm{def}}{=} g(y_T, z_T) + \int_0^T f_0(\tau, y_\tau, z_\tau, s_\tau)\, d\tau; \quad (3.25)$$

then the problem (1.3), (3.24), (3.25) is an optimal control problem for singularly perturbed dynamics. In general, neither $y_t$ nor $z_t$ is required to have the Markovian property. The role of the string $(s_t, x_t)$ is played in this case by the sequence $(s_t, (y_t, z_t))$, in the sense that the sequence $(y_t, z_t)$ is dependent on Markov Chain parameters, and thus the whole process $(s_t, (y_t, z_t))$ can be seen as a Markov Chain approximation. We can also interpret the sequence $(y_t, z_t)$ when $\epsilon \to 0^+$ as the definition of a recursive function of density with increasing degree of recurrence as $n \to \infty$. The model (1.3), (3.24), (3.25) will then be well-defined if we define a set X of initial conditions with a specified level of error. Hence, as above, the definition of the pair $(y_0, z_0)$ is eventually dependent on the definition of $s_0$ in the informational string (1.6). This implies an approximation of the informational string (1.6), induced by singular dynamic rules, using sequential decision schemes.
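The two time scales in (3.24) can be made visible by a direct simulation sketch; the right-hand sides below are illustrative assumptions, chosen so that the fast component $z_t$ rapidly relaxes towards the slow one.

```python
# Fast-slow sketch in the spirit of (3.24): eps multiplies the derivative
# of the fast component z, which relaxes quickly towards the slow
# component y. The right-hand sides are illustrative.

def simulate(eps, y0=1.0, z0=0.0, T=2.0):
    dt = eps / 10.0                       # resolve the fast time scale
    y, z, t = y0, z0, 0.0
    while t < T:
        dy = -y + z                       # slow component
        dz = (y - z) / eps                # fast component: eps * dz/dt = y - z
        y, z = y + dt * dy, z + dt * dz
        t += dt
    return y, z

for eps in (0.1, 0.01, 0.001):
    y, z = simulate(eps)
    print(f"eps = {eps}: y(T) = {y:.4f}, z(T) = {z:.4f}, |y - z| = {abs(y - z):.2e}")
```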

ALGORITHMIC MACHINES
Probabilistic Finite-State Finite-Action Machines Under Singular Perturbation
First, let us consider a probabilistic finite-action machine that analyses a Discrete Markov Decision Process. Mathematically, the analysis can be formalized by a four-tuple

$$M \stackrel{\mathrm{def}}{=} \langle X; \Pi; \gamma; p_\epsilon \rangle, \quad p_\epsilon = p_\epsilon(x' \mid (x_t, s_t)), \quad x' \in X, \ t' > t, \quad (4.1)$$

where $p_\epsilon$ is the perturbed probability of the $t \to t'$ transition from the state $x_t$ to the next state $x'$, γ is an immediate reward, Π is a finite set of actions, X is a finite set of states, and T is the set of all times for which states from X are realizable. In general, the disturbance law of the transition probabilities in (4.1) is not known a priori. We may assume, however, that

$$\sum_{x' \in X}\,\sum_{t' \in T} p(x' \mid (x_t, t)) = \sum_{x' \in X}\,\sum_{t' \in T} p_\epsilon(x' \mid (x_t, t)) = 1. \quad (4.2)$$

We also observe that every strategy $s_t$ induces a perturbed transition matrix $P_\epsilon$ rather than an unperturbed one. Hence, assuming the flow of time ad infinitum, we can define the Cesaro-type limit

$$P_\epsilon(\sigma) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} P_\epsilon^k(\sigma), \quad (4.3)$$

where $0 \le \tau_1 < \tau_2 < \cdots < \tau_n$, with the possibility of $n \to \infty$. A strategy σ in (4.3) denotes a sequence that consists of elements $s_t$. Of course, using the reward function $\gamma(\cdot,\cdot)$, we can construct classes of optimization problems in a way similar to what we have done with respect to the loss function in Section 1. For example, we can consider the limit Markov control problem

$$J_\epsilon(\gamma, s_t) \to \max, \quad s_t \in U_T, \quad (4.4)$$

$$\sigma = (s_0, s_1, \ldots) \in \Pi, \quad \sigma \subset U_T. \quad (4.5)$$

We note that the definition of the matrix $P_\epsilon(\sigma)$ in (4.3) and the quantity $E(\gamma_0, \sigma)$ in the problem (4.4), (4.5) eventually depend on our definition of the first pair $(s_0, x_0)$ in the informational stream (1.6), which may be given only approximately. Hence, it is reasonable to assume that the transition law matrix $P_\epsilon$ has Markovian structure under a specified n if the exact equality in (4.2) holds. To put it differently, for any finite n the structure of $P_\epsilon$ depends on the topological structure of the sets X and T; thus when X and T are specified, such dependency remains in force even if $n \to \infty$. In the general case, this precludes the definition of the matrix $P_\epsilon$ as a fixed finite-dimensional matrix with probability 1 [16]. As a result, stability analysis of the associated optimization models requires consideration of a family of matrices $P_\epsilon$ under a specified level of error. Recall that a similar situation holds when dynamic rules are given. Then, we need the whole set X under a specified level of error to perform the analysis of stability. Without such a "relaxation" of probabilistic requirements on the initial conditions of the model, for any arbitrarily small ε > 0 an example of practical instability can always be constructed.
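The following sketch illustrates (4.1)-(4.3) for a fixed stationary strategy: a transition matrix is perturbed and renormalized, and the Cesaro-type limit (4.3) is approximated by averaging matrix powers; all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch of (4.1)-(4.3) for a fixed stationary strategy: perturb a
# transition matrix, renormalise its rows, and approximate the
# Cesaro-type limit (4.3) by averaging powers of P_eps. The chain, the
# rewards, and the perturbation level are illustrative.

P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])                 # unperturbed transition law
eps = 1e-3
P_eps = P + eps * rng.random(P.shape)           # perturbed law ...
P_eps /= P_eps.sum(axis=1, keepdims=True)       # ... renormalised row-wise

def cesaro(P, n=5000):
    # (1/n) * sum_{k=1}^{n} P^k, a finite-n approximation of (4.3)
    S, Pk = np.zeros_like(P), np.eye(len(P))
    for _ in range(n):
        Pk = Pk @ P
        S += Pk
    return S / n

gamma = np.array([1.0, 0.0, 2.0])               # immediate rewards per state
for name, M in (("unperturbed", P), ("perturbed", P_eps)):
    print(name, "long-run average reward from state 0:",
          round(float(cesaro(M)[0] @ gamma), 5))
```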
Deterministic Finite-State Finite-Memory Machines

Now let us consider another type of algorithmic machine. Deterministic finite-state machines with finite memory are defined as the triple [42]

$$\langle \Sigma_m; f_1; f_2 \rangle, \quad (4.6)$$

where $\Sigma_m$ is a finite set of machine states, and $f_1$ is a mapping $\Pi_s \otimes \Sigma_m \to \Sigma_m$ which defines the machine-next-state function. The set $\Pi_s$ is a finite set of system states. More precisely, we assume that $\Pi_s$ can be formalized as a sequence (1.1) as a result of observations, computations, measurements, etc.
This sequence "feeds" the machine (4.6). The mapping $f_2: \Sigma_m \to U_T$ defines the output function with a set of strategies $U_T$. Hence, starting from the state $\beta_0 \in \Sigma_m$, the machine (4.6) produces strategies $(s_1, s_2, \ldots)$ while going through a sequence of its states $(\beta_1, \beta_2, \ldots)$ according to the recursive rules

$$\beta_t = f_1(x_{t-1}, \beta_{t-1}), \quad s_t = f_2(\beta_t). \quad (4.7)$$

Excluding the current state of the machine $\beta_t$ from (4.7), we find a function of strategies as a second degree recursion of the sequence $(x_{t-1}, \beta_{t-1})$:

$$s_t = f_2(f_1(x_{t-1}, \beta_{t-1})). \quad (4.8)$$

Hence, having knowledge of the previous state of the machine and the corresponding letter of the alphabet $\Pi_s$, we can define the current strategy using the recursive function (4.8). This model does not require any formal association with a statistical model, and does not even assume the existence of the latter [42]. The informational data stream produced by such a machine is

$$((x_0, \beta_0); s_1; (x_1, \beta_1); \ldots). \quad (4.9)$$

From (4.9) we conclude that the starting information needed to compute the first strategy is the pair $(x_0, \beta_0)$.
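A minimal implementation sketch of the machine (4.6)-(4.8) may look as follows; the alphabets, the next-state function $f_1$, and the output function $f_2$ are illustrative assumptions.

```python
# Sketch of the deterministic finite-state finite-memory machine
# (4.6)-(4.8): the next machine state is f1(x_{t-1}, beta_{t-1}) and the
# strategy is s_t = f2(beta_t). Alphabets and tables are illustrative.

SYSTEM_STATES = ("low", "high")           # the alphabet Pi_s
MACHINE_STATES = (0, 1, 2)                # Sigma_m

def f1(x, beta):
    # machine-next-state function: count consecutive "high" observations,
    # saturating at 2
    return min(beta + 1, 2) if x == "high" else 0

def f2(beta):
    # output function: strategies from U_T keyed by machine state
    return {0: "wait", 1: "prepare", 2: "act"}[beta]

def run(observations, beta0=0):
    beta, strategies = beta0, []
    for x in observations:
        beta = f1(x, beta)                # first rule of (4.7)
        strategies.append(f2(beta))       # second rule of (4.7)
    return strategies

print(run(["low", "high", "high", "high", "low"]))
# -> ['wait', 'prepare', 'act', 'act', 'wait']
```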
We also observe that the main drawback of such a deterministic model is the requirement to fix the strategy immediately when the state of the machine $\beta_t$ is given. Loosely speaking, some relaxation time for the transition $t \to t'$ should be incorporated into the model to allow strategy correction. Indeed, such time is implemented in probabilistic finite-state finite-action machines by the probabilities of the transition from one state of the system to another under certain actions of a controller or DM. However, if we know a priori that

$$P(\beta_{t-1} \to \beta_t \mid x_{t-1}, (s_t, x_t)) = 1, \quad (4.10)$$

or the time for such a transition is defined by a given time-interval, then the sequential decision scheme based on deterministic finite-state finite-memory machines is quite natural. If such information is not available a priori, then probabilistic finite-state finite-action machines appear to be useful in the analysis of system dynamics.
In the next sections we develop a technique to find a reasonable compromise between the two approaches described above.

THE PERTURBATION PARAMETER AS A FUZZY BORDER BETWEEN DETERMINISTIC AND PROBABILISTIC DESCRIPTIONS OF SYSTEM DYNAMICS
The major complexity in the mathematical modelling of dynamic systems arises from the a priori unknown character of the disturbance law. On the one hand, the implicit assumption of deterministic models on the existence of an associated optimal algorithm (like the assumption (4.10)) can hardly be justified in modelling complex processes and phenomena. On the other hand, the main difficulty in effective applications of probabilistic models arises from the question of how common the ergodicity of the Hamiltonian flow on the energy surface is [24]. As was pointed out, perturbations can qualitatively change the ergodic structure of the underlying dynamic system. Examples of Markov Chains with discontinuities in the stationary distribution of the perturbed system can be found, for example, in [52,1]. Furthermore, for any decomposition of such a chain into a finite number of independent ergodic subclasses (under the assumption $\epsilon \to 0^+$), examples of system instability can be constructed for arbitrarily small ε.
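A two-state illustration of such a discontinuity, in the spirit of the examples in [52,1], is sketched below: for every ε > 0 the perturbed chain has a unique stationary law, but this law does not reproduce the ε = 0 ergodic structure, where the long-run behaviour depends on the initial state. The rates used are illustrative.

```python
import numpy as np

# Two-state Singularly Perturbed Markov Chain: at eps = 0 the chain P = I
# has two ergodic classes {0} and {1}, and the long-run distribution
# depends on the initial state; for every eps > 0 the chain is ergodic
# with the stationary law (2/3, 1/3). The rates eps and 2*eps are
# illustrative.

def stationary(P):
    # left eigenvector of P for the eigenvalue 1, normalised to sum 1
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

for eps in (1e-1, 1e-4, 1e-8):
    P_eps = np.array([[1.0 - eps, eps],
                      [2.0 * eps, 1.0 - 2.0 * eps]])
    print(f"eps = {eps:.0e}: stationary law = {stationary(P_eps).round(4)}")

# As eps -> 0+ the stationary law stays at (2/3, 1/3), yet at eps = 0
# exactly the chain started in state 0 remains in state 0 forever: the
# ergodic structure is discontinuous at eps = 0.
```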
Degree of Recurrence in Mathematical Models for Evolution

An idealization of "unperturbed" mathematical models obtained in the limit of vanishing perturbations $\epsilon \to 0^+$ can often help to better understand real-world phenomena and processes. However, it should be realized that such an idealization has limited applicability, and depends on quite restrictive mathematical assumptions related to the homogeneity of the environment of the system, and the uniformity of the density which characterizes the system or its parts.
Since for any model of a dynamic system with specified dynamic rules the parameter of perturbation may be small but is always positive, rescaling procedures for the Markov Chain associated with the optimization model may not provide an adequate approximation to the system dynamics. Such procedures may eventually ignore the neighbourhood structure of the chain. If such a rescaling (for example, the diffusion approximation) has been performed, then the original problem can be reformulated as an inverse problem with respect to a recursive function of density (1.10). The complexity of the solution of the inverse problem is determined by the degree of recurrence n and the topology of the space where the investigation is being conducted. Moreover, if the topology is a priori specified, then regularity assumptions on the function $f_n$ allow us to recover information on the regularity of the function $\phi$, at least in principle, for any arbitrarily big n, following certain logical rules. In models like (3.8), (3.9) and (3.21), (3.22), $f_n$ plays the role of the Hamiltonian function. Such models can be regarded as discrete optimization problems if we interpret the function $f_n$ as one that defines the top-level goal, whereas all functions $f_i$, $i = n-1, \ldots, 1$, are supposed to define certain subgoals. The definition of the density function provides constraints for such a problem of multicriteria optimization. From the physical point of view such problems require finding the minimum of the Hamiltonian of the system on the energy surface, and can be formulated as follows: given a finite (typically large) number n of subsystems of a big system, minimize an approximation to the system Hamiltonian on an approximating set of its energy surface. Now recall the definition of system entropy in statistical physics as a quantity that is uncertain up to an additive constant, is dependent on the choice of units, and is defined by the Liouville measure [36]:

$$\sigma = -\int f \log\{(2\pi\hbar)^s f\}\, dp\, dq. \quad (5.1)$$

Here s is the number of degrees of freedom of the system, and p and q are momentum and position variables. If we assume that the whole system entropy can be defined through the entropies of its subsystems as $\sigma = \sum_i \sigma_i$, then for any probability distribution $p = (p_1, p_2, \ldots, p_n)$ its associated information can be defined as the Shannon entropy [53,49]:

$$H(p) = -\sum_{i=1}^{n} p_i \log p_i. \quad (5.2)$$

The constant n in (5.2) can be approximated with respect to the required accuracy ε and is ultimately coupled to the definition of s in (5.1). In the limit of "vanishing perturbations" $\epsilon \to 0^+$ and "maximum knowledge" $n \to \infty$, the Shannon entropy can be generalized to the continuous case of the Boltzmann-Gibbs entropy. The latter transformation requires a justification of system stability.
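The passage from (5.2) to the continuous case can be sketched numerically: discretizing a density on $[0, 1]$ into n cells, the Shannon entropy minus $\log n$ approaches the differential (Boltzmann-Gibbs type) entropy as $n \to \infty$; the density used below is an illustrative choice.

```python
import numpy as np

# Shannon-to-continuous-entropy sketch: discretise a density f on [0, 1]
# into n cells, compute H = -sum p_i log p_i as in (5.2), and subtract
# log n. As n grows this converges to the differential entropy
# -int f log f dx. The density f(x) = 2x is an illustrative choice.

f = lambda x: 2.0 * x                   # a density on [0, 1]

for n in (10, 100, 1000, 10000):
    edges = np.linspace(0.0, 1.0, n + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    p = f(mids) / n                     # cell probabilities (midpoint rule)
    p = p / p.sum()
    shannon = -np.sum(p * np.log(p))
    print(f"n = {n:5d}: H_Shannon - log n = {shannon - np.log(n):+.5f}")

# exact differential entropy of f(x) = 2x is 1/2 - log 2 = -0.19315
```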
From the physical perspective, the mathematical idealization of the two simultaneous limits $n \to \infty$ and $\epsilon \to 0^+$ requires an estimation of the number of degrees of freedom of the system in the definition (5.1). In this sense such an idealization is problem-specific, and always requires analysis of the stability of the measure.

Discrete Optimization and Evolution of Thermodynamic Systems

Any specific algorithm for the solution of the problem of modelling dynamic system evolution is affected by the form of the function $f_n$ (as a Hamiltonian approximation on the energy surface) and by the neighbourhood structure of the system evolution. In this sense an algorithm is always coupled to problem-specific information. In discrete optimization such algorithms can be conditionally divided into three main categories [10,50]: constructive algorithms (CAs) that require the construction of decreasing subsets of a given finite set of states Ω, each embedded in the previous one; sequential algorithms (SAs) that attempt to construct a path through Ω; and evolutionary algorithms (EAs) that manipulate sets of solutions in Ω.
Let us assume that, for any given state $x_t$ from Ω that characterizes the whole system, there is a neighbouring set of states $N_{x_t}$ to which transitions from $x_t$ are allowed. Then CAs usually apply a "greedy" policy: starting from $x_0 \in \Omega$, they choose at stage n an $x_{n+1}$ such that

$$\mathcal{E}(x_{n+1}) = \min\{\mathcal{E}(\xi): \xi \in N_{x_n}\}, \quad (5.3)$$

where $\mathcal{E}$ is an energy functional. Mathematically speaking, we expect that, given Ω and an accuracy ε > 0, we can find a solution, at least in principle, when $n \to \infty$. However, it is well known that as a result of such a policy CAs may relatively easily be trapped in a local minimum of $\mathcal{E}$. If $\mathcal{E}$ is assumed to be continuous and Ω is a "rich" enough set, then in general the degree of recursion in (1.10) tends to infinity and we theoretically face infinitely many optimization problems (5.3). By now it is clear that without an appropriate analysis of the structure $N_{x_t}$, the success of such algorithms cannot be guaranteed. As we pointed out earlier, such analysis has to be conducted with respect to a given ε.
The main advantage of SAs is based on the fact that they do not exclude the theoretical possibility of the occasional acceptance of new states that may increase the energy functional [43]. (There are classic examples of SAs, like the steepest-descent method, that potentially have the same problems as CAs.) We also assume that an "initial" solution $x_0 \in \Omega$ may be given (for example, obtained by a CA). When moving to a neighbouring solution $x' \in \Omega$, the structure of the neighbourhood of the solution should be carefully analysed to avoid the difficulty of CAs. The basic idea for such an analysis came from statistical physics. The growing complexity of the solution of deterministic equations of motion for a system of many subsystems (such as particles) has led to the idea of ensemble averaging instead of classic-mechanical averaging in time. As the number of subsystems increases dramatically, Monte-Carlo and particle-type simulations [27] eventually remain the only algorithmic procedures that can be applied in theoretical generality. However, such procedures may encounter serious difficulties in non-equilibrium thermodynamics [48]. In the search for alternative approaches to ensemble averaging, many useful ideas have been generated during recent years. The intrinsic ability of Markov Chains to form a canonical Gibbs ensemble numerically has led to growing interest in the subject [19,35]. Using the principles of statistical physics we can assign to each state $x_t \in \Omega$ the probability

$$p(x_t) = \frac{\exp[-f(x_t)/T]}{\sum_{\xi \in \Omega} \exp[-f(\xi)/T]}, \quad (5.4)$$

where $f(x_t) = \mathcal{E}(x_t)/k$. The quantity $\mathcal{E}(x_t)$ can be interpreted as the potential energy of each state (or subsystem) in phase space that belongs to an ensemble. The probability that a system belongs to the ensemble is proportional to $\exp[-\mathcal{E}/(kT)]$, where k is the Boltzmann constant. We observe that the smaller T > 0 is, the more evident is the tendency of the Gibbs distribution defined by (5.4) to be concentrated on states $x_t$ with small values of $f(x_t)$. Hence, if we could simulate the cooling of the
system, a state of minimum energy may, in principle, be obtained, provided that the Markov Chain converges (in distribution) to the Gibbs (stationary) law. This allows us to consider CAs as a partial case of this general interpretation, when a Markov Chain is run for $T \to 0^+$. The other extreme case of the "high T limit" ultimately leads to the idea of dynamic continuity.
In such a case all states are assigned the same probability, and evolution is thought of as moving from a state to its neighbours uniformly. The computational implementation of the above idea is provided by the simulated annealing algorithm, first proposed in [31]. For a real physical system, if the temperature is lowered too rapidly, the system may be trapped in a local energy minimum.
However, the choice $T_n = c/\log n$ with a sufficiently large c can theoretically guarantee the system's "escape" from a local minimum [21]. In practice, the algorithm works as follows. If for the time-index n the state $x_{t_n}$ is given, then from the set $N_{x_{t_n}}$ we choose a state ξ, calculate $\Delta f = f(\xi) - f(x_{t_n})$, and set

$$x_{t_{n+1}} = \begin{cases} \xi & \text{with probability } p = \exp(-\Delta^+/T_n), \\ x_{t_n} & \text{with probability } 1 - p, \end{cases}$$

where $\Delta^+$ is $\Delta f$ when $\Delta f$ is positive, and zero otherwise.
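A compact sketch of this annealing step with the logarithmic schedule $T_n = c/\log n$ is given below; the energy landscape, the neighbourhood width, and the constant c are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated annealing sketch with the schedule T_n = c / log n: a
# candidate is drawn from the neighbourhood of the current state, and an
# increase of f is accepted with probability exp(-Delta+/T_n). The double
# well below (local minimum near x = -1, global near x = 1) is an
# illustrative landscape.

f = lambda x: (x ** 2 - 1) ** 2 - 0.3 * x
states = np.linspace(-2.0, 2.0, 401)

def neighbours(i, width=5):
    lo, hi = max(0, i - width), min(len(states) - 1, i + width)
    return [j for j in range(lo, hi + 1) if j != i]

i = 100                                  # start at x = -1, the "wrong" well
c = 2.0
for n in range(2, 20000):
    Tn = c / np.log(n)
    j = rng.choice(neighbours(i))
    delta = f(states[j]) - f(states[i])
    if delta <= 0 or rng.random() < np.exp(-delta / Tn):
        i = j

# with the slow logarithmic schedule the chain can climb the barrier and
# often ends near the global minimum x ~ 1
print(f"final state x = {states[i]:.3f}, f(x) = {f(states[i]):.3f}")
```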
Of course, the choice of the neighbourhood structure is crucially important for the algorithm's performance. If the neighbourhood is chosen too small, then the resulting simulated Markov Chain may move very slowly around in the search for the minimum. On the other hand, if the neighbourhood is chosen too large, then the process eventually performs a "blind" random search throughout Ω. It samples randomly from a large portion of the state space, and every next possible state is chosen practically uniformly over the whole set Ω. In the extreme case it may happen that $N_{x_t} = \Omega$. The conclusion to be drawn from the above consideration is that the choice of neighbourhood should be adapted to the approximation of the energy functional (or system Hamiltonian) in the search for a compromise between these two extremes.
The first step towards such an adaptation is realized in EAs. Typically, EAs deal with a population of solutions instead of a single partial solution, as in CAs or SAs. The most important advantage of EAs consists in allowing an exchange of information between solutions in the current population (a cooperation step during the "generation cycle"). The main problems for EAs are related to the self-adaptation step, when the solution's internal structure may be changed without interaction with other members of the population. When there are a lot of replicates of the same solution in a population, EAs may converge prematurely, which is usually called a diversity crisis. In such situations EAs are not competitive with the best versions of SAs.
Let us summarise the definitions of strategies in the above three classes of discrete optimization algorithms:

$$s_t = \mathcal{F}_1(x_{t-\Delta t}, \epsilon) \ \text{for CA}, \qquad s_t = \mathcal{F}_2(x_{t-\Delta t}, N_{x_{t-\Delta t}}, \epsilon) \ \text{for SA}, \qquad s_t = \mathcal{F}_3(X_n, \epsilon) \ \text{for EA}. \quad (5.5)$$
Here Δt > 0 is a relaxation time coupled to the algorithm performance when ε > 0, and $X_n$ is a population of solutions for the nth generating cycle. The functions $\mathcal{F}_i$, i = 1, 2, 3, are algorithm-specific. In general, they can be regarded as recursive functions of energy functionals and of the set of initial approximations $X_\epsilon$ for the specific algorithm:

$$\mathcal{F}_i = f_{n_i}(f_{n_i - 1}(\cdots(f_1(X_\epsilon, \epsilon))\cdots)). \quad (5.6)$$

At any specified moment of time t, the definition of the strategy $s_t$ implies a coupling rule between ε and $n_i$. The definition of such a coupling leads to the well-posedness of the problem. In this sense, the well-posedness of limiting models based on the assumptions $\epsilon \to 0^+$ and $n_i \to \infty$ is totally dependent on complete information about the initial conditions of the system, and on a precise definition of the energy functional. The process of constructing mathematical models is always a competition between (i) an approximation of the system-environment boundary interface (which involves the system's internal time [44]), and (ii) the conservation laws for integral characteristics of the system (which involves the modeller's time [39]). As a result of such a competition, the resulting mathematical models simulate the coupling of the system to its environment, and can be considered as models of neither isolated nor closed systems. A formal expression of the competition is provided by the physical concept of relaxation time. Having captured the notion of information formally in the mathematical model, its numerical expressions can be used in decision making with uncertainty, characterised by the adequacy of the simulation of the system-environment coupling. In general, numerico-logical methods can be used effectively only if an appropriate model has been constructed.
Hence, the quality of an algorithm depends decisively on an adequate reflection of the system-environment coupling in the mathematical model. If constructing a model is an art rather than a science, then the latter formally begins from the derivation of an algorithm from the model [56].
In concluding this section, it should be emphasized that the quality of a mathematical model for dynamic system evolution is decisively dependent on (i) the approximation of the initial conditions for the system, and (ii) the approximation of the system-environment boundary interface. To minimize such dependency, the solution of a sequence of optimization problems can be used as an alternative to the approach of limiting rescaling procedures. Such an approach seems to be more physically reasonable, since a priori information about the system can be given only as a certain possibilistic distribution, which allows us to select a new distribution according to certain principles [15,49].

COUPLED MATHEMATICAL MODELS OF MACRO- AND MICRO-EVOLUTION
The complexity in identifying a "hard boundary" interaction between a system and its environment is eventually determined by the degree of recurrence in the definition of the system Hamiltonian. Such a definition should be given with respect to the upper bound of the error ε in the identification of the set of initial conditions X. Since, in general, perturbed and unperturbed models might give rise to qualitatively distinct types of descriptions of system behaviour for any arbitrary ε > 0, the perturbation parameter alone cannot be an appropriate characteristic of the model's uncertainty. We observe that perturbations are an important part of the system dynamics which cannot be appropriately formalized in mathematical models unless we regard the mathematical modelling of dynamic system evolution as a decision making process with limited information from the very beginning of the modelling process. Additional information about the system becomes available in time, at stages, due to model-associated computations, observations and measurements. Hence, to approximate the dynamic system evolution, it is essential to take into consideration the fact that initial information about the system can only be given approximately.
A mathematical formalization of such approximations is a challenging problem that requires new approaches.
On one hand, the idea of sequential approximation and the hyperbolicity of the underlying differential equations is an intrinsic element of recent investigations in the foundations of physics [46,30]. On the other hand, rescaling procedures allow us to construct mathematical models which are essentially parabolic by their nature. Moreover, the latter have proved to be a very useful tool for investigating the laws of nature. Although such rescaling procedures are always connected with the loss of some information, a justification of parabolic approximations of dynamic system evolution may be obtained if we assume that there exists a system density f on the Gibbs phase space P such that its associated index of probability is given by $\log f$. In general this allows us to consider the definition of entropy in the Gibbs form,

$$H(f) = \int_P \eta(f)\, \mu_\ell(dx_t), \quad (6.1)$$

instead of the definition (5.1), where η is defined by (1.9). Such a formal identification of a (thermo)dynamic system with a probability space is based on the Gibbs conjecture. Namely, we assume that the appropriate description of a macroscopic system in thermodynamic equilibrium may be provided by certain probability measures on the phase space of the system. Although this conjecture has never been rigorously proved [24,39,40], the passage from (5.1) to (6.1) is not without certain gains. It provides a convenient framework for the development of a mathematical theory of dynamic systems, allowing the formulation of the concepts of ergodic theory that express at least some aspects of irreversible thermodynamic evolution [45]. However, the introduction of a recursion function using the Lebesgue measure $\mu_\ell(dx_t)$ does not answer the question of stability for a "projection" of the Liouville measure (for a system with a certain number of degrees of freedom (5.1)) onto the energy surface using a sequence of Gibbs measures that deal with microcanonical ensembles.
As we explained above, from the physical point of view we should approximate the system Hamiltonian on the energy surface, which is itself also subject to an approximation. Hence, mathematically speaking, to rigorously justify models arising from the application of the Gibbs conjecture, we should be able to construct both the forward-evolution model and its associate for the backward evolution, as we explained in Section 3. Gibbs was the first who arrived at the concept of mixing, and who noticed that the very use of probabilities in the description of physical states implies a time asymmetry [45]. In turn, the latter implies reversibility of distribution functions in a mathematical sense, as well as a forgetfulness property with respect to the initial conditions of the system in the flow of time. Such a reversible time-asymmetry in the mathematical theory of dynamic systems is in contrast with the irreversible character of evolution implied by the second law of thermodynamics and Eddington's time arrow. The complexity of the mathematical formalization of evolution irreversibility was well understood by J. Gibbs, who wrote [22]:

it should not be forgotten, when ensembles are chosen to illustrate the probabilities of events in the real world, that while the probabilities of subsequent events may often be determined from the probabilities of prior events, it is rarely the case that probabilities of prior events can be determined from those of subsequent events, for we are rarely justified in excluding the consideration of the antecedent probability of the prior events.
Almost a century ago he clearly pinpointed that the main difficulty in a mathematical formalization of backward-evolution models lies in the complexity of a probabilistic description of the initial conditions for the dynamic system, even if the probability of a terminal event is assumed to be given a priori. At the same time he proposed an approach that allows the effective construction of a framework for a formal separation of the "observer" from the "modeller", and of the system from its environment. Such a construction plays a resolving role in mathematical modelling and computational experiments. In fact, if the conjecture is accepted, the "modeller" (at least in principle) can perform a task in the "best" possible way, and the idea of excluding the "observer" from the intermediate process of computations (except at the very beginning and the very end of this process) becomes natural [60]. Then the whole time-set of the evolution of a dynamic system may be associated exclusively with the "modeller" as an "error-nulling" optimizing device. The existence of such a device depends on the existence of an error-free model of dynamic systems, which in turn eventually depends on the definition of a sequence of switching events, or a time-partition, at which the "modeller" may become the "observer" and vice versa.
Starting from this idea we can introduce the notion of a Generalized Dynamic System (GDS), where the decision maker (modeller/observer or problem solver) is considered as an intrinsic part of the model [39]. The basic steps of such a model construction are as follows: first, we consider the mathematical model of a dynamic system
$$e_{n+1} = H(v, e_n), \qquad n = 0, 1, \ldots \qquad (6.2)$$
as a mapping that couples two space-time events of the system evolution by a function of the perturbed velocity $v$ and the system's Hamiltonian, or its approximation, $H$. Then we specify a sequence of events $(e_0, e_1, \ldots)$ by temporal evolution. In practice such a specification is always an approximation, for both the probabilistic and the deterministic approaches. We assume that the basic features of the dynamic rules that govern a system can be appropriately described by a velocity function $v_1$. Furthermore, we allow the possibility of a "correction" of these dynamic rules by another dynamic, which is specified by another velocity function $v_0$. Formally, $v_1$ can be seen as a higher, but a priori unknown, degree of recursion of the function $v_0$. As a result, we arrive at the two coupled sequences
$$(x_0, x_1, \ldots) \quad \text{and} \quad (h_0, h_1, \ldots). \qquad (6.3)$$
When $n \to \infty$ and $\varepsilon \to 0^+$ we expect that the sequences (6.3) merge, producing events that can be characterized by the limit of the model (6.2).
Since neither the degree of recursion nor the level of perturbations is known a priori, we formalize the dynamics of the system by the two equations
$$x_{t+1} = H_1(v_1, x_t), \qquad h_{\tau+1} = H_0(v_0, h_\tau), \qquad (6.4)$$
where $H_1$ is an approximation to $H$ and $H_0$ is an operator for sequential corrections of such an approximation. If we assume that in principle the system dynamics can be described with arbitrary accuracy, then the first equation of the system (6.4) should in the long run be practically independent of $v_0$. Such a limiting case corresponds to viewing perturbations as a force "continuously" external to the system. However, in general, both functions $v_0$ and $v_1$ are perturbation-dependent. Thus, the system (6.4) provides the possibility of looking at the coupling between the velocity of the perturbed system and perturbations of its environment. It is assumed that in general such coupling can be looked at in two different space-time frames of reference, macroscopic and microscopic. One possible direction in the development of the theory of dynamic systems was provided by the celebrated Gibbs conjecture which we mentioned above. This led naturally to the idea of the control of dynamics described adequately (for example, in the almost-everywhere sense) by the first equation of the system (6.4) or its consequences, some of which we have considered in previous sections. Under this approach the mathematical formalization of the decision rules needs some a priori assumptions on the smoothness of the function (or functions) that provides (or provide) an approximation to the recursive function $H$. It is precisely these assumptions which formally allow the use of perturbation theory in the investigation of the underlying dynamic problems. In this way we "localize" the problem of scale interactions into a perturbation parameter $\varepsilon$ which stores information about the complexity of the problem, no matter how big the degree of recursion $n$ really is. From this point of view it seems reasonable to look at the classical systems of the theory of singular perturbations (like (1.3), (3.24), (3.25)) as those that may be obtained as a partial case of (6.4) by appropriate rescaling procedures. More precisely, if $\varepsilon$ is interpreted as a force which is external to the system, then in the limit $\varepsilon \to 0^+$ the classic models in the theory of singular perturbations may be regarded as an infinite-recursion decision rule.
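The coupled recursion (6.4) can be illustrated directly. The following minimal sketch iterates the two update maps side by side; the linear forms chosen for $H_1$, $H_0$ and the velocity functions are hypothetical placeholders for illustration, not the models of this paper.

```python
import numpy as np

def evolve_gds(x0, h0, v1, v0, H1, H0, n_steps):
    """Iterate the coupled recursions (6.4):
    x_{t+1} = H1(v1, x_t)  -- approximate macro-dynamics,
    h_{t+1} = H0(v0, h_t)  -- sequential correction of that approximation.
    Returns the two coupled sequences (6.3)."""
    xs, hs = [x0], [h0]
    for _ in range(n_steps):
        xs.append(H1(v1, xs[-1]))
        hs.append(H0(v0, hs[-1]))
    return np.array(xs), np.array(hs)

# Hypothetical linear update maps, for illustration only.
H1 = lambda v, x: x + 0.1 * v(x)   # macro step driven by velocity v1
H0 = lambda v, h: h + 0.1 * v(h)   # corrector step driven by velocity v0

xs, hs = evolve_gds(x0=1.0, h0=1.0,
                    v1=lambda x: -x, v0=lambda h: -0.9 * h,
                    H1=H1, H0=H0, n_steps=50)
# As the number of steps grows, the two sequences approach each other,
# mimicking the merging of the sequences (6.3) expected for (6.2).
```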
In the general case, however, the model (6.4) provides an interpretation of perturbations as a force intrinsic to the system. In this case it is reasonable to assume that both functions $v_0$ and $v_1$ depend on $\varepsilon$ for any interval of time.
Moreover, since the only available a priori information on $\varepsilon$ is its positiveness, we need to introduce a mapping to describe the behaviour of $\varepsilon$ while the system evolves. To put it differently, in order to perform, at least in principle, an infinite-recursion procedure when $\varepsilon \to 0^+$ and $n \to \infty$, we need some learning rules to be introduced into the model. In [39] it was shown that under quite general assumptions the optimal control problem (1.3), (1.5), (3.2) is reducible to a hyperbolic-type equation (the generalized energy equation)
$$\frac{1+v}{v}\,\frac{\partial \mu}{\partial t} + \frac{\partial \mu}{\partial x} + f_0 = 0 \qquad (6.5)$$
that has a unique generalized solution (in the sense of an integral identity). The unknown function was assumed to be Lebesgue integrable, that is $\mu \in L_1(Q)$, where $Q$ is the space-time region of interest. In the general case this function is referred to as the decision maker function. The interpretation of Eq. (6.5) as a partial case of the system (6.4) can be formally given as follows. We consider a mathematical model that consists of two parts: (i) an idealized equation for a phase point in the system's time (with a trajectory $h(\tau)$) associated with the centre of gravity of the system, and (ii) the macro-model of dynamic system micro-evolution in the decision-maker time "external" to the system (in terms of the decision-maker function $\mu$). Such a model of a Generalized Dynamic System couples two different space-time scales with the perturbed velocity function $v$ in its two different manifestations, micro-velocity $v_0$ and macro-velocity $v_1$:
$$\frac{\mathrm{d}h}{\mathrm{d}\tau} = v_0(\tau, h, \mu), \qquad (6.6a)$$
$$\frac{\partial \mu}{\partial t} + v_1(t, x, \mu)\,\frac{\partial \mu}{\partial x} = 0. \qquad (6.6b)$$
(We started from the consideration of the equations $\dot h = v_0(\tau, h, \mu)$ and $\dot x = -v_1(t, x, \mu)$.)
Hence, the model is constructed in such a way that both parts of the perturbed velocity function, $v_0$ and $v_1$, inherit their dependency on the decision-maker function. If two events (between which the GDS evolution has to be studied) are specified, then a pair of functions $(h(\tau), \mu(t, x))$ gives the solution to the problem. An approximation of such events can be given using a probabilistic connection between the micro and macro levels of the system description in the form of the complementarity principle (6.7): if the smaller velocity $v_0$ is assumed, then the bigger $\mu$ at the initial moment of time should be chosen. Hence, formally by (6.7), we postulate the existence of the system in a space-time of events with the corresponding probability at the initial moment $t_0$ of absolute DM-time for any arbitrarily small values of $v_0$. Since $v_0$ may be given only approximately, any approximation that follows from (6.6), (6.7) enables us to identify such an approximation with a Perturbed GDS (PGDS). In the limit of vanishing perturbations ($\varepsilon \to 0^+$) the model (6.6), (6.7) (PGDS evolution) formally converts into the model for Unperturbed GDS (UGDS) evolution and merges with the model (6.2). Therefore, in principle the model (6.5) can be obtained from (6.6), (6.7) using (6.6b) as a corrector for Eq. (6.6a). Such a corrector induces the presence of the goal function $f_0$ in Eq. (6.5). The main difficulty behind such a formal procedure is how to construct an appropriate corrector. From the probabilistic point of view this difficulty was dealt with by Gibbs. Of course, there do not exist two non-identical events (related to the present state of the system evolution and to its future or past behaviour) described by any mathematical model with the same probability exactly equal to 1. In reality, all constructions of mathematical models for dynamic system evolution start from a countable base in the space-time of events of PGDS evolution. At the next step, we approximate (6.2), and this "fuzzifies" the deterministic concepts of evolution into probabilistic descriptions of events. It should be noted, however, that the randomness of GDS evolution is induced by inherent approximations in the model construction and is not an independently established fact by itself. The lack of rigour in the description of a dynamic system by purely probabilistic models stems from the fact of such an approximation. On
the other hand, the main difficulty in applications of deterministic models lies in the construction of effective correctors that adequately describe the dynamic rules. In both situations the success of modelling is determined by the quality of an algorithm, which should be derived from the model using the concept of system stability.

COMPUTATIONAL MODELS AS MARKOV CHAIN APPROXIMATIONS
As soon as dynamic rules (with or without control) define a model for system evolution as a function of time $x_t^\varepsilon$, such a function becomes subject to intrinsic uncertainty for arbitrarily small intervals of time. This is a natural reflection of the approximate character of mathematical models, which can in principle be characterized by the degree $n$ of recursion of such a function with respect to the function of density. Since such a degree can rarely be given a priori, we can approach the solution of the problem by imposing an upper bound on $\varepsilon$. It seems natural that in applications to the real world, mathematical models of dynamic systems have to be understood as perturbed rather than unperturbed models. Of course, they will remain as such in the foreseeable future. In general, this precludes assumptions on the forgetfulness property for density distributions and, as a result, the Markovian property for the perturbed system dynamics $x_t^\varepsilon$.
Behind the complexity of the problem is the question of the system's stability. The idea which will be developed in what follows is to construct a Markov Chain approximation simultaneously with an approximation of the system (which depends on the Markov Chain parameters) so as to guarantee its stability. Hence the Markov Chain shall play the role of a learning rule for the system under an approximation of the perturbed system's velocity by its approximation $v_1$ in the macroscopic DM frame of reference. As a result of such a construction and the Markov theorem on the generalized law of big numbers, the pair of functions $(h(\tau), \mu(t, x))$, which describes the process of GDS evolution, shall possess the Markovian property. Furthermore, it is proposed to approximate this process by a pair of discrete functions $(h_n, \mu_n)$, where $\xi_n^h$ is an associated (with the microscopic frame of reference) Markov Chain state.
Let us consider the PGDS described by the generalized energy equation (6.5) in the form
$$\frac{\partial \mu}{\partial t} = -v_1(t, x, \mu)\,\frac{\partial \mu}{\partial x} - f_0(t, x, \mu). \qquad (7.1)$$
The approximation of the initial condition for this model is specified in the DM-time scale as
$$\mu(x, t)\big|_{t=t_0} = \mu_0(\varepsilon), \qquad (7.2)$$
where $\varepsilon$ depends on the approximation of the function $v_0$ in (6.7). Hence, formally, the model (7.1), (7.2) can be seen as a macro-model for GDS evolution. However, microscopic features of the dynamics are taken into account through the possibility of coupling between the parameter of system perturbations $\varepsilon$ and the decision-maker function $\mu$.
In what follows, a technique based on the construction of a hybrid-type algorithm [10] for the solution of this problem will be developed. The main results concern the derivation of a learning heuristic procedure that combines the effective features of (5.5), (5.6). To simplify the derivation, we explain the main ideas in the one-dimensional case, denoting a characteristic length of the system by $h$ and assuming that $h \ll T - t_0$. Let us consider the evolution of the system defined by the dynamic rules (7.1), (7.2) in a square region of the macroscopic frame of reference
$$G = \{(x, t) : \ x_0 < x < x_1, \ t_0 \le t < T\}, \qquad (7.3)$$
where the absolute DM-times of the initial ($t = t_0$) and terminal ($t = T$) events, as well as a position $x = x_0$ of the system, are specified. (Compare with random processes with Markov Chain parameters in continuous absolute time in [13,19] and references therein; the specification is induced by (i) an approximation of the system-environment boundaries at $t_0$ and (ii) corrections of the function $v_1$ by $v_0$.) If GDS
evolution takes place in $G$ under a certain level of perturbations $\varepsilon > 0$, then for this region the function $v_1$ depends on the DM-function $\mu$, which in turn depends on $v_0$ being subject to approximation from the initial moment of DM-time. Hence, we shall approximate the function $v_1$ with respect to our approximation of the function $v_0$ in a recursive manner. First we introduce the discrete grid in the region (7.3):
$$\omega_{\tau h} = \{(x_i, t^j) : \ x_{i+1} = x_i + h_i, \ t^{j+1} = t^j + \tau_j, \ i = 0, \ldots, n-1, \ j = 0, \ldots, m-1, \ t^m = T\}, \qquad (7.4)$$
and consider an elementary space-time cell $c_{ij} = [x_i, x_{i+1}] \otimes [t^j, t^{j+1}] \subset G$. The nodes of the grid (7.4) connect events relevant to the system evolution. We shall refer to the whole set of such events in $\omega_{\tau h}$ as the set of macroscopic events. Let $t^j$ and $t^{j+1}$ be two moments of absolute time (defined by the DM) that correspond to two subsequent macroscopic events $e_j$, $e_{j+1}$ of system evolution. Since the process $(x_t, \mu_t)$ is assumed to be Markovian, these events can be specified by two pairs of discrete functions $e_j = (\xi_j^h, \mu(x_i, t^j))$, $e_{j+1} = (\xi_{j+1}^h, \mu(x_{i\pm1}, t^{j+1}))$, where $\xi_j^h = x_i$ and $\xi_{j+1}^h = x_{i\pm1}$ are states of the associated Markov Chain. To preserve the basic macroscopic features of the system, the values of the jumps $\Delta\xi_j^h = \xi_{j+1}^h - \xi_j^h$ of this chain should be subordinated to the corresponding approximation of the system-environment boundaries. For example, let the time spent to cover the characteristic length $h$ of the system be $\tau$. Then we formally express the idea of subordination in the definition which follows, where we consider the limiting case $\tau \to 0$ of such a subordination.

DEFINITION 7.1 Let $e_j = (\xi_j^h, \mu(x_i, t^j))$, $e_{j+1} = (\xi_{j+1}^h, \mu(x_{i\pm1}, t^{j+1}))$ be two subsequent macroscopic events of GDS evolution that happen with probability 1. Then the GDS velocity function between the macroscopic events $e_j$ and $e_{j+1}$ can be defined in an elementary space-time cell $c_{ij} \subset G$ as
$$v(t, x) = \lim_{\tau \to 0} \frac{E_{\mu(x_i, t^j)}\,\Delta\xi_j^h}{\tau}. \qquad (7.5)$$
The numerator under the limit in (7.5) is referred to as the velocity of the Markov Chain between two subsequent macroscopic events.
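Definition 7.1 can be checked empirically: for a chain with a known one-step jump distribution, the velocity (7.5) is the mean jump per unit of relative time. A minimal Monte-Carlo sketch follows, with a hypothetical three-point jump distribution chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def chain_velocity(jump_probs, jumps, h, tau, n_samples=100_000):
    """Monte-Carlo estimate of the GDS velocity (7.5): the expected jump
    of the associated Markov Chain, in units of the cell size h, divided
    by the relative-time increment tau."""
    dxi = rng.choice(jumps, size=n_samples, p=jump_probs) * h
    return dxi.mean() / tau

# Hypothetical chain: jump left, stay, jump right with fixed probabilities.
v = chain_velocity(jump_probs=[0.2, 0.3, 0.5], jumps=[-1, 0, 1],
                   h=0.01, tau=0.01)
print(v)   # close to (0.5 - 0.2) * h / tau = 0.3
```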
The definition of the velocity function as the most probable jump of the associated Markov Chain (the jump which minimizes the energy of the transition) gives a way to construct a stable approximation of the Hamiltonian of GDS evolution. We relate the macroscopic behaviour of the system to its microscopic characteristics defined in an elementary space-time cell $c_{ij}$. As a result, in any such cell the velocity of the associated Markov Chain is always greater than or equal to the GDS velocity defined by (7.5). Hence, if the process is approximated in $c_{ij}$, the Courant-Friedrichs-Lewy (CFL) stability condition [12]
$$0 \le \frac{v\tau}{h} \le 1 \qquad (7.6)$$
is satisfied automatically, regardless of the actual values of the velocity function in $c_{ij}$.

Remark 7.1 In the limiting case $h \to 0$, Definition 7.1 loses its meaning and a macroscopic system degenerates into a point. Mathematically, however, this situation is well-defined as $n \to \infty$ ($m \ge n$), which returns us to the model (6.2).
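In practice the CFL condition (7.6) is enforced by choosing the interpolation interval from the grid spacing and the fastest cell velocity. A small helper sketch under that reading; the function name and the CFL-number argument are illustrative.

```python
def cfl_time_step(v_cells, h, cfl_number=1.0):
    """Largest interpolation interval tau such that the CFL condition
    (7.6), |v| * tau / h <= 1, holds for every cell velocity."""
    v_max = max(abs(float(v)) for v in v_cells)
    return cfl_number * h / v_max

tau = cfl_time_step([0.4, -1.2, 0.9], h=0.01)   # -> 0.01 / 1.2
```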
Although formally definition (7.5) coincides with the ordinary definition of the velocity function under the assumption of continuity (an infinite number of microscopic events between $e_j$ and $e_{j+1}$), the latter is applicable only in the case when both of the following claims are justifiable: (i) knowledge of the "exact" Hamiltonian; (ii) knowledge of the initial conditions with "infinite precision".
Neither of these two can be guaranteed even for a simplified dynamic motion [44,45]. Whereas the classical definition of the velocity function relates microscopic points in the macroscopic frame of reference, (7.5) establishes a correspondence between two macroscopic events on the probabilistic basis of the microscopic events between them.
Hence, the GDS velocity is a measure of changes which take place on the microscopic level with respect to the macroscopic behaviour of the system.
If we assume that such changes are vanishing, then we can expect (see (7.6)) that
$$\lim_{\varepsilon \to 0^+} v(t, x) = 1. \qquad (7.7)$$
We call the mathematical idealization of evolution described by the model (6.2) with the limiting velocity defined by (7.7) an Infinite Length Unperturbed Markov Chain (ILUMC). The reality of perturbations ($\varepsilon > 0$) implies an approximation $v \approx v_1$ that leads to the computational idealization of an Infinite Length Perturbed Markov Chain (ILPMC). The approximate relationship
$$\lim_{n \to \infty} v_1(t, x, \mu) = v(t, x) \qquad (7.8)$$
reflects our endeavours to describe the evolution of a PGDS. In general, mathematical modelling of GDS evolution according to (7.8) implies an approximation of the macroscopic velocity function with respect to an inevitable approximation of the function of micro-velocity. Such an approximation can be seen as the choice of a countable base in a topological space that induces a transformation from the space-time of events of a PGDS to a discrete space-time of macroscopic events of this system's evolution. This assumes a passage from the grid of macroscopic events $\omega_h$ defined by (7.4) to a new grid, the nodes of which are computational models of these events defined by a topology base in the macroscopic frame of reference.
A consideration of the space-time as a causal discrete set was the subject of many publications (see, for example, [7,9] and references therein).
Recently some new theoretical results on dynamic system discretizations on lattices have been obtained [14]. Below we formalize these ideas with respect to our models using the Markov Chain approach. First, the state space of the initial macroscopic event $e_0$ has to be specified with respect to the absolute time of the decision maker and an approximation of the system-environment boundary at the initial moment of such absolute time.
In the case of a one-dimensional approximation we define this space in the macroscopic frame of reference as
$$\Xi(i; 0) = \{x_i, \ i = 0, 1, 2, 3, \ldots, N; \ N = 2n, \ n = v(T - t_0)/h\}. \qquad (7.9)$$
We assign to each state of $\Xi$ a particular probability weight $p_i^0$, which can be defined on the basis of the micro-velocity approximation with the property of decreasing probabilities
$$1 \ge p_0^0 \ge p_1^0 \ge p_2^0 \ge \cdots \ge p_N^0 \ge 0$$
(the theoretical limit of "infinite precision" is not excluded). Thus, to define the state space of a macroscopic event, we include a theoretical possibility of GDS evolution in each cell of the grid of macroscopic events. If $h_i = h$, $i = 0, 1, \ldots, n-1$, then $\max_j \tau_j \le h$, and the limiting case of equality leads to the consideration of a square grid ($m = n$) which has the resolution to identify any macroscopic event relevant to system evolution in $G$ when $n \to \infty$. This case implies $h \to 0$ (and, as a consequence, $\tau \to 0$), when the state space of the initial macroscopic event defined according to (7.10) degenerates into a ray, indicating the loss of connection between absolute DM-time and the relative time of the dynamic system. We can circumvent this problem of uncontrolled propagation of the initial uncertainty by a probabilistic description of macroscopic states, which are subject to conservation of the Markov condition on the basis of an appropriately constructed Markov Chain associated with the GDS evolution.

DEFINITION 7.2 A set of macroscopic events defined by a mapping $\omega_h \to \Xi(i; j)$, where
$$\Xi(i; j) = \{(x_i, t^j) : \ i = k, \ldots, 2n-k, \ j = k, \ k = 0, \ldots, n\}, \qquad (7.10)$$
is called the cone of macroscopic events of system evolution.
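The index structure of the cone (7.10) is easy to enumerate: each DM-time level strips one cell from both ends of the state space until only the target state remains. A minimal sketch:

```python
def cone_of_events(n):
    """Enumerate the cone of macroscopic events (7.10): at DM-time level
    j = k the admissible state indices are i = k, ..., 2n - k, so the
    state space loses one cell on each side per level."""
    return {k: list(range(k, 2 * n - k + 1)) for k in range(n + 1)}

cone = cone_of_events(3)
# level 0: i = 0..6 (N = 2n = 6), level 1: i = 1..5, ..., level 3: i = 3,
# the single state of the "one point target in absolute DM-time" case.
```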
Remark 7.2 The formula (7.10) in Definition 7.2 is given for a "one point target in absolute DM-time" and can be generalized to any target set, including a set of isolated points in the DM-time scale ($t = T$). This may be of great importance for some optimal control problems.

Our next step is an approximation of the macro-velocity function with respect to the micro-velocity using Definition 7.1. As a characteristic of the microscopic velocity function ($\varepsilon > 0$) we use a numerical index defined in the macroscopic frame of reference by the probability weights of the neighbourhood states of an associated Markov Chain.

DEFINITION 7.3 A Markov Chain $(\xi_n^h, \ n < \infty)$ is consistent with the Markov process $(h(\tau), \mu(t, x))$ defined by the mathematical model of GDS evolution (7.1), (7.2) if
$$E_{\mu(x_i, t^j)}\,\Delta\xi_j^h = v_1(x_i, t^j, \mu^j)\,\tau + o(h + \tau) \qquad (7.11)$$
and
$$\operatorname{cov}_{\mu(x_i, t^j)}\,\Delta\xi_j^h = o(h + \tau) \qquad (7.12)$$
hold. We refer to the condition (7.11) as the condition of local consistency, whereas (7.12) is referred to as the global consistency condition.

Remark 7.3 The equalities (7.11), (7.12) imply that the macroscopic properties of the system should not change dramatically on small (with respect to the whole evolution) DM-time-sets, although microscopic properties can vary significantly subject to the velocity function. Another way of putting it is that the consistency conditions, referring to the probabilistic microscopic level, make explicit the basic features of system evolution on the macroscopic level. The same role in physics is played by the second law of thermodynamics [45].
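For a three-point chain the consistency conditions (7.11), (7.12) reduce to elementary moment computations, as in the sketch below; the probabilities are illustrative placeholders.

```python
import numpy as np

def jump_moments(p_left, p_stay, p_right, h, tau):
    """First two moments of the chain jump for admissible increments
    -h, 0, +h. Consistency with (7.11)-(7.12) requires mean/tau to
    approximate v1 and the covariance (here: variance) to vanish
    faster than h + tau under grid refinement."""
    jumps = np.array([-h, 0.0, h])
    probs = np.array([p_left, p_stay, p_right])
    mean = probs @ jumps
    var = probs @ (jumps - mean) ** 2
    return mean / tau, var

v_eff, var = jump_moments(0.1, 0.4, 0.5, h=1e-3, tau=1e-3)
# v_eff approximates v1 at the grid point; var should be o(h + tau).
```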
In general, even if in the reality of dynamic system evolution there exists a uniform movement of the microscopic frame of reference with respect to the macroscopic one with a velocity $v_*$, and a linear dependency of the corresponding points $(x, t)$ and $(\tau, h)$, these facts can be established neither by mathematical modelling nor by a measuring experiment.
However, the limiting case of our consideration (when $h \to 0$ and hence $\tau \to 0$) implies that $\operatorname{cov}_{\mu(x_i, t^j)}\,\Delta\xi_j^h \to 0$ when $n \to \infty$. Of course, the infinite length Markov Chain is within the scope of the Markov theorem on the generalized law of big numbers. Therefore, if we construct a Finite Length Perturbed Markov Chain (FLPMC) with the properties (7.11) and (7.12), we can guarantee convergence of such an approximation to the ILPMC in the probabilistic sense of Theorem 7.1 when the number of macroscopic states $n \to \infty$. The limit passages
$$\text{FLPMC} \ \xrightarrow{\ n \to \infty\ } \ \text{ILPMC} \ \xrightarrow{\ \varepsilon \to 0^+\ } \ \text{ILUMC}$$
illustrate schematically the connection between the FLPMC, ILPMC and ILUMC. The approximation error of the FLPMC with respect to the ILUMC vanishes in the limit $\varepsilon \to 0^+$ and $n \to \infty$. In this case the macro-velocity of the system coincides (see (7.7)) with the velocity of the associated ILUMC. Any other case assumes a probabilistic description of physical states (see [44]) that can be associated with an appropriately constructed Markov Chain. This makes it necessary to transform the continuous space-time of the macroscopic frame of reference into the discrete space-time of macroscopic events of system evolution, that is, to construct the cone of macroscopic events. The base of this cone is subject to the implementation of the complementarity principle (6.7), which acknowledges the fact of the system's existence at the initial moment of DM-time with the corresponding probability. We note that, as an alternative approach, there is the theoretical possibility to control possible changes of the macro-velocity from the micro-level. In general, using an appropriate approximation (that is valid for the macroscopic level of system description), we can describe the event $e_0$ in two complementary forms: either a position-and-DM formulation, as $(x_0, 1)$, or a time-and-macro-velocity formulation. Theoretically, we can combine both approaches by considering the problem in terms of the macro-velocity and the DM-function, which corresponds to the specification of the event $e_0$ as $(0, 1)$. Such a consideration is typical for mathematical models in optimal control theory, where the decision maker plays the role of the "error-nulling" optimizing device of a modeller type. This approach can be regarded as the velocity-control formulation of evolutionary problems. An alternative consideration of the initial conditions as $(1, 0)$ seems to be intrinsic to the investigation of biological self-organizing dynamic systems. The DM in such cases can be associated with the "observer", and this approach can be formally regarded as the velocity-energy formulation of evolutionary problems. (Even when the classical concept of continuous phase-space trajectories can be formally applied, it does not give a way to specify the initial condition for the macro-model (7.1), (7.2).) To combine both possibilities in such a specification of the event $e_0$, computational models of dynamic system evolution should be derived. The main difficulty that immediately arises stems from the necessity of approximating the limiting behaviour, $n \to \infty$ under the possibility of vanishing perturbations ($\varepsilon \to 0^+$), for any dynamic system which evolves in space-time. The method proposed in this paper is based on such a construction of computational event-models in the cone of macroscopic events that preserves the stability property of the associated evolution. In general, such an approach permits the DM to switch from "observer" to "modeller" and vice versa whenever it is necessary.

To construct a stable approximation of the model (7.1), (7.2), the idea of the upwind discrete scheme with flux limiters [57] is used. Without loss of generality for the numerical procedure, we assume that $f_0 \equiv 0$, which reduces Eq. (7.1) to (6.6b). First, let us introduce in the cone of macroscopic events (7.10) a floating grid
$$\tilde\omega_{\tau h} = \{(x_i, t^j) : \ i = k, \ldots, 2n-k, \ j = k, \ k = 0, \ldots, n\}, \qquad (7.13)$$
where $t^j = t^{j-1} + \tau_{j-1}$ when $j > 1$, $t^1 = t^0 + \tau$ when $j = 1$, and $t^j = t^0$ when $j = 0$. Provided all $\tau_{j-1}$, $j = 1, \ldots, n$, $\tau_0 = \tau$ are defined, the grid (7.13) generates a set of approximations to the macroscopic events defined by $\Xi(i; j)$. Since, for a particular DM-time $t^{j-1}$, an associated event depends only on the macroscopic event that corresponds to the $t^{j-1}$-moment of DM-time, the value of $\tau_{j-1}$ is subject to stability conditions for the system. Such conditions depend on the velocity of the system, which is approximated using an evolution-associated Markov Chain. Now, if we denote the approximations to the $\mu$-function and to $v_1$ on $\omega_h$ by $d$ and $v$ respectively, then the upwind approximations
$$\frac{\partial \mu}{\partial x} \approx \begin{cases} (d_{i+1} - d_i)/h & \text{if } v_i < 0, \\ (d_i - d_{i-1})/h & \text{if } v_i > 0, \end{cases}$$
allow us to derive the discrete scheme
$$d_i^{j+1} = d_i^j - \frac{\tau_j}{h}\left[v_i^+(1 + \gamma_1 - \gamma_2)(d_i^j - d_{i-1}^j) - v_i^-(1 - \gamma_4 + \gamma_3)(d_{i+1}^j - d_i^j)\right], \qquad (7.14)$$
where $v^+ = \max[v_i, 0]$, $v^- = \max[-v_i, 0]$, and $\gamma_s$, $s = 1, \ldots, 4$, are flux limiters which are subject to definition with respect to the approximation of the velocity function; the other notations are the common ones. The discrete scheme (7.14) can then be rewritten in the form
$$d_i^{j+1} = \left\{1 - \frac{\tau_j}{h}\left[v^+(1 + \gamma_1 - \gamma_2) + v^-(1 - \gamma_4 + \gamma_3)\right]\right\} d_i^j + \frac{\tau_j}{h}\,v^+(1 + \gamma_1 - \gamma_2)\,d_{i-1}^j + \frac{\tau_j}{h}\,v^-(1 - \gamma_4 + \gamma_3)\,d_{i+1}^j. \qquad (7.18)$$
A verification of the sum of all coefficients of the unknown function on the right-hand side of (7.18) gives unity. Hence, provided the nonnegativeness conditions are satisfied, we can associate these coefficients with the transition probabilities of a Markov Chain. In fact, the conditions of nonnegativeness of the probabilities are
$$1 - \frac{\tau_j}{h}\left[v^+(1 + \gamma_1 - \gamma_2) + v^-(1 - \gamma_4 + \gamma_3)\right] \ge 0, \qquad 1 + \gamma_1 - \gamma_2 \ge 0, \quad 1 - \gamma_4 + \gamma_3 \ge 0. \qquad (7.16), (7.17)$$
The partial cases $v^- = 0$ ($v^+ \ne 0$) and $v^+ = 0$ ($v^- \ne 0$) simplify these conditions accordingly.
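With the limiters switched off ($\gamma_1 = \cdots = \gamma_4 = 0$), the scheme (7.14) reduces to the classical first-order upwind step. A minimal sketch under that assumption; the periodic boundary is added purely for convenience and is not part of the model above.

```python
import numpy as np

def upwind_step(d, v, h, tau):
    """One explicit step of (7.14) with all flux limiters set to zero:
    d_t + v(x) d_x = 0, the f0 = 0 case of (7.1). Assumes the CFL
    condition (7.6): max|v| * tau / h <= 1."""
    vp = np.maximum(v, 0.0)    # v+
    vm = np.maximum(-v, 0.0)   # v-
    dl = np.roll(d, 1)         # d_{i-1} (periodic wrap, for illustration)
    dr = np.roll(d, -1)        # d_{i+1}
    return d - (tau / h) * (vp * (d - dl) - vm * (dr - d))

# Advect a bump with unit velocity; CFL number 0.5.
x = np.linspace(0.0, 1.0, 101)
d = np.exp(-200.0 * (x - 0.3) ** 2)
v = np.ones_like(x)
h = x[1] - x[0]
tau = 0.5 * h
for _ in range(100):
    d = upwind_step(d, v, h, tau)
```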
Proof If the previous state of the Markov Chain was $\xi_{j-1}^h = x_i$ subject to the control $d_i^j$, then, according to the assumption of the Lemma, the expected jump of the chain is given by the transition probabilities of the scheme (7.18). This equality together with definition (7.11) completes the proof.

Remark 7.4 The Markov Chain velocity
$$v_{MC} = v^-(1 - \gamma_4 + \gamma_3) + v^+(1 + \gamma_1 - \gamma_2) \qquad (7.19)$$
between two successive macroscopic events coincides with the velocity of the process when $n \to \infty$. For any finite value of $n$ we have $v_{MC} \ge v$, which corresponds to the nonnegativeness of the covariance of the Markov Chain jump between these macroscopic events: evaluating this covariance with the transition probabilities of (7.18), the term $\tau v_{MC}^2$ gives the required equality (7.19) if we take into account (7.12).

Remark 7.5 For each cell $c_{ij} \subset \omega_h$ a probabilistic analogue of the characteristics of Eq. (7.1) can be defined by the equality
$$\operatorname{cov}_{\mu(x_i, t^j)}\,\Delta\xi_j^h + o(\tau + h) = \tau v_{MC} - \tau v = \text{const}. \qquad (7.20)$$
To estimate the value of the constant in (7.20) we can eliminate the term $o(\tau + h)$ in our approximation using (7.11) and (7.12):
$$\operatorname{cov}_{\mu(x_i, t^j)}\,\Delta\xi_j^h = \tau v_{MC} - \tau v.$$
Example 7.1 Examples of the choices of flux limiters are given below for two partial cases.
If $v^- = 0$ and $\gamma_2 = 0$ ($i = j$), then the value of the flux limiter $\gamma_1$ can be found from (4.16) in the form
$$\gamma_1 = 1 + \sqrt{8v^+ + \cdots}.$$
If $v^+ = 0$ and $\gamma_3 = 0$ ($i = N - j$), then the value of the flux limiter $\gamma_4$ is defined as
$$\gamma_4 = 1 + \sqrt{8v^- + \cdots}.$$
The identification of the flux limiters completes the construction of the discrete scheme, which defines the Markov Chain with the corresponding interpolation interval $\tau$ (subject to stability conditions) and transition probabilities. We state the result in the form of a theorem on the stability of the Markov-Chain approximation in the discrete space-time of events.

THEOREM 7.2 If the transition probabilities of a Markov Chain $(\xi_n^h, \ n < \infty)$ are defined by the coefficients of the scheme (7.18) for all $j = 0, \ldots, n-1$ and $i = j, N-j$ ($\gamma_2 = 0$ for $i = j$ and $\gamma_3 = 0$ for $i = N-j$), whereas the interpolation interval satisfies the conditions (7.16), (7.17), (7.23), then the Markov-Chain approximation of the process $(h(\tau), \mu(x, t))$ is stable, and discrete values of the DM-function can be found from the formula
$$d(x_k^{j+1}, t^{j+1}) = \sum_i p_{ki}^h\, d(x_i^j, t^j). \qquad (7.25)$$

Remark 7.6 (on convergence) When $n \to \infty$ the velocity of the Markov Chain converges to the velocity of the process in the sense of Theorem 7.1. If we consider, for example, a formulation of the problem in terms of velocity-control, then, due to the complementarity principle, the discrete function (7.25) converges to the decision maker function of the system.

Remark 7.7 (on numerical procedures) The numerical method proposed in this section is an explicit (evolution-forward) stabilization procedure where the DM-function is a stabilizing factor subject to the velocity of the system.

Remark 7.8 (on backward evolution operators and continuity of phase space trajectories) A probabilistic description of the event $e_{n_0}$ precludes the situation where terminating data for backward evolution procedures can be specified in a "deterministic" way. Moreover, the states $x(t_0)$ and $x(T)$ of the system in the DM-absolute-time scale can be characterised by different probability weights, which makes the continuity assumption for the connecting trajectory inapplicable in general.
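The reading of the scheme coefficients as transition probabilities can be made concrete. The sketch below builds one row of the stochastic matrix from the coefficients of (7.18) as reconstructed above and checks the nonnegativeness and normalization required by Theorem 7.2; the function and its default arguments are illustrative.

```python
import numpy as np

def transition_probabilities(v, h, tau, g1=0.0, g2=0.0, g3=0.0, g4=0.0):
    """One row of the transition matrix read off the scheme (7.18):
    probabilities of moving to the left neighbour, staying, and moving
    to the right neighbour. Zero limiters give plain upwind."""
    vp, vm = max(v, 0.0), max(-v, 0.0)
    c = tau / h
    p_left = c * vp * (1.0 + g1 - g2)    # coefficient of d_{i-1}
    p_right = c * vm * (1.0 - g4 + g3)   # coefficient of d_{i+1}
    p_stay = 1.0 - p_left - p_right      # coefficient of d_i
    probs = np.array([p_left, p_stay, p_right])
    # Conditions (7.16), (7.17): all coefficients must be nonnegative.
    assert np.all(probs >= 0.0)
    assert np.isclose(probs.sum(), 1.0)  # row of a stochastic matrix
    return probs

print(transition_probabilities(v=1.0, h=0.02, tau=0.01))  # [0.5, 0.5, 0.0]
```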

COMPUTATIONAL ASPECTS OF DISCRETE MARKOV DECISION PROCESSES
In the vicinity of any event $e_0$, which we might conditionally associate with the present of GDS evolution, there are infinitely many events relevant to the GDS evolution which might be called past and future events of the evolution. As a result, an event itself can be formalized mathematically neither with deterministic certainty nor with a precise probability. This implies a difficulty in justifying the separability of the topological spaces in which the evolution of UGDS and PGDS is investigated.
Let us denote the probabilistic error of the inevitable approximation of such an event in the initial conditions of a mathematical model by $\vartheta^\varepsilon \in (0, 1]$, $\varepsilon > 0$. Then the principal mathematical assumption which allows us to develop the analytical theory of dynamic systems in continuous (space-)time is a possibilistic assumption of vanishing error
$$\lim_{\varepsilon \to 0^+} \vartheta^\varepsilon = 0. \qquad (8.1)$$
Moreover, the concept of absolute or "external" to the system DM-time [45,38] leads to the theoretical possibility of predicting a future event $e_{n_0}$, which is associated with the DM-time $t = T$ (possibly $T = \infty$), with probability 1. This means that
$$\lim \vartheta_{n_0} = 0, \qquad (8.2)$$
where $\vartheta_{n_0}$ is the probabilistic error in the definition of this event. This approach (which is deterministic in its essence) usually visualizes evolution as a continuous trajectory $x(t)$ between the present $e_0$ (time $t = t_0$) and future $e_{n_0}$ (time $t = T$) events, along which positions of the system can be determined, at least in principle, with probability 1. Assuming that (8.2) holds, let us try to go backward in continuous DM-time. If the evolution of the system in continuous space-time has taken place at all, we can select between the events $e_{n_0}$ and $e_0$ at least $(n_0 - 1)$ events relevant to system evolution, which we will refer to as macroscopic. Further, we can extract between the macroscopic events $e_1$ and $e_0$ at least $(n_1 - 1)$ events relevant to system evolution, which we will call microscopic and will denote as $e_1^{0}, e_2^{0}, \ldots, e_{n_1-1}^{0}$. In the same way, we can find $(n_1 - 1)(n_2 - 1)$ sub-microscopic events $e_1^{00}, e_2^{00}, \ldots, e_{n_2-1}^{00}$, etc. As a result, we obtain a functional of the event-transition-error in the form
$$F(x, t) = \sum_{i=1}^{n_0 - 1} \vartheta_{i+1}(\vartheta_i) + \sum_{i=1}^{n_1 - 1} \vartheta_{i+1,0}(\vartheta_{i,0}) + \sum_{i=1}^{n_2 - 1} \vartheta_{i+1,00}(\vartheta_{i,00}) + \cdots, \qquad (8.3)$$
where, for example, the probabilistic error in a transition between the events $e_i^{00}$ and $e_{i+1}^{00}$ ($i = 1, \ldots, n_2 - 1$) is defined by $\vartheta_{i+1,00}(\vartheta_{i,00})$. To guarantee convergence of the series on the right-hand side of (8.3) we assume that (8.1) holds, so that we also have $\lim_{k \to \infty} \vartheta_k = 0$.
Applying the same arguments in the forward DM-time, we can draw the conclusion that for any "middle" macroscopic event $e_m \in (e_0, e_{n_0})$ (DM-time $t_m \in (t_0, T)$) both events $e_0$ (DM-time $t = t_0$) and $e_{n_0}$ (DM-time $t = T$) are infinitely far from it in the continuous (space-)time of events. However, in the macroscopic frame of reference the distances between the events $e_m$ and $e_0$, and between $e_m$ and $e_{n_0}$, are well-defined in terms of absolute DM-time by the intervals $\Delta_{0,m} = t_m - t_0$ and $\Delta_{m,n_0} = T - t_m$ respectively. In other words, provided that both assumptions (8.1) and (8.2) can be justified, any event $e_m$ of GDS evolution has two time-characteristics: the (absolute macroscopic) DM-time $t_m \in (t_0, T)$ and the (relative microscopic) system-time $\tau \in (-\infty, +\infty)$. The mathematical formalism that allows us to circumvent the arising difficulty of time scaling is based on Cauchy-type models, and requires an exact specification of initial (or terminal) conditions for the position-vector or the density function in a separable topological space. Eventually, a mathematically rigorous justification of such models requires the simultaneous application of the concept of a time-infinity (either in the form of ergodic-type hypotheses or of an infinite-step algorithm) and the possibility of vanishing perturbations as time goes by. Another way of putting it is that infinite time is a necessary condition for the justification of unperturbed mathematical models. However, the sufficiency of this condition is subject to possibility theory [15,49] rather than to the theory of probabilities. From the physical point of view the analysis of the described problem requires the concept of relative time. The mathematical idealization which reconciles the concepts of absolute and relative time of dynamic system evolution is the ILUMC in the continuous space-time of events, for which the claim $(t, \tau) \in (-\infty, +\infty)$ is natural. The very next step in the modelling of dynamic system evolution is the ILPMC. Such models imply an approximation of an event $e_0$ that formally gives two rays in the relative-time directions, $(-\infty, \tau_0)$ and $(\tau_0, +\infty)$. Our knowledge of the relative time $\tau_0$ is based on its intermediate influence on the quality of approximations of the objects of mathematical modelling with respect to the moment $t_0$ of absolute time. A selection of one of the two rays in the relative-time directions corresponds to the choice of a Markov semigroup [44] associated either with a covariance-non-negative (for the future) or a covariance-non-positive (for the past) Markov Chain. Whichever model is chosen, the Markovian property of the evolution should be preserved by an appropriate algorithm. This requires the consideration of perturbed mathematical models with a specified level of error.
An approximation of the event $e_0$ implies a truncation of the series $F(x, t)$ in (8.3). Let us denote the probabilistic error induced by such a truncation as $\vartheta_l > 0$.
Let us also assume that the limit of vanishing error
$$\lim_{l \to \infty} \vartheta_l = 0$$
holds. Then, in general, the quality of prediction of GDS evolution by means of mathematical modelling is defined by the quality of a solution of the optimization problem
$$\sum_{i=0}^{\infty} \vartheta_{i+1}(\vartheta_i) \to \min. \qquad (8.4)$$
Since the difference between an unperturbed trajectory $x_t$ of the ILUMC and a perturbed trajectory $x_t^\varepsilon$ of the ILPMC at a certain moment $t = t_m$ of DM-time can be arbitrarily big, the necessary condition for the convergence of the series (8.4),
$$\lim \vartheta_{i+1}(\vartheta_i) = 0,$$
cannot be guaranteed in general, no matter how small $\varepsilon > 0$ is assumed to be. This is not a surprising fact, since in general the optimizing function is a function of an infinite degree of recursion of the density function. The intrinsic idea in mathematical modelling and computational experiments is to reduce the degree of recursion to a finite number. In doing so we arrive at the problem
$$\sum_{i=0}^{n_0} \vartheta_{i+1}(\vartheta_i) \to \min,$$
which implies the construction of a FLPMC. Though the difference between two macroscopic states $x_k$ and $x_{k+1}$ in the DM-time scale might still be arbitrarily big in general (between two macroscopic events $e_k$ and $e_{k+1}$ there might be an infinite number of microscopic events relevant to system evolution), we are now able to estimate the probability of the corresponding transition using the values of $\vartheta_i$. By means of the FLPMC we preserve the stability of the macroscopic system (the object of mathematical modelling) with respect to its microscopic dynamics. Although stability of the microscopic dynamics with respect to the macroscopic system will follow in the limit of our construction, any finite-time computational procedure is not necessarily a reflection (even qualitatively) of the latter. To put it differently, using the tools of mathematical modelling, results generated by the ILUMC or the ILPMC (i.e. a complete description of GDS evolution) cannot be guaranteed with probability 1. If it is granted that mathematical modelling can give a way to describe real processes, systems, and phenomena, then a conceptually necessary passage from continuous trajectories ($x(t)$ or $x^\varepsilon(t)$) in absolute ("external" to the system) DM-time to a probabilistic description of physical states should be undertaken. A convenient framework for a probabilistic description of system evolution from one macroscopic event to another is provided by the concept of Discrete Markov Decision Processes (DMDP) [26]. Since a DMDP is considered in the macroscopic frame of reference, both the number of observed macroscopic events (which is finite, $n_0 = n_0(T, t_0, h_0)$) and the topology of the state space (which can change in general with respect to absolute DM-time due to fluctuating system-environment boundaries) depend on an approximation of the initial $e_0$ and terminating $e_{n_0}$ events. In the macroscopic frame of reference the DMDP is defined by its transition probabilities $p_l(k_l; k_{l+1})$
($k_l \in [n_l, N_l]$, $l = 0, \ldots, n_0$), in the cone of macroscopic events. In general the equality $p_{n_0-1, n_0}(k_{n_0-1}; k_{n_0}) = 1$ cannot be guaranteed, and the closeness of this probability to 1 depends on the values of $p_0(k_0)$ and on the structure of the cone of macroscopic events (i.e. on the approximation of $e_0$ and $e_{n_0}$). To single out, amongst all probabilistic trajectories defined by the DMDP (5.5), an optimal one, we define the probabilities of successful prediction level by level:
$$\tilde p_0(k_0) = \max_i p_0(k_0; i), \quad i = n_1, \ldots, N_1,$$
and then, recursively for subsequent levels,
$$\tilde p_l(k_l) = \max_i p_l(k_l; i), \quad i = n_{l+1}, \ldots, N_{l+1}.$$
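A greedy reading of this successful-prediction rule: at each DM-time level, follow the transition of maximal probability from the current state. A minimal sketch under that reading; the `transition` interface is a hypothetical stand-in for the DMDP transition probabilities.

```python
def predict_trajectory(p0, transition, n_levels):
    """Greedy successful-prediction rule: start from the most probable
    initial state and, at each DM-time level, take the transition of
    maximal probability. `p0` maps initial states to weights;
    `transition(l, k)` returns {next_state: probability} for state k
    at level l of the cone of macroscopic events."""
    k = max(p0, key=p0.get)            # most probable initial state
    path, prob = [k], p0[k]
    for level in range(n_levels):
        options = transition(level, k)
        k, p = max(options.items(), key=lambda kv: kv[1])
        path.append(k)
        prob *= p                      # probability of the whole trajectory
    return path, prob
```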

CONCLUSION
In this paper mathematical modelling of dynamic system evolution has been studied as a problem in information theory. Computational models for evolution based on the ideas of evolution-associated Markov Chain approximations have been developed. Since the velocity function of the system is coupled to perturbations of its environment, stability conditions for the system have been derived in an explicit form. Mathematical models for the evolution of dynamic systems are closely connected with discrete optimization problems through the definition of information and the associated notion of entropy for thermodynamic systems. Information uncertainty in knowledge bases influences the construction of mathematical models, and should be taken into account. This implies a certain heuristic nature in such a construction. Such heuristic approaches are an important part of studying dynamic system evolution, and will remain as such in the foreseeable future, supplementing achievements obtained with the increasing computational power of modern computers and improved methods of data collection and analysis. Moreover, hybrid procedures combining the features of constructive, sequential, and evolutionary algorithms of discrete optimization give a general framework that could challenge well-established techniques in optimization theory.
Many important breakthroughs in optimization theory are intrinsically connected with the application of algorithms of sequential analysis that are based on Markovian-type schemes. Such schemes are typical in computational models where minimax concepts of optimality are used. A mathematical formalization of the problem is quite natural, and is computationally consistent. The problem is viewed as attempts by the decision maker to obtain the best guaranteed result with respect to the available information about the problem. The same formalization is a starting point for constructing mathematical models where other (such as probabilistic) concepts of optimality are used. In applications of such decision-maker schemes there is a natural contradiction between a desire for informational completeness in the model that is being constructed and a desire to keep the model computationally tractable.
