A Shortest-Path Lyapunov Approach for Forward Decision Processes

In previous work, attention was restricted to tracking the net with a backward method that must know the target point beforehand (Bellman's equation). This work tracks the state space in a forward direction, and a natural form of termination is ensured by an equilibrium point p* (M(p*) = S < ∞ and p*• = ∅). We consider dynamical systems governed by ordinary difference equations described by Petri nets. The trajectory over the net is calculated forward using a discrete Lyapunov-like function, considered as a distance function. Because a Lyapunov-like function is a solution to a difference equation, it is constructed to respect the constraints imposed by the system (a Euclidean metric does not take these constraints into account). As a result, we prove natural generalizations of the standard outcomes for the deterministic shortest-path problem and shortest-path game theory.


Introduction
The shortest-path problem (see [1][2][3]) plays a fundamental role in Petri net theory, since it can be used to model processes. The analysis of these models can reveal useful information about the process. For example, deadlocks, equilibrium points, and so forth can be identified by computational analysis.
While it is possible to analyze such processes using the existing classical theory through Bellman's equation with the cost criterion ([4][5][6][7][8][9][10][11][12][13][14][15]), much of this theory has a few disadvantages. Bellman's equation is expressed as a sum over the states of a trajectory and needs to be solved backwards in time from the equilibrium point (target point). It results in an optimal function when it is governed by Bellman's principle, producing the shortest path needed to reach a known equilibrium point. Notice that the necessity of knowing the equilibrium point beforehand when applying the equation is a significant constraint, given that, in many practical situations, the state space of a Petri net is too large for an easy identification of the equilibrium point. Moreover, algorithms using Bellman's equation usually solve the problem in two phases [16]: preprocessing and search. In the preprocessing phase, the distance between each state and the equilibrium points (final states) of the problem is calculated in a backward direction. Then, in the search phase, these results are employed to compute the distance between each state and the equilibrium points, turning the search process into a forward search.
Tracking the state space in a forward direction allows the decision maker to avoid invalid states that occur in the space generated by a backward search. In most cases, forward search appears to be more useful than backward search. The explanation is that in the backward direction, when the case of incomplete final states arises, invalid states appear and cause problems.
The shortest-path problem [17, 18] can be classified into two key categories [19]: (a) the single-source shortest-path problem, where the goal is to find the shortest path from a given node to a target node (e.g., the algorithms of Dijkstra and Bellman-Ford); and (b) the all-pairs shortest-path problem, a similar problem in which the objective is to determine the shortest path between every pair of nodes in the net (e.g., the algorithms of Floyd-Warshall and Johnson).
We are concerned with the first case. However, we consider dynamical systems governed by difference equations described by Petri nets. The trajectory over the net is calculated using a discrete Lyapunov-like function. A Lyapunov-like function is considered as a distance function denoting the length from the source place to the equilibrium point. This work is concerned with the analysis of the decision process where a natural form of termination is ensured by an equilibrium point.
Lyapunov-like functions can be used as forward trajectory-tracking functions. Each applied optimal action produces monotonic progress towards an equilibrium point. Because such a function is a solution to the difference equation, it naturally leads the system from the source place to the equilibrium point.
It is important to note that there exist areas of research using Petri nets as a modeling tool where the use of a Lyapunov-like function is inherent. For instance, the "entropy" function is a specific Lyapunov-like function used in information theory as a measure of information disorder. The "Gibbs free energy function" is a Lyapunov-like function used in molecular biology for calculating the energy change in a metabolic network.
This paper introduces a modeling paradigm for shortest-path decision process representation in Petri net theory. The main point of this paper is its ability to represent both the characteristics related only to the global system behavior and those related to the trajectory-tracking behavior.
Within the global system behavior properties, we present notions of stability. In this sense, we call an equilibrium point a place in a Petri net whose marking is bounded and which is the last place in the net (a sink).
In the framework of the trajectory-tracking behavior properties, we define the trajectory function as a Lyapunov-like function. By an appropriate selection of the Lyapunov-like function, it is possible to optimize the trajectory. By optimizing the trajectory, we mean attaining the minimum trajectory-tracking value (in a certain sense). In addition, we use the notions of stability in the sense of Lyapunov to characterize the stability properties of the Petri net. The core idea of our approach uses a nonnegative trajectory function that converges in decreasing form to a (set of) final decision states. It is important to point out that the value of the trajectory function associated with the Petri net implicitly determines a set of policies, not just a single policy (in case several decision states can be reached). We call the "optimum point" the best choice selected from the final decision places that may be reached (to select the optimum point, the decision process chooses the strategy that optimizes the trajectory-tracking value).
As a result, we show that the global system behavior properties and the trajectory-tracking behavior properties of equilibrium, stability, and optimum-point conditions meet under certain restrictions: if the Petri net is finite, then a final decision place is an equilibrium point.
The paper is structured in the following manner. The next section discusses the motivation of the work. Section 3 presents the formulation of the decision model, and all the structural assumptions are introduced there, giving a detailed analysis of the equilibrium, stability, and optimum-point conditions for the global system behavior properties and the trajectory-tracking behavior parts of the Petri net. Section 4 presents the properties of the model. Finally, in Section 5, some concluding remarks are outlined.

Motivation
In this paper, we consider dynamical systems in which the time variable changes discretely and the system is governed by ordinary difference equations. Let us consider systems of first-order difference equations given by

s_{i+1} = f(s_i, a_i), i ∈ N,

where the s_i, i ∈ N, are the state variables of the system, s_0 is the initial state, and the a_i, i ∈ N, are the actions of the system. The system is specified by the state transition function f, which is always assumed to be one-to-one for any fixed a and i ∈ N, and continuous in all its arguments.
Lyapunov defined a scalar function L, called a Lyapunov-like function, inspired by a classical energy function, which has four important properties that are sufficient for establishing the domain of attraction of a stable equilibrium point: (a) ∃s* such that L(s*) = 0; (b) L(s) > 0 for all s ≠ s*; (c) L(s) → ∞ when s → ∞; and (d) ΔL = L(s_{i+1}) − L(s_i) < 0 for all i, s_i ≠ s*. Condition (a) requires the equilibrium point to have zero potential by means of a translation to the origin, (b) means that the Lyapunov-like function is positive definite, (c) means that s* is not reachable from states at infinity (the function is radially unbounded), and (d) means that the Lyapunov-like function has a minimum at the equilibrium point.
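As a minimal illustration (our own toy system, not from the paper), the conditions can be checked numerically along a simulated trajectory; here we assume the scalar system s_{i+1} = s_i/2 with equilibrium s* = 0 and candidate function L(s) = s²:

```python
# A minimal sketch (assumed toy system): checking the Lyapunov conditions
# numerically for s_{i+1} = s_i / 2 with equilibrium s* = 0 and
# candidate Lyapunov-like function L(s) = s^2.

def L(s):
    return s * s        # (a) L(0) = 0; (b) L(s) > 0 for s != 0

def f(s):
    return s / 2.0      # state transition function, contracts toward s* = 0

s = 8.0
trajectory = [s]
for _ in range(20):
    s = f(s)
    trajectory.append(s)

# Condition (d): Delta L = L(s_{i+1}) - L(s_i) < 0 away from the equilibrium
deltas = [L(b) - L(a) for a, b in zip(trajectory, trajectory[1:])]
print(all(d < 0 for d in deltas))   # True: L decreases monotonically
```

Any strict contraction toward s* would serve equally well; the point is only that the decrease condition (d) can be verified step by step along a realized trajectory.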
The main idea of Lyapunov is attained in the following interpretation: given an isolated physical system, if the change of the energy E for every possible state s is negative, with the exception of the equilibrium point s*, then the energy will decrease until it finally reaches the minimum at s*. Intuitively, this concept of stability means that a system perturbed from its equilibrium point will always return to it.
A system is stable [20, 21] if, for a given set of initial states, the state of the system is guaranteed (i) to reach a given set of states and stay there perpetually or (ii) to visit a given set of states infinitely often. The conventional notions of stability in the sense of Lyapunov and asymptotic stability can be used to characterize the stability properties of discrete event systems. An important advantage of the Lyapunov approach is that it does not require high computational complexity, but the difficulty lies in specifying the Lyapunov-like function for a given problem.
At this point, it is important to note that the Lyapunov-like function L is not unique; the energy function of a system, however, is one of a kind. A system whose energy E decreases on average, but not necessarily at each instant, is stable, but E is not a Lyapunov-like function. Lyapunov-like functions [22] can be used as trajectory-tracking functions and optimal cost-to-target functions. As a result of calculating a Lyapunov-like function, a discrete vector field can be built for tracking the actions over the net. Each applied optimal action produces monotonic progress (of the optimal cost-to-target value) toward an equilibrium point. In this sense, if the function decreases with each action taken, then it approaches an infimum/minimum (converging asymptotically or reaching a constant).
From what we have stated before, we can deduce the following geometric interpretation of distance [22]: (a) L(s) is a measure of the distance from the starting state s_0 to any state s in the state space (this follows from the fact that ∃s* such that L(s*) = 0 and L(s) > 0 for all s ≠ s*); and (b) the distance from the starting state s_0 to any state s_n in the state space decreases as n → ∞, because L(s_{i+1}) − L(s_i) < 0 for all i, s_i ≠ s*. A Lyapunov-like function can be considered as a distance function denoting the length from the initial state to the equilibrium point. It is important to note that the Lyapunov-like function is constructed to respect the constraints imposed by the difference equation of the system. In contrast, a Euclidean metric does not take these factors into account. For that reason, the Lyapunov-like function offers a better understanding of the concept of the distance required to converge to an equilibrium point in a discrete dynamical system.
By applying the computed actions, a kind of discrete vector field can be imagined over the search graph. Each applied optimal action yields a reduction in the optimal cost-to-target value until the equilibrium point is reached. Then, the cost-to-target values can be considered as a discrete Lyapunov function.
In our case, an optimal discrete problem, the cost-to-target values are calculated using a discrete Lyapunov-like function. At each step, a discrete vector field of possible actions is calculated over the decision process. Each applied optimal action (selected via some "criterion") decreases the optimal value, ensuring that the optimal course of action is followed and establishing a preference relation. In this sense, the criterion changes the asymptotic behavior of the Lyapunov-like function by an optimal trajectory-tracking value.
Usually, the criterion in optimization problems is related to the choice of whether to minimize or maximize. If the problem involves energy transformations, as is classically the case in control theory, then the criterion of minimization is applied. However, if the dilemma involves a reward, typical in game theory, then maximization is considered. In this work, we arbitrarily adopt the criterion of minimization.
The Lyapunov-like function can be employed as a trajectory-tracking function through the use of an operator representing the criterion, which selects the optimal action that forces the function to decrease and approach an infimum/minimum. It forces the function to make monotonic progress toward the equilibrium point. The Lyapunov-like function can be defined, for example, as

L*(s_i) = min_{a ∈ A} L(f(s_i, a)),

which means that the optimal action is chosen to reach the infimum/minimum. The function L* works as a guide, leading the system optimally from its initial state to the equilibrium point.
Example 1. To illustrate the shortest-path problem, let us consider a grid world (see Figure 1). At each time step, an agent is able to select an action among a finite set A of actions, for example, A = {Up, Down, Left, Right}. A transition model specifies how the world changes when an action is executed. An "equilibrium point" s* is a natural final state of the system. Therefore, the shortest-path problem is a search through the state space for an optimal path to the equilibrium point s*, using a deterministic transition model. The value of a state s is a number V(s) that, intuitively speaking, expresses the desirability of state s. For instance, let us consider the state-value function V being equal to the min function [23] as a specific Lyapunov-like function able to lead an agent to an equilibrium point in a grid world.
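A minimal sketch of this grid world, under our own assumptions (goal at (3, 3) and the Manhattan distance to the goal as the Lyapunov-like value V), shows how greedy forward descent on V reaches the equilibrium point:

```python
# Hypothetical sketch of Example 1: an agent greedily follows the action that
# minimizes a Lyapunov-like state value V(s), here the Manhattan distance to
# the equilibrium point s*. Grid size and goal position are assumptions.

GOAL = (3, 3)  # equilibrium point s* (assumed position)
ACTIONS = {"Up": (0, 1), "Down": (0, -1), "Left": (-1, 0), "Right": (1, 0)}

def V(s):
    # Lyapunov-like value: V(s*) = 0 and V(s) > 0 elsewhere
    return abs(s[0] - GOAL[0]) + abs(s[1] - GOAL[1])

def step(s, a):
    dx, dy = ACTIONS[a]
    return (s[0] + dx, s[1] + dy)  # deterministic transition model

s = (0, 0)
path = [s]
while V(s) > 0:
    # forward search: choose the action whose successor has the smallest V
    a = min(ACTIONS, key=lambda a: V(step(s, a)))
    s = step(s, a)
    path.append(s)

print(path[-1])  # the equilibrium point (3, 3)
```

Because V strictly decreases at every step, termination at s* is guaranteed without knowing the path in advance, which is the forward-tracking behavior discussed above.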
Example 2. The relative entropy or Kullback-Leibler [24, 25] distance between two probability distributions q^1_{ij|k} and q^2_{ij|k} is defined as

V(q^1, q^2) = Σ q^1_{ij|k} log(q^1_{ij|k} / q^2_{ij|k}).

In the above definition, we use the convention (based on continuity arguments) that 0 log(0/q^2_{ij|k}) = 0 and q^1_{ij|k} log(q^1_{ij|k}/0) = ∞. The relative entropy is always nonnegative and is zero if and only if q^1_{ij|k} = q^2_{ij|k}. V(q^1, q^2) is only a distance-like function between distributions, since it is not symmetric and does not satisfy the triangle inequality.
Example 3. The glycolysis pathway (see Figure 2) is well known and described [11, 26, 27]. It is a ten-step catabolic pathway that makes use of eleven different enzymes. The outcome is the conversion of glucose into two molecules of pyruvate with a concurrent net production of 2 ATPs. The glycolysis process can be divided into two stages: (1) the conversion of glucose to glyceraldehyde 3-phosphate with a required input of 2 ATPs, and (2) the conversion of glyceraldehyde 3-phosphate to pyruvate with a net output of 4 ATPs.
Glycolysis can be informally explained from an energetic perspective as follows. The initial amount of glucose may be represented as a ball at the top of an irregular hill. Each point where the ball bounces on the hill represents a reaction state in the breakdown of the sugar. Each bounce of the ball corresponds to a change in the free energy level. This energy change is modeled by the Gibbs energy function, which is a Lyapunov-like function. It is important to note that the bounces are irregular (reaching lower and higher energy levels) and determined by the environmental conditions. The final state (pyruvate) is represented by the bottom of the hill, where the ball reaches a steady state (no bounces).
Let us explain the Petri net dynamics of the system model as follows. Continuing with the ball-and-hill explanation, let us suppose that the ball, representing the product pyruvate, is at the bottom of the hill, and that there is no net force able to move the ball either up or down the hill. That means that the reactions (forward and backward) are evenly balanced. Therefore, the substances and products are in equilibrium, and no net dynamics will take place. That is, "the metabolic network system is in equilibrium."

Formulation
We introduce the concept of decision process Petri nets (DPPNs) by locally randomizing the possible choices for each individual place of the Petri net [23, 28].

Definition 1. A decision process Petri net is the tuple DPPN = {P, Q, F, W, M_0, π, U}, where (i) P is a finite set of places, (ii) Q is a finite set of transitions, (iii) F ⊆ I ∪ O is the flow relation, with I ⊆ P × Q and O ⊆ Q × P, (iv) W : F → N+ is a weight function, (v) M_0 : P → N is the initial marking, (vi) π : I → R+ is a routing policy representing the probability of choosing a particular transition, such that for each p ∈ P, Σ_{q_j : (p, q_j) ∈ I} π((p, q_j)) = 1, and (vii) U : P → R+ is a trajectory-tracking function.
We adopt the standard rules for representing nets as directed graphs, namely, places are represented as circles, transitions as rectangles, the flow relation by arcs, and markings are shown by placing tokens within circles [29]. As usual, we denote z• = {y | (z, y) ∈ F} and •z = {y | (y, z) ∈ F} for all z ∈ I ∪ O. A source place is a place p_0 ∈ P such that •p_0 = ∅ (there are no incoming arcs into place p_0). A sink place is a place p_f ∈ P such that p_f• = ∅ (there are no outgoing arcs from p_f). A net system is a pair Σ = (N, M_0) comprising a finite net N = (P, Q, F) and an initial marking M_0. A transition q ∈ Q is enabled at a marking M, denoted by M[q⟩, if for every p ∈ •q we have M(p) ≥ 1. Such a transition can be executed, leading to a marking M′ defined by M′ = M − •q + q•. We denote this by M[q⟩M′ or M[⟩M′. The set of reachable markings of Σ is the smallest (with respect to set inclusion) set [M_0⟩ containing M_0 and such that if M ∈ [M_0⟩ and M[q⟩M′ for some q ∈ Q, then M′ ∈ [M_0⟩.

The behavior of the DPPN is described as follows. When a token reaches a place, it is reserved for the firing of a given transition according to the routing policy π. A transition q must fire as soon as all the places p_1 ∈ P contain enough tokens reserved for transition q. Once the transition fires, it consumes the corresponding tokens and immediately produces an amount of tokens in each subsequent place p_2 ∈ P. When π(ι) = 0 for ι ∈ I, it means that there are no outgoing arcs from the place in the place-transitions Petri net (i.e., p ∈ ι is a sink).
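The enabling and firing rules above can be sketched as follows; the three-place chain is our own toy example, not a net from the paper:

```python
# Minimal sketch (names and net are assumptions) of the firing rule
# M' = M - pre(q) + post(q) for an ordinary Petri net, with places 0..2
# forming the chain p0 -> q1 -> p1 -> q2 -> p2.

# pre[q] / post[q]: tokens consumed from / produced into each place by q
pre  = {"q1": {0: 1}, "q2": {1: 1}}
post = {"q1": {1: 1}, "q2": {2: 1}}

def enabled(M, q):
    # q is enabled at M iff every input place holds enough tokens
    return all(M.get(p, 0) >= w for p, w in pre[q].items())

def fire(M, q):
    assert enabled(M, q)
    M2 = dict(M)
    for p, w in pre[q].items():
        M2[p] = M2.get(p, 0) - w   # consume tokens from input places
    for p, w in post[q].items():
        M2[p] = M2.get(p, 0) + w   # produce tokens in output places
    return M2

M0 = {0: 1}             # initial marking: one token in the source place
M1 = fire(M0, "q1")     # token moves from p0 to p1
M2 = fire(M1, "q2")     # token moves from p1 to p2, a sink
print(M2)               # {0: 0, 1: 0, 2: 1}
```

In this toy chain, p2 is a sink with a bounded marking, so the token getting "blocked" there is exactly the equilibrium behavior discussed later.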
In Figure 3, we represent partial routing policies π that generate a transition from state p_1 to state p_2, where p_1, p_2 ∈ P, as follows.
Case 1. The probability that q_1 generates a transition from state p_1 to p_2 is 1/3. But, because transition q_1 has two arcs to state p_2, the probability of generating a transition from state p_1 to p_2 is increased to 2/3.

Case 2. We set by convention the probability that q_1 generates a transition from state p_1 to p_2 to 1/3 (1/6 plus 1/6). However, because transition q_1 has only one arc to state p_2, the probability of generating a transition from state p_1 to p_2 is decreased to 1/6.

Case 3. Finally, we have the trivial case when there exists only one arc from p_1 to q_1 and from q_1 to p_2.
It is important to note that, by definition, the trajectory-tracking function U is employed only for establishing a trajectory tracking, working at a different execution level from that of the place-transitions Petri net. The trajectory-tracking function U in no way changes the place-transitions Petri net evolution or performance.
U_k(p_i) denotes the trajectory-tracking value at place p_i ∈ P at time k, and U_k = [U_k(p_1), ..., U_k(p_m)]^T denotes the trajectory-tracking state of the DDPN at time k. FN : F → R+ gives the number of arcs from place p to transition q (respectively, the number of arcs from transition q to place p).
Consider an arbitrary p_i ∈ P, and for each fixed transition q_j ∈ Q that forms an output arc (q_j, p_i) ∈ O, look at all the previous places p_h of the place p_i, denoted by the list (set) •p_{ηij} = {p_h : h ∈ η_ij}, where η_ij = {h : (p_h, q_j) ∈ I, (q_j, p_i) ∈ O}, which materializes all the input arcs (p_h, q_j) ∈ I, and form the sum

Σ_{h ∈ η_ij} Ψ(p_h, q_j, p_i) U(p_h),

where Ψ(p_h, q_j, p_i) = π(p_h, q_j) * (FN(q_j, p_i)/FN(p_h, q_j)) and the index j runs over the set κ = {j : q_j ∈ (p_h, q_j) ∩ (q_j, p_i) and p_h running over the set •p_{ηij}}.
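The weight Ψ can be computed directly; the two calls below reproduce Cases 1 and 2 of the routing-policy discussion (the arc counts are those of the two figure cases):

```python
# Sketch of the weight Psi(p_h, q_j, p_i) = pi(p_h, q_j) * FN(q_j, p_i) / FN(p_h, q_j).

def psi(pi_hq, arcs_qp, arcs_pq):
    # pi_hq:   routing probability pi(p_h, q_j)
    # arcs_qp: FN(q_j, p_i), number of arcs from the transition to the place
    # arcs_pq: FN(p_h, q_j), number of arcs from the place to the transition
    return pi_hq * arcs_qp / arcs_pq

print(psi(1/3, 2, 1))  # Case 1: two outgoing arcs raise the probability to 2/3
print(psi(1/3, 1, 2))  # Case 2: two incoming arcs lower it to 1/6
```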
Remark 1. •p_{ηij} denotes the previous places to p_i for a fixed transition q_j ∈ Q.

Continuing over all the q_j's, we form the vector indexed by the sequence j identified by (j_0, j_1, ..., j_f) as follows:

α = [Σ_{h ∈ η_{ij_0}} Ψ(p_h, q_{j_0}, p_i) U(p_h), ..., Σ_{h ∈ η_{ij_f}} Ψ(p_h, q_{j_f}, p_i) U(p_h)]^T. (5)

Intuitively, vector (5) represents all the possible trajectories through the transitions q_j to a place p_i for a fixed i, where j is represented by the sequence (j_1, j_2, ..., j_f) and f = #(κ).
Then, formally we define the trajectory-tracking function U as follows.
Definition 2. The trajectory-tracking function U with respect to a decision process Petri net DDPN = {P, Q, F, W, M_0, π, U} is represented by the equation

U_k(p_i) = L(α), with α the vector (5),

where the function L : D ⊆ R^n_+ → R_+ is a Lyapunov-like function which optimizes the trajectory-tracking value through all possible transitions (i.e., through all the possible trajectories defined by the different q_j), D is the decision set formed by the j's, 0 ≤ j ≤ f, of all those possible transitions (q_j, p_i) ∈ O, Ψ(p_h, q_j, p_i) = π(p_h, q_j) * (FN(q_j, p_i)/FN(p_h, q_j)), η_ij is the index sequence of the list of previous places to p_i through transition q_j, and p_h (h ∈ η_ij) is a specific previous place of p_i through transition q_j.

Example 4. OR-Path (see Figure 3). Define the Lyapunov-like function L in terms of the entropy H(p_i) = −p_i ln p_i as L = min_{i=1,...,|α|} (−α_i ln α_i).

Figure 3: (Left) routing policy Case 1. (Right) routing policy Case 2.

Example 5. AND-Path (see Figure 4). Define the Lyapunov-like function L in terms of the entropy H(p_i) = −p_i ln p_i as L = min_{i=1,...,|α|} (−α_i ln α_i).

From the previous definition, we have the following remark.
Remark 2. (i) Note that the Lyapunov-like function L guarantees that the optimal course of action is followed (taking into account all the possible paths defined). In addition, the function L establishes a preference relation because, by definition, L is asymptotic; this condition gives the decision maker the opportunity to select a path that optimizes the trajectory-tracking value.
(ii) The iteration over k for U is as follows: (1) for i = 0 and k = 0, the trajectory-tracking value is U_0(p_0) at place p_0, and for the rest of the places p_i the trajectory-tracking value is 0; (2) for i ≥ 0 and k > 0, the trajectory-tracking value is U_k^{q_j}(p_i) at each place p_i, computed by taking into account the trajectory-tracking value of the previous places p_h for k and k − 1 (when needed).
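A sketch of the entropy-based Lyapunov-like function of Examples 4 and 5, where we assume α collects the candidate trajectory-tracking values (here we reuse the routing weights 1/3 and 2/3 of Case 1 purely as an illustration):

```python
import math

# Sketch of Examples 4 and 5: a Lyapunov-like function built from the entropy
# term H(a) = -a * ln(a), minimized over the vector alpha of candidate
# trajectory values (vector (5) in the text). The weights are illustrative.

def H(a):
    return -a * math.log(a)

def L(alpha):
    # the min criterion selects the optimal transition among all q_j
    return min(H(a) for a in alpha)

alpha = [1/3, 2/3]
print(L(alpha) == H(2/3))   # True: the smaller entropy term is selected
```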

Property 1. The continuous function U(•) satisfies the following properties:

(1) there exists p_Δ ∈ P such that, if there exists an infinite sequence of places approaching p_Δ, then the trajectory-tracking value asymptotically approaches an infimum (or attains a minimum) at p_Δ;

(2) U(p) > 0 or U(p) > C, where C ∈ R, for all p ∈ P such that p ≠ p_Δ;

(3) ΔU = U(p_i) − U(p_{i−1}) < 0 for all p_i, p_{i−1} ∈ P along a trajectory.

From the previous property, we have the following remark.
Remark 3. In Property 1, point 3, we state that ΔU = U(p_i) − U(p_{i−1}) < 0 to determine the asymptotic condition of the Lyapunov-like function. However, it is easy to show that such a property is suitable only for deterministic systems. In Markov decision processes, it is necessary to include probabilistic decreasing asymptotic conditions to guarantee the asymptotic condition of the Lyapunov-like function.
Property 2. The trajectory-tracking function U : P → R_+ is a Lyapunov-like function.
Proof. The proof follows straightforwardly from the previous definitions.
Remark 4. From Properties 1 and 2, we have the following explanation. Intuitively, a Lyapunov-like function can be considered as a trajectory-tracking function and an optimal cost function. In our case, an optimal discrete problem, the cost-to-target values are calculated using a discrete Lyapunov-like function. At each step, a discrete vector field of possible transitions is calculated over the decision process. Each applied optimal transition (selected via some "criterion," e.g., min(•)) decreases the optimal value, ensuring that the optimal course of action is followed and establishing a preference relation. In this sense, the criterion changes the asymptotic behavior of the Lyapunov-like function by an optimal trajectory-tracking value. It is important to note that the process finishes when the equilibrium point is reached. This point marks a significant difference with Bellman's equation.
Example 6 (Conc-Path (see Figure 2)). Biochemical pathway of the free energy profile of glycolysis and the pentose phosphate pathway. The following was adapted from Lehninger et al. [26] and Campbell and Farrel [30]. The free energy changes were calculated using the steady-state metabolite concentrations in RBCs and the equation U = RT ln([Products]/[Reactants]). U = 0 was set arbitrarily at the end of the pathway, after the pyruvate kinase step. The overall reaction for the pathway is shown in Figure 1. Because L : D ⊆ R^n → R_+, we use the function min_{i=1,...,|α|} (α_i ∈ D) to select the proper element of the vector α ∈ D. A decision is taken and q_b is selected instead of q_k based on the environmental condition modeled via the routing policy (1/3, 2/3).
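The free-energy computation of this example can be sketched as follows; the concentrations below are illustrative placeholders, not the RBC steady-state data used by the authors:

```python
import math

# Sketch of Example 6: the free-energy-style trajectory value
# U = R*T*ln([Products]/[Reactants]) used along the glycolysis pathway.

R = 8.314    # gas constant, J/(mol*K)
T = 310.0    # assumed physiological temperature, K

def U(products, reactants):
    return R * T * math.log(products / reactants)

# At equilibrium U = 0; a forward-proceeding reaction has U < 0.
print(U(1.0, 1.0))       # 0.0: products and reactants balanced
print(U(0.1, 1.0) < 0)   # True: reaction proceeds forward, energy released
```

This mirrors the ball-and-hill picture: U plays the role of the Gibbs-energy Lyapunov-like function, reaching its resting value when the reactions are evenly balanced.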

Properties of the Model
We identify the global system properties of the DPPN as those properties related to the PN.

Theorem 1. Let DDPN = {P, Q, F, W, M_0, π, U} be a finite decision process Petri net. Then, the equilibrium point p* is reached, that is, U_n = 0 or U_n = C.

Proof. Let us suppose that the DPPN is not finite. Then p* is never reached. Therefore, it is possible to evolve in time n and to reduce the trajectory function value beyond p*. However, the Lyapunov-like trajectory function converges to zero as n → ∞ (or reaches a minimum), that is, U_n = 0 or U_n = C.

Theorem 2. Let DDPN = {P, Q, F, W, M_0, π, U} be a decision process Petri net bounded by a place p*. Then, a Lyapunov-like trajectory function can be constructed if and only if p* is reachable from s_0.
Proof.(⇒) If U is a Lyapunov-like function then by the previous theorem p * is reachable.
(⇐) By induction, let us construct the optimal inverse path from p* to p_0. At each discrete time n ∈ N, in descending order (n is the maximum place index), the place p_n of the system is observed and a transition q_k ∈ Q leading to p_{n−1} is chosen. We choose the trajectory function U over the best-choice set of states. We continue this process until p_0 is reached. Then, the trajectory function U is a Lyapunov-like function.
A function α : R_+ → R_+ belongs to class K if it is continuous, strictly increasing, and α(0) = 0. Let us consider [21] the vector function v(n, x(n)), v : N × R^n → R_+, and let us define the variation of v relative to (8). Then, we have the following results [20, 21, 31, 32].
Theorem 3. Suppose that v(n, x(n)) is a continuous function in the second argument, that γ(n, u) ≡ u + w(n, u) is nondecreasing in u, that 0 < λ < A are given, and finally that α(λ) < β(A) is satisfied. Then, the stability properties of the comparison system (11) imply the corresponding stability properties of the system (8).
Proof. The stability properties are preserved as follows.
(2) Stable. Suppose that system (11) is stable; then, for all u_0, the comparison principle (implicitly proved in point 1) implies that v(n, x(n)) ≤ u(n) for all n ≥ n_0. Taking δ equal to the one given by the continuity of v, we get |x(n, n_0, x_0)| < ε for n ≥ n_0. If not, there would exist an n_1 ≥ n_0 with |x(n_1, n_0, x_0)| ≥ ε, which cannot hold; therefore, we must have |x(n, n_0, x_0)| < ε for n ≥ n_0, as desired.
(3) Asymptotically stable. We know that system (8) is stable; the fact that it is asymptotically stable follows from the asymptotic stability of the comparison system.

(4) Uniformly stable. Assume that the comparison system is uniformly stable; then δ is independent of n. Therefore, the system (8) is uniformly stable.
We will extend the last theorem to the case of several Lyapunov functions. Let us consider a vector Lyapunov function v(n, x(n)), v : N × R^n → R^p_+, and let us define the variation of v relative to (8). Then, we have the following theorem.

Theorem 4. Let v(n, x(n)) be a vector Lyapunov function satisfying the stated estimates. Then, the practical stability properties of the comparison system imply the corresponding practical stability properties of system (8).
(2) From the continuity of v with respect to the second argument, it is always possible to make v_0(n_0, x_0) ≤ u_0. We want to prove that |x(n, n_0, x_0)| < A for n ≥ n_0. If this is not true, there exist an n_1 ≥ n_0 and a solution x(n, n_0, x_0) such that |x(n_1)| ≥ A and |x(n)| < A for n_0 ≤ n < n_1. Then, we have that β(A) ≤ β(|x(n_1)|) ≤ v_0(n_1, x(n_1)) ≤ Σ_{i=1}^{p} u_i(n_1, n_0, u_0) < β(A), a contradiction, which proves our claim.

Remark 6. If, in point 1 of the proof, it is not true that v(n, x(n)) ≤ e(n, n_0, e_0) while v(n + 1, x(n + 1)) > e(n + 1, n_0, e_0), then we would have γ(n, e(n, n_0, e_0)) < v(n + 1, x(n + 1)), which is a contradiction.
Then, we have the following result [21].
Example 7. Diamond is the stable form of carbon at extremely high pressures, while graphite is the stable form at normal atmospheric pressures. Nevertheless, diamonds appear stable at normal temperatures and pressures but are, in fact, very slowly converting to graphite. Heat increases the rate of this transformation, yet at normal temperatures diamond is uniformly practically stable.
For Petri nets, we have the following results of stability [31].
Proposition 1. Let PN be a Petri net. Then, PN is uniformly practically stable if there exists a strictly positive m-vector Φ such that Δv = u^T AΦ ≤ 0. Moreover, PN is uniformly practically asymptotically stable if Δv = u^T AΦ < 0 holds.

Proof. Let us choose as our candidate Lyapunov function v(M) = M^T Φ, with Φ an m-vector to be chosen. It is simple to verify that v satisfies all the conditions of Theorem 3. Therefore, uniform practical (asymptotic) stability is obtained if there exists a strictly positive vector Φ such that equation (17) holds.
Proposition 2. Let PN be a Petri net. Then, PN is uniformly practically stable if there exists a strictly positive m-vector Φ such that AΦ ≤ 0.

Proof. (⇒) Since u^T AΦ ≤ 0 holds for every u, we have that AΦ ≤ 0. (⇐) This comes from the fact that u is positive.
Remark 7. The if-and-only-if relationship in (19) exists because u is positive.
Definition 7. An equilibrium point with respect to a decision process Petri net DDPN = {P, Q, F, W, M_0, π, U} is a place p* ∈ P such that M_l(p*) = S < ∞ for all l ≥ k, and p* is a sink.
Theorem 5. The decision process Petri net DDPN = {P, Q, F, W, M_0, π, U} is uniformly practically stable iff there exists a strictly positive m-vector Φ such that Δv = u^T AΦ ≤ 0.
Proof. (⇒) It follows directly from Propositions 1 and 2. (⇐) Let us suppose, by contradiction, that u^T AΦ > 0 with Φ fixed. Then Δv > 0 and v grows without bound. Therefore, the DDPN is not uniformly practically stable.

Remark 8. It is important to underline that the only places where the DPPN will be allowed to get blocked are those which correspond to equilibrium points.
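The condition of Theorem 5 can be tested numerically. By Proposition 2, Δv = u^T AΦ ≤ 0 for every nonnegative firing vector u reduces to the componentwise test AΦ ≤ 0; the incidence matrix below describes a toy three-place chain of our own, with A = post − pre:

```python
# Sketch of the stability test of Theorem 5: with incidence matrix A and a
# strictly positive vector Phi, u^T A Phi <= 0 for all firing vectors u >= 0
# iff (A Phi)_j <= 0 for every transition j. The net is an assumed example.

# Rows: transitions, columns: places, for the chain p0 -> q1 -> p1 -> q2 -> p2
A = [
    [-1,  1,  0],   # q1 moves a token from p0 to p1
    [ 0, -1,  1],   # q2 moves a token from p1 to p2
]

def uniformly_practically_stable(A, phi):
    # check (A phi)_j <= 0 componentwise
    return all(sum(a * p for a, p in zip(row, phi)) <= 0 for row in A)

print(uniformly_practically_stable(A, [3, 2, 1]))  # True:  A Phi = [-1, -1]
print(uniformly_practically_stable(A, [1, 2, 3]))  # False: A Phi = [1, 1]
```

The decreasing weight vector Φ = (3, 2, 1) assigns lower "potential" to places closer to the sink, so every firing decreases v(M) = M^T Φ, matching the intuition of the Lyapunov-like trajectory function.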
We identify the trajectory-tracking properties of the DPPN as those properties related to the trajectory-tracking value at each place of the PN. In this sense, we relate an optimum point to the best possible performance choice. Formally, we introduce the following definitions [23].

Definition 8. A final decision point p_f ∈ P with respect to a decision process Petri net DDPN = {P, Q, F, W, M_0, π, U} is a place p ∈ P where the infimum is asymptotically approached (or the minimum is attained), that is, U(p) = 0 or U(p) = C.

Definition 9. An optimum point p_Δ ∈ P with respect to a decision process Petri net DDPN = {P, Q, F, W, M_0, π, U} is a final decision point p_f ∈ P where the best choice is selected "according to some criteria."

Property 3. Every decision process Petri net DDPN = {P, Q, F, W, M_0, π, U} has a final decision point.
Remark 10. The monotonicity of U guarantees that it is possible to make the search starting from the decision points.
Then, we can conclude the following theorem.

Theorem 6. Let DDPN = {P, Q, F, W, M_0, π, U} be a finite decision process Petri net and let (p_0, p_1, ..., p_n) be a realized trajectory which converges to p_Δ such that there exists ε > 0 with ΔU ≤ −ε at each step. Then, p_Δ is reached and U(p_Δ) = C.

Proof. Let us suppose that p_Δ is never reached; then p_Δ is not a sink (the last place) in the decision process Petri net. So, it is possible to find some output transition from p_Δ. Therefore, it is possible to reduce the trajectory function value beyond p_Δ by at least ε. As a result, it is possible to obtain a value lower than C (which is a contradiction).

Theorem 7. Let DDPN = {P, Q, F, W, M_0, π, U} be a decision process Petri net. Then, U converges to an optimum (final) decision point p_Δ (p_f).
Proof. We have to show that U converges to an optimum (final) decision point p^Δ (p_f). By the previous theorem, the optimum decision point p^Δ is reached in a number of time steps bounded by O(U_0/ε); therefore U converges to p^Δ.

Proposition 3. Let DDPN = {P, Q, F, W, M_0, π, U} be a decision process Petri net and let p^Δ ∈ P be an optimum point. Then U(p^Δ) ≤ U(p) for all p ∈ P such that p ≤_U p^Δ.
Proof. We have that U(p^Δ) is equal to the minimum or the infimum. Therefore, U(p^Δ) ≤ U(p) for all p ∈ P such that p ≤_U p^Δ.
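To make the forward tracking idea concrete, the following is a minimal Python sketch, not part of the paper's formal DDPN machinery: a trajectory follows transitions that decrease a Lyapunov-like value U by at least ε until no further decrease is possible, and the number of steps stays within the U_0/ε bound. The toy net, the values of U, and the names `succ` and `forward_track` are illustrative assumptions.

```python
def forward_track(succ, U, p0, eps):
    """Follow transitions that decrease U by at least eps.

    succ: dict mapping each place to its output places.
    U:    dict mapping each place to its trajectory-tracking value.
    Stops at a place whose successors offer no decrease >= eps
    (a final decision point in the paper's terminology)."""
    path = [p0]
    p = p0
    while True:
        better = [q for q in succ.get(p, []) if U[p] - U[q] >= eps]
        if not better:
            return path              # U can no longer be reduced
        p = min(better, key=U.get)   # greedy: pick the largest decrease
        path.append(p)

# Toy net: U strictly decreases toward the sink p3.
succ = {"p0": ["p1", "p2"], "p1": ["p3"], "p2": ["p3"], "p3": []}
U = {"p0": 3.0, "p1": 2.0, "p2": 2.5, "p3": 0.0}
path = forward_track(succ, U, "p0", eps=0.5)
print(path)                          # → ['p0', 'p1', 'p3']
print(len(path) - 1 <= U["p0"] / 0.5)  # step count within the U0/eps bound
```

Note that the search runs forward from the source place, with no prior knowledge of the target, which is the point of contrast with backward Bellman-style methods.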

Theorem 8. The decision process Petri net DDPN
Proof. (⇒) By the autonomous version of Theorem 4 and Corollary 1, the DDPN is stable.
(⇐) We want to show that the DDPN is practically stable, that is, given 0 < λ < A, we must show that |U(p_i)| < A. We know that U(p_0) < λ, and since U is nonincreasing, we have that |U(p_i)| ≤ U(p_0) < λ < A.

Theorem 9. Let DDPN = {P, Q, F, W, M_0, π, U} be a decision process Petri net. If p* ∈ P is an equilibrium point, then it is a final decision point.

Proof. Let us suppose that p* is an equilibrium point; we want to show that its trajectory-tracking value has asymptotically approached an infimum (or reached a minimum). Since p* is an equilibrium point, by definition it is bounded and it is a sink, that is, its marking cannot be modified. This implies that the routing policy attached to the transition(s) that follow p* is 0 (in case there is such a transition(s), i.e., the worst case). Therefore, its trajectory-tracking value cannot be modified, and since the value is a decreasing function of p_i, an infimum or a minimum is attained. Then, p* is a final decision point.
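The claim of Theorem 9 can be checked numerically on a toy trajectory: at an equilibrium point (a sink whose marking cannot change) the trajectory-tracking value is already minimal along every trajectory that reaches it. The small net, the values of U, and the helper names below are illustrative assumptions, not the paper's formal construction.

```python
def is_sink(succ, p):
    # An equilibrium point is, in particular, a place with no output places.
    return not succ.get(p)

def reaches_minimum(U, trajectory):
    """Check that U is nonincreasing along the trajectory and attains
    its minimum at the final (equilibrium) place."""
    vals = [U[p] for p in trajectory]
    nonincreasing = all(a >= b for a, b in zip(vals, vals[1:]))
    return nonincreasing and vals[-1] == min(vals)

succ = {"p0": ["p1"], "p1": ["p*"], "p*": []}
U = {"p0": 2.0, "p1": 1.0, "p*": 0.0}
traj = ["p0", "p1", "p*"]
print(is_sink(succ, "p*"))        # True: p* is a sink (equilibrium point)
print(reaches_minimum(U, traj))   # True: U attains its minimum at p*
```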
Theorem 10. Let DDPN = {P, Q, F, W, M_0, π, U} be a finite and nonblocking decision process Petri net (unless p ∈ P is an equilibrium point). If p_f ∈ P is a final decision point, then it is an equilibrium point.
Proof. If p_f is a final decision point, since the DDPN is finite, there exists a k such that U_k(p_f) = C. Let us suppose that p_f is not an equilibrium point.
Case 1. Suppose p_f is not bounded. Then it is possible to increment the marks of p_f in the net. Therefore, it is possible to modify its trajectory-tracking value. As a result, it is possible to obtain a value lower than C.

Case 2. Suppose p_f is bounded but is not a sink. Then it is possible to fire some output transition of p_f in such a way that its marking is modified. Therefore, it is possible to modify the trajectory-tracking value over p_f. As a result, it is possible to obtain a trajectory-tracking value lower than C.

In both cases we contradict the minimality of C, so p_f is an equilibrium point.

Corollary 2. Let DDPN = {P, Q, F, W, M_0, π, U} be a finite and nonblocking decision process Petri net (unless p ∈ P is an equilibrium point). Then, an optimum point p^Δ ∈ P is an equilibrium point.
Proof. From the previous theorem, we know that a final decision point is an equilibrium point, and since in particular p^Δ is a final decision point, it is an equilibrium point.

Completeness
Theorem 11. Let DDPN = {P, Q, F, W, M_0, π, U} be a decision process Petri net and let (p_0, p_1, . . ., p_n) be a realized trajectory which converges to p* such that ∃ε_i : |U_{i+1} − U_i| > ε_i (with ε_i > 0). Let ε = min{ε_i}; then an optimum point p* is reached in a number of time steps bounded by O(U_0/ε).
Proof. Let us suppose that p* is never reached; then p* is not the last place in the decision process Petri net. So, it is possible to find some output transition from p*. Therefore, it is possible to reduce the trajectory function value over p* by at least ε. As a result, it is possible to obtain a value lower than C (a contradiction).

Remark 11. The time complexity O(U_0/ε) differs from that of Dijkstra's algorithm.
Remark 12. Each path in the DDPN corresponds to a trajectory of/in a given system. The trajectory-tracking function value of U at the source place (U_0) divided by ε = min{ε_i} equals the length of the shortest path. Then, the infimum is equivalent to the infimum length over all paths in the DDPN.
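Remark 12 can be illustrated with a minimal sketch: when U drops by exactly ε at each transition, the number of transitions taken to reach the sink equals U_0/ε, i.e., the length of the path. The chain, the values of U, and the variable names are illustrative assumptions.

```python
eps = 1.0
# A chain p0 -> p1 -> p2 -> p3 where U decreases by exactly eps per step.
U = {"p0": 3.0, "p1": 2.0, "p2": 1.0, "p3": 0.0}
succ = {"p0": ["p1"], "p1": ["p2"], "p2": ["p3"], "p3": []}

steps = 0
p = "p0"
while succ[p]:          # walk forward until the sink is reached
    p = succ[p][0]
    steps += 1

print(steps)                    # → 3 transitions to reach the sink
print(steps == U["p0"] / eps)   # → True: path length equals U0/eps
```

When the decreases ε_i are not all equal, ε = min{ε_i} only bounds the step count from above, which is why the theorem states the bound as O(U_0/ε) rather than an equality.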
Theorem 12. Let DDPN = {P, Q, F, W, M_0, π, U} be a decision process Petri net. Then, U converges to a point p*.

Proof. We have to show that U converges to a point p*. By the previous theorem, the optimum point p* is reached in a number of time steps bounded by O(U_0/ε); therefore U converges to p*.

Proposition 4. The finite and nonblocking (unless p ∈ P is an equilibrium point) condition on the DDPN cannot be relaxed.
Proof. (1) Let us suppose that the DDPN is not finite, that is, some place p lies on a cycle. Then the Lyapunov-like function converges to zero as k → ∞, that is, L(p) = 0, but the DDPN has no final place; therefore, p is not an equilibrium point.
(2) Let us suppose that the DDPN blocks at some place p ∈ P that is not an equilibrium point. Then the Lyapunov-like function has a minimum at place p, say L(p) = C, but p is not an equilibrium point, because it need not be a sink in the net.
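Case (1) of Proposition 4 can be sketched numerically: on a cycle the Lyapunov-like value can tend to zero without any place being a sink, so the infimum is approached but no equilibrium point exists. The two-place cycle and the geometric decay below are illustrative assumptions.

```python
succ = {"a": ["b"], "b": ["a"]}          # a two-place cycle: no sinks
L = 1.0
for _ in range(50):                      # traverse the cycle repeatedly
    L *= 0.5                             # the value decays toward 0 ...
print(L < 1e-9)                          # → True: infimum 0 is approached
print(any(not succ[p] for p in succ))    # → False: ... but there is no sink
```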

Conclusions
In this work, a formal framework for the representation of the shortest-path decision process problem has been presented. Whereas in previous work attention was restricted to tracking the net using a utility function given by Bellman's equation, this work uses a Lyapunov-like function. In this sense, we replace the traditional cost function with a trajectory-tracking function, which is also an optimal cost-to-target function for tracking the net. This makes a significant difference in the conceptualization of the problem domain. The Lyapunov method introduces new equilibrium and stability concepts in decision processes.

Figure 1: An illustrative example of finding the shortest path in a grid world.