Detection of Temporal Anomalies for Partially Observed Timed PNs

This article concerns faults detection and isolation for timed stochastic discrete event systems modeled with partially observed timed Petri nets. Events occur according to arbitrary probability density functions.Themodels include the sensors used tomeasure events and markings and also the temporal constraints to be satisfied by the system operations. These temporal constraints are defined according to tolerance intervals specified for each transition. A fault is an operation that ends too early or too late. The set of trajectories consistent with a given timed measured trajectory is first computed. Then, the probability that the temporal specifications are unsatisfied is estimated for any sequence of measurements and the probability that a temporal fault has occurred is obtained as a consequence.


Introduction
The prevention of faults is a critical issue in numerous systems to preserve the safety of both equipment and human operators.These issues have been addressed in numerous studies with fault detection and diagnosis (FDD) methods.The aim of fault detection is to create an alarm each time a fault occurs, and the aim of diagnosis is to isolate the fault within a group of candidates [1].In the domain of discrete event systems (DESs), FDD has been often formulated with automata, Petri nets (PNs) [2], in particular labeled PNs (LPNs) [3] or partially observed Petri nets (POPNs) [4].The main reason for developing FDD tests with PN extensions is that such models include graphical representations that can be disseminated widely in numerous application domains.They also offer mathematical supports that are consistent with standard tools.The proposed methods are useful for a large variety of technological systems, ranging from computer or chemical engineering to manufacturing and intelligent transportation systems.
In numerous contributions, the faults that are considered are unexpected events that may occur in event sequences and that cannot be directly measured.Various approaches have been proposed with PN extensions to detect and isolate such unexpected events.These approaches are based either on the analysis of the PN reachability graph [5][6][7][8][9], on the direct properties of the PNs [10,11], or on PN unfolding [12,13].A few results also concern the introduction of temporal information in the diagnosis process.At first, dates of events have been introduced in usual extensions of untimed PNs.Such dates lead to a more accurate estimation of the past and future fault occurrence probabilities [14] and are also useful to propose an evaluation of the unknown fault dates [15].The design and identification of models that include temporal faults have been also considered [16,17].Then, fuzzy Petri nets have been used to model and check temporal constraints between event occurrences [18].Partial orders with unfolding and (max, +)-linear inequalities have been used with timed PN models [19,20].Monotonic monitoring and stratification have been introduced, when the monitoring is fragmented because of the uncertain temporal observation [21].Finally, indirect monitoring has been used by comparing the actual cycle periods with the expected one in order to detect faults [22].
This paper takes place in the context where both transitions and places are assumed to be partially observed and consider only temporal faults.For that purpose, temporal constraints are defined by tolerance intervals that are associated with the transitions and that represent the normal durations of the system operations.The aim of the diagnosis system is to generate alarms when the temporal constraints are no longer satisfied.For that purpose, timed POPNs (POTPNs) are introduced.POTPNs take into consideration some measurable events that correspond to dated and labeled transition firings and also to partial measurements of the marking vector that is dated.This formalism, fully described in [23], is useful to represent incomplete history of dated measurements collected by SCADA systems.In the present work, this model is extended by adding temporal constraints that give upper and lower bounds for each transition duration.The paper is organized as follows.
In Section 2, temporal constraints and POTPNs are introduced.In Section 3, the main results are detailed.Examples are detailed throughout the paper.Section 4 concludes the paper.

Context and Notations
with (0) = .A marking  is said to be reachable from initial marking   if there exists a firing sequence  such that   [⟩.The set of all reachable markings from initial marking   is R(,   ).Timed Petri nets are PNs whose behavior is driven by time.Time is measured with time units (TU).The time can be associated with the firing of the transitions or with the sojourn of the tokens in the places.In this paper the time is associated with the transitions and the firing of each transition  occurs after a firing duration  that can eventually be zero.In that case the firing is immediate; in the other cases it is delayed.In this last case, the duration  can be deterministic ( is a constant) or stochastic ( is a random variable (RV)) with a probability density function (pdf) ().In this article, stochastic durations are concerned at first but the results are also applicable to deterministic durations.No particular assumption is made on the pdfs of the firing durations but the pdf of each transition is assumed to be known.The set of pdfs for all transitions is referred to as PDF.Two classes of pdfs are of particular interest for this work: bounded uniform (Figure 1(a)) and symmetrical triangular pdfs (Figure 1(b)) defined, respectively, with equations ( 2) and (3).Bounded uniform pdf is as follows: otherwise  () = 0. ( Symmetrical triangular pdf is as follows: Mathematical Problems in Engineering 3 otherwise  () = 0.
The motivations to consider these pdfs are that uniform random processes may be obtained as a limit case of (2) for  = 0 and  = ∞ and that (3) is useful to describe the dispersion of the duration around an average value.Note however that other pdfs can be considered.Note also that deterministic durations are obtained as the limit behavior of both pdfs when the support [, ] of the pdfs tends to zero.
The considered timed PNs have a time semantic [24,25] that is defined according to infinite server as server policy (Assumption A1), where each transition is considered as a server for firings and, in a given marking, each transition may fire simultaneously several times depending on its enabling degree; race as choice policy (Assumption A2), where the transition whose firing time elapses first is assumed to be the one that will fire next; and enabling memory as memory policy (Assumption A3), where, at the entrance in a marking, the remaining durations associated with still enabled transitions are kept and decremented and the remaining durations associated with disabled transitions are forgotten.Note that an interaction may exist between enabling memory and infinite server policies.At the entrance in a marking, the enabled degree of some transitions may decrease to a nonzero value.In the next, the considered nets are assumed to be 1-bounded and choice-free (Assumption A4); thus this situation never occurs and the reachability set of the net is of finite cardinality .Finally, firings are assumed not to be immediate and  > 0 (Assumption A5).
The nominal behavior of the timed Petri nets is also assumed to be constrained by temporal constraints.A time interval TC = [, Δ] is associated with each transition .The timed PN system satisfies the temporal constraints as long as the firing of the transitions has a duration not less than  or larger than Δ.The duration is measured from the transition enabling date to the date when it effectively fires.The set of all temporal constraints is referred to as TC.Such temporal constraints are useful to characterize the validity and the performance of the activities that are represented (e.g., operation of machines in manufacturing systems, transfer in automated transportation systems, and server load in communication systems).An activity with an exact duration  without any tolerance is represented by a transition  constrained by the time interval [, ] and an activity with no temporal specification is represented by a transition  constrained by the time interval [0, ∞].
A timed firing sequence  of length || =  fired at marking  in time interval [ 0 ,  end ] is defined as where  1 , . . .,   are the labels of the transitions and  1 , . . .,   represent the dates of the firings that satisfy  0 ≤  1 ≤  2 ≤ ⋅ ⋅ ⋅ ≤   ≤  end .This leads to the timed trajectory (, ) detailed in the following with ( 0 ) = : Note that we refer to timed and untimed firing sequences with the same notation  as long as the notation is not confusing; otherwise we use   to refer to untimed firing sequence and  to refer to timed ones.

Partially Observed Timed Petri Nets. Partially observed
Petri nets are considered to represent the system sensor. : T → E ∪ {} is a labeling function that assigns a label to each transition where E = {e 1 , . . ., e p } is the set of  labels that are assigned to observable transitions and  is the null label that is assigned to the silent ones.The concatenation of labels obviously satisfies the following:  =  and e k = e k .For simplicity, each label e k is represented by the elementary vector   of dimension  such that   = (  ) with   = 0 for  ̸ =  and   = 1.The null label is represented by the zero vector  = 0  of dimension .The labeling function is linear and defined by the matrix  = (  ) ∈ (N) × such that   = 1 if  ⋅ (  ) =   ; otherwise   = 0 ("⋅" stands for the product operator).The marking sensor matrix  ∈ (R)   × (R is the set of real numbers) defines the projection of the marking vector  over   subsets of places.The observable part of the marking is denoted as   =  ⋅ .
Thus, partially observed timed PNs with temporal constraints (POTPN) are defined as ⟨, , , PDF, TC,   ⟩ where PDF is the set of pdfs, TC is the set of temporal constraints,  is a Petri net structure,  is an event sensor matrix, and  is a marking sensor matrix.The matrices  and  define the sensor configuration.
Measurements are collected over the time interval [ 0 ,  end ].When the POTPN marking varies with the firing of a single transition  at date  ∈ [ 0 ,  end ], the measurement function Γ(,  0 ,  end ) is defined by Roughly speaking, the measurement function Γ collects a new label each time a transition fires that is not silent or that changes the measurement of the marking.The measurement function Γ is then extended to timed trajectories of the form (4) measured over the time interval [ 0 ,  end ]: The measurement function Γ collects  successive dated marking and event measurements of a timed trajectory (, ) of length ℎ over time interval [ 0 ,  end ] and organizes these measurements in a timed measured trajectory that is written as follows where   ( 0 ) = ,  is the length of the sequence that satisfies  ≤ ℎ, and  = {  ,  = 1, . . ., } refers to the set of measurement dates.Note that   ( 0 ) does not necessarily correspond to the measurement of initial marking   .A timed trajectory (, ) is said to be consistent with a given timed measured trajectory TR  in time interval [ 0 ,  end ] if it satisfies Γ((, ),  0 ,  end ) = TR  .In the next, it is assumed that the time interval starts at time 0 (i.e.,  0 = 0) and that it ends at the last measurement date (i.e.,  end =   ) (Assumption B).
The objective of the present work is to estimate the probability that any given timed measured trajectory satisfies the temporal constraints.An immediate application of the proposed estimation is to provide an algorithm that generates alarms when this probability goes down a specific threshold .To the best of our knowledge, it is the first time that this problem is considered with PNs.Note that POTPNs cannot be encoded as a Hidden Markov Model (HMM) [26] because in a HMM each state successively reached by the system delivers an observation that is not certain and depends on the emission probabilities.On the contrary, in a POTPN model, the states and the transitions deliver certain but partial observations and in some cases the states do not deliver any observation at all.

Temporal Specifications Checking
The proposed diagnosis systems operate with three stages.
(i) For any timed measured trajectory TR  with  measurements, the set of timed trajectories that are consistent with TR  are first computed with an integer linear programming approach developed in our previous work [23,27].
(ii) For each possible trajectory, the probability that this trajectory is consistent with the temporal constraints is estimated.
(iii) The probability that a timed measured trajectory is consistent with the temporal constraints is obtained as a consequence by computing the probability of each consistent trajectory [15].

Untimed Trajectories
Consistent with TR  .In this section, the set Γ −1 U (TR  ) of all untimed trajectories (  , ) that are consistent with a given measured trajectory TR  is computed.Note that this problem cannot be solved using standard algorithms (as the Viterbi algorithm) [28] issued from dynamic programming because such algorithms aim to find only the trajectory of maximum probability from the measured trajectory, but not all trajectories.For diagnosis issues it is however required to consider all trajectories.It is assumed that the set of the reachable states and also the reachability graph of the net are known.Let us define   as the matrix of the reachability graph of the unobservable part of the PN where all transitions that are observable or whose firing changes the measured part of the marking have been removed and assume that this graph is acyclic (Assumption C).In that case, the maximal number of consecutive silent events is upper bounded by ℎ max − 1 [23] with Let us consider a timed measured trajectory TR  of the form (7) with  measurements in time interval [0,   ].An untimed trajectory (  , ) with   = (1)(2) ⋅ ⋅ ⋅ () is consistent with TR  iff the following conditions are satisfied [27]: (1)  ⋅  =   (0); (2) There exists ℎ 0 = 0, ℎ 1 , . . ., ℎ  , such that ℎ  − ℎ −1 ≤ ℎ max ,  = 1, . . ., , and the untimed firing sequence   is rewritten as   = (1) ⋅ ⋅ ⋅ (ℎ 1 ) ⋅ ⋅ ⋅ (ℎ  ) and satisfies the following: . . .
) ) ) where all inequalities are taken component wise.Roughly speaking inequality (9) means that the firing count vector of each transition in  is positive, unitary, and feasible (i.e., leading to a positive marking).Equality (10) means that for  = 1, . . ., , ℎ  − 1 first transitions are silent and only the last one may provide a label   (  ).Similarly it ensures that ℎ  − 1 first marking measurement does not provide any information and only the last one may provide marking changes   (  ) −   ( −1 ).The combined use of ( 9) and ( 10) leads to the exhaustive set of untimed trajectories that are consistent with TR  [23,27].Note that Γ −1 U (TR  ) does not include the silent closure of the trajectories (i.e., the continuations of the trajectories that provide no event nor marking measurement) because the time interval [0,   ] ends with the last measurement (Assumption B) and the no immediate firing occurs (Assumption A5).If required, the silent closure can easily be added to Γ −1 U (TR  ) by considering the following equation in addition to ( 9) and ( 10): where (ℎ  + 1) ⋅ ⋅ ⋅ (ℎ +1 ) stands for the silent closure.Let us consider the marked POTPN1 of Figure 2 with The problem solved in this section is to evaluate the probability that the transitions of (, ) in ( 12) fire in specific intervals [ ℎ ,  ℎ + ], ℎ = 1, . . .,  knowing that the measurement dates are  and that the untimed trajectory (  , ) is the true one.This probability is noted prob({ ℎ , ℎ ∈ [1 : ]} | (  , ) and ).In the case where  ℎ =  ℎ −  ℎ  are independent RV, the following holds: The difficulty is that variables  ℎ are not necessarily independent RV.Table 1 details the different situations (type 1, 2, or 3) that may occur depending on whether the dates  ℎ and  ℎ  are measured or not.
Situations of type 1, 2, or 3 will be used in the next section to evaluate the probability that a timed trajectory satisfies the temporal constraints.

Probability That a Timed Trajectory Satisfies the Temporal
Constraints.The probability that any timed trajectory (, ), obtained from untimed trajectory (  , ) and consistent with TR  , satisfies the set of temporal constraints TC results from the extension and integration of ( 13) with respect to the 3 types of situations described in Table 1.This leads to the following: prob ((, ) satisfies TC | (  , ) and ) with Mathematical Problems Engineering where dates  ℎ and  ℎ and functions  ℎ are defined in  15) does not necessarily include all variables  ℎ .To simplify the notation let us also divide the transitions of type 2 in several classes referred to as Class(ℎ  ) formally defined by the following: is of type 2 and  ( ℎ ) In other words, Class(ℎ  ) is the set of transitions of type 2 (their firing date is measured) that are enabled at the same date  ℎ  .Roughly speaking,  1 (  , , TC, ) is a multi-integral that evaluates the sum of the duration variables  ℎ over their possible range of variation constrained by the measurement time interval [0,   ], the dates of measurements   , and the temporal specifications [ ℎ , Δ ℎ ].Similarly,  2 (  , , ) evaluates the sum of the duration variables  ℎ over their possible range of variation constrained only by the measurement time interval [0,   ] and the dates of measurements   (the temporal specifications [ ℎ , Δ ℎ ] are not considered).The ratio of both evaluations leads to the probability that any timed trajectory (, ), resulting from untimed trajectory (  , ) and consistent with , satisfies the set of temporal constraints TC.From a numerical point of view the calculation of  1 (  , , TC, ) and  2 (  , , ) is obtained with a recursive algorithm.

Detection of the Temporal Faults.
In the general case, several untimed trajectories (  , ) may be consistent with a given timed measured trajectory TR  in time interval [0,   ] (i.e., Γ −1 U (TR  ) contains more than one trajectory).This situation is due to two different reasons: (1) several markings  may be consistent with the first marking measurement   (0); (2) from a given marking , several untimed sequences may be consistent with the measured trajectory TR  .In order to deal with (1), let us consider  0 (TR  ) as the set of markings consistent with   (0) and  0 () as the probability that  is the current marking at date 0 such that ∑ ∈ 0 (TR  ) ( 0 ()) = 1.The set  0 (TR  ) and the probability  0 () for each  ∈  0 (TR  ) are assumed to be known (Assumption D).Note that Assumption D can be relaxed for PNs without absorbing subsets of markings by considering the steady state probability of  as  0 ().In order to deal with (2), the probability of each sequence issued from the same marking  and consistent with TR  is evaluated with the following: prob ((  , ) |   and ) Finally the probability of each trajectory (  , ) is obtained with Then, the diagnosis of temporal faults results from the iterative evaluation of (19).For that purpose, a sampling period Δ is considered and for each date  =  ⋅ Δ, prob(TR  satisfies TC) is updated depending on the eventual In case A, the pdfs of the transition durations are assumed to be bounded uniform with the same support [0, 10] for  1 and  2 .The temporal constraints are arbitrarily defined by TC 1 = [4,6] for  1 and TC 2 = [3,7] for  2 .Prob(TR  satisfies TC) computed with (19) is reported in Figure 3(a) (full line) in function of the date  1 .For the considered example, this equation can be rewritten as follows: Note that this probability is zero for  1 < 7 TU (one lower bound at least is not satisfied for the temporal constraints) and also for  1 > 13 TU (one upper bound at least is not satisfied for the temporal constraints).This computation is confirmed with a series of 1000 Monte Carlo simulations that coincide with TR  (dashed line).Depending on the choice of the threshold , an alarm may be generated.For example, for  = 0.2, an alarm is generated if  1 ∈ [8.5, 11.3].
In case B, the pdfs of the transition durations are assumed to be symmetrical triangular with supports [1,4] for  1 and [2,7] for  2 .The temporal constraints are, respectively, TC 1 = [2, 3] for  1 and TC 2 = [3,6] for  2 .Prob(TR  satisfies TC) is computed with (19) in Figure 3(b) (full line) in function of the date  1 .This computation is also validated with a series of 1000 Monte Carlo simulations that coincide with TR  (dashed line).

Numerical Complexity.
The numerical complexity of the whole diagnosis schema is due (a) to the computation of Γ −1 U (TR  ); (b) to the numerical evaluation of ( 19).
(a) The complexity to compute Γ −1 U (TR  ) is related to the resolution of ( 9) and (10) that include ℎ max ⋅  ⋅ ( +  + 1) inequalities and ℎ max ⋅  ⋅ (  + ) equalities with ℎ max ⋅  ⋅  +  unknown integer variables.Basically, the complexity is exponential with respect to the number  of reachable markings in R(,   ) and to the length  of the measured trajectories (ℎ max is a constant parameter).Branch and bound algorithms can be used to solve ( 9)- (10) as an integer linear programming problem (LPP) [24].These algorithms have a general nonpolynomial complexity but limit the computational effort in many practical situations.An algorithm of linear complexity has been also developed in our previous work that limits the length  of the timed measured trajectories under test.It considers measured trajectories within a sliding window of constant size  0 instead of increasing size  [25] and leads to an algorithm of linear complexity with respect to .Note also that the complexity with respect to  is no longer exponential if the set  0 (TR  ) is known (Assumption D).
(b) The numerical evaluation of ( 19) is obtained according to a recursive scheme with a deep equal to the number of transitions of type 2 or 3 in the considered sequence.Consequently the computation effort increases rapidly in time and in space with respect to the sequence length.To limit the computational complexity, the trajectory (, ) is divided into  subtrajectories that are considered successively and independently: (, ) = ( Trajectory decomposition is as follows:  4 ( 4 −  3 ) ⋅  5 ( 5 −  4 )  4 )  3 )  2 = 0.054.(22) This computation is validated with a series of simulations that leads to a probability of 0.47.The evaluation of Prob(TR  satisfies TC) with (19) saves time compared to the numerical evaluation based on simulation.

Conclusion
This article has proposed a diagnosis system that checks if the heterogeneous measurements obtained from a stochastic timed discrete event system with an uncomplete sensor configuration are consistent or not with a set of temporal constraints that specify tolerance intervals for the system operations.For this purpose, the set of trajectories consistent with a given timed measured trajectory are first characterized.Then the consistency of each trajectory with the temporal constraints is estimated as a probability.Finally the probability of each trajectory is also evaluated and the global probability that the temporal constraints are satisfied results from the previous steps.The diagnosis system returns an alarm each time this probability goes below a given threshold.The contribution is validated with simulation results.
In the future, we will consider the isolation of the temporal constraints that are unsatisfied.We will relax some assumptions, in particular Assumption B, in order to consider the silent closure of the trajectories.We will also study the problem from a structural point of view by providing some results to decide whether a set of sensors is suitable or not to check that a set of temporal constraints is satisfied.Finally we aim to apply the proposed approach to larger systems.

Figure 3 :
Figure 3: Computation and Monte Carlo simulations to evaluate prob(TR  satisfies TC) in function of  1 : bounded uniform pdfs (a); symmetrical triangular pdfs (b).