Switching systems: active mode recognition, identification of the switching law

The problem of the estimation of the discrete state of a switching system is studied. The knowledge of the switching law is essential for this kind of system as it simplifies their manipulation for control purposes. This paper investigates the use of a model-based diagnosis method for the determination of the active mode at each timepoint based on the system input/output data. The issue of the parametric identification of the switching law is also addressed.


INTRODUCTION
The modeling of complex systems often leads to complex nonlinear models. To reduce this complexity, one often resorts to a widely used modeling strategy that represents the system behavior by a set of models with a simple structure, each model describing the behavior of the system in a particular operating zone. Within this modeling framework, hybrid models [1, 2] are very successful in representing such processes.
Hybrid models characterize physical processes governed by continuous differential and difference equations and discrete variables. The process is described by several operating regimes called modes, and the transition from one mode to another is governed by the evolution of internal variables (input, output, state) or of external variables or events (e.g., the action of a human operator on the system). The global behavior obtained for the modelled complex system is strongly related to the nature of the procedure managing the transition from one mode to another. When this transition is abrupt, one obtains the class of switching models. This class of models is widely used because the well-mastered tools for the analysis and control of linear systems can be extended, under some assumptions, to this class of systems, and because many real processes can be represented by models belonging to this class.
Research on switching systems is mainly focused on the fields of identification [3-6], control [1, 7], stability analysis [7, 8], and state estimation [9, 10]. Knowing which mode describes the evolution of the system at any moment, this mode being called the active mode, is a crucial piece of information that simplifies the application of the various results coming from these fields. Ackerson and Fu [9] were the first to consider the determination of the active mode, by stating it as a state estimation problem in a noisy environment. The system noise is modelled by a set of Gaussian distributions with different means and variances that influence the system one at a time, the transition from one noise source to another being determined by a Markov transition matrix. In [5, 11-13], the recognition of the active mode is carried out by means of model-based diagnosis techniques. A methodology for the design of dynamic observers for hybrid systems is proposed in [10]. The suggested observer consists of two parts: a location observer, which is dedicated to the recognition of the active mode at any moment, and a continuous observer, which is devoted to the estimation of the continuous state once the active mode is recovered. Several observability concepts were introduced in [14-16]; they depend on the knowledge of the mode sequence and on the variables to be recovered, and they are characterized through linear algebraic tests.
The recognition of the active mode is closely related to the "proximity" of the models describing each mode. It is obviously easier if the switching law is known. From there, one can see the importance of the identification of the switching law.
This paper addresses the issue of the active mode determination for a switching system, using only the system input/output data. To perform this task, a model-based diagnosis method [17] is extended to this class of systems. We also put forward a procedure for the identification of the switching law. The paper starts in Section 2.1 with a brief reminder on the modeling of switching systems. The recognition of the active mode is developed in Section 3. The proposed method rests on model-based diagnosis methods. Then, the conditions guaranteeing the discernability of the various modes are formulated. The method is then enhanced in order to take into account the presence of measurement noise. Section 4 is devoted to the identification of the switching law. The proposed method offers an interval approach for the estimation of the parameters of the switching law. An academic example is shown in Section 5.

Modeling of switching systems
Let us consider the system represented by (1) as follows:

$$x(k+1) = A_{\mu_k} x(k) + B u(k), \qquad y(k) = C x(k), \qquad A_{\mu_k} \in \{A_1, A_2, \ldots, A_s\}. \tag{1}$$

Equation (1) represents a switching system with $s$ operating regimes or modes. The variables $u(\cdot)$, $y(\cdot)$, and $x(\cdot)$, respectively, stand for the input, the output, and the state of the system. The switches are introduced by means of the state matrix, which takes its value in a finite set that is known a priori. This formulation does not restrict the modeling of the switching system, and the results presented in this paper can be extended to the case where the matrices $B$ and $C$ also take different values. The variable $\mu_{(\cdot)}$ denotes the active mode at any moment. For example, if one has $\mu_k = i$, $i \in \{1, 2, \ldots, s\}$, the system is said to be in the mode $i$ at the instant $k$.
The evolution of the mode selection variable $\mu_{(\cdot)}$ can be described in a variety of ways. Here, we assume that $\mu_k$ is given by

$$\mu_k = i \quad \text{if } \xi_k \in H_i, \qquad i \in \{1, 2, \ldots, s\}, \tag{2}$$

where the switching law depends on the variable $\xi_k$. Each region $H_i$ is a convex polyhedron defined by the linear inequalities of (3). In order to let the piecewise affine map defined by (3) be well posed, we allow some of the "≤" inequalities to be strict, meaning they can be replaced by "<" inequalities.
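As a minimal sketch of how such a switching law can be evaluated, assume that each region is stored as a list of half-space constraints of the form a·ξ + b ≤ 0 (or < 0 when the inequality is strict). This storage format, the helper names `in_region` and `active_mode`, and the one-dimensional partition below are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

# One constraint: (a, b, strict) encodes a.xi + b <= 0, or < 0 when strict is True.
Constraint = Tuple[Sequence[float], float, bool]


def in_region(xi: Sequence[float], constraints: List[Constraint]) -> bool:
    """Return True when xi satisfies every half-space constraint of a region."""
    for a, b, strict in constraints:
        value = sum(ai * xi_i for ai, xi_i in zip(a, xi)) + b
        if value > 0 or (strict and value == 0):
            return False
    return True


def active_mode(xi, regions: Dict[int, List[Constraint]]) -> int:
    """Return the unique mode i with xi in H_i; fail if the partition is ill posed."""
    hits = [mode for mode, cons in regions.items() if in_region(xi, cons)]
    if len(hits) != 1:
        raise ValueError(f"xi={xi} belongs to {len(hits)} regions")
    return hits[0]


# Hypothetical 1-D partition of the real line: H_1 = {xi <= 0}, H_2 = {xi > 0}.
REGIONS = {
    1: [((1.0,), 0.0, False)],   #  xi <= 0
    2: [((-1.0,), 0.0, True)],   # -xi <  0
}

modes = [active_mode((x,), REGIONS) for x in (-2.0, 0.0, 3.5)]
```

Making the second inequality strict is what assigns the boundary point ξ = 0 to exactly one region, which is the well-posedness issue mentioned above.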
The variable $\xi_{(\cdot)}$ can be external to the system and, in that case, the mode sequence is arbitrary and independent of the system variables (input, output, and state). The switches from one mode to another can also be triggered by internal variables such as the state $x(\cdot)$ (piecewise affine systems [18]) or the input $u(\cdot)$ and the output $y(\cdot)$ (piecewise autoregressive exogenous systems [3]). We assume here that $\xi_k$ is defined by (4). It is worth noting that the definition of $\xi_k$ in (4) does not limit the significance of the contribution proposed in this paper (especially in Section 4), as the proposed method remains applicable as long as $\xi_k$ can be estimated (case of piecewise affine systems) or measured.
Combining (1), (2), and (4), we retain the model (5) as a model for switching systems in the remainder of this paper. The model (5) is intended in this paper to represent a switching system that does not switch at every time point, as can be the case for static converters. Hence, the system is assumed to have a minimum dwell time in a mode after a switching instant.
Starting from (5), we first wish to recover the active mode at any moment, using only the system input/output data on a finite observation window. If the system's modes are assumed to represent healthy operating modes as well as faulty operating modes, the active mode estimation task can be seen as a fault detection task. Once the active mode is recovered, the goal is to estimate, from a sufficiently rich sequence of input/output data, the parameters $H_i$, $i = 1, \ldots, s$, of the switching law, knowing its structure.

Definitions and notation
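We introduce the following definitions.

Definition 1. A path $\mu$ is a finite sequence of modes: $\mu = (\mu_1 \cdot \mu_2 \cdots \mu_h)$. The length of a path $\mu$ is denoted by $|\mu|$, and $\Theta_h$ denotes the set of all paths of length $h$. $\mu_{[i,j]}$ is the infix of the path $\mu$ between $i$ and $j$: $\mu_{[i,j]} = (\mu_i \cdot \mu_{i+1} \cdots \mu_j)$.

Definition 2. The observability matrix $O_{\mu,h}$ of a path $\mu \in \Theta_h$ is defined as

$$O_{\mu,h} = \begin{bmatrix} C \\ C A_{\mu_1} \\ C A_{\mu_2} A_{\mu_1} \\ \vdots \\ C A_{\mu_h} \cdots A_{\mu_1} \end{bmatrix}. \tag{6}$$

Definition 3. On a finite observation window $[k-h, k]$, the active path $\mu^*$ is the one describing the actual mode sequence on the observation window.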
From Definitions 1 and 3, the estimation of the active mode at any moment is equivalent to the determination of the path describing the true mode sequence on a finite observation window.In order to achieve this, throughout the remainder of this paper, we will focus on the recovery of the active path on an observation window.

Detection of the active path
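The active path determination task can be formulated as a recursive problem applied to a sliding window. On a time window $[k-h, k]$, (5) can be written as

$$\begin{bmatrix} y(k-h) \\ \vdots \\ y(k) \end{bmatrix} = O_{\mu,h}\, x(k-h) + T_{\mu,h} \begin{bmatrix} u(k-h) \\ \vdots \\ u(k) \end{bmatrix}, \tag{7}$$

where $T_{\mu,h}$ is a Toeplitz matrix defined by (8). Equation (7) can be written in a more compact way as follows:

$$Y_{k-h,k} - T_{\mu,h} U_{k-h,k} = O_{\mu,h}\, x(k-h). \tag{9}$$

The relation (9) links, on the time window, the input and the output of the system to the initial state $x(k-h)$ on the observation window. We introduce the following assumption.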

Assumption 1. The observability matrices O μ,h of the paths μ generated on the observation window [k
The existence of an integer h, such that Assumption 1 holds, was analyzed in [19] and is linked to pathwise observability that has been furthermore shown to be decidable.
Using Assumption 1, one can define a projection matrix $\Omega_{\mu,h}$ in such a way that $\Omega_{\mu,h} O_{\mu,h} = 0$, that is, $\Omega_{\mu,h}$ is selected as a basis for the left null space of $O_{\mu,h}$. (In fact, the existence of the projection matrix is directly linked to the observability of the system and to the length of the observation window [20].) Next, residuals $r_{\mu,h}(\cdot)$, independent of the initial state $x(k-h)$, can be defined as

$$r_{\mu,h}(k) = \Omega_{\mu,h}\left(Y_{k-h,k} - T_{\mu,h} U_{k-h,k}\right). \tag{10}$$

The residuals $r_{\mu,h}(\cdot)$ are useful for the determination of the active path on the observation window, and they only depend on measurable variables, namely, the system input and output. In fact, for the active path $\mu^*$, the residual $r_{\mu^*,h}(\cdot)$ equals zero.
Theorem 1. The active path $\mu^*$ describing the true mode sequence on a time window $[k-h, k]$ satisfies

$$r_{\mu^*,h}(k) = 0. \tag{11}$$

To recover the true mode sequence $\mu^*$ from the system measurements, one can proceed in the following way:
(i) first, all the possible paths of length $h$ are built on the time window $[k-h, k]$. This is equivalent to finding all the matrices $O_{\mu,h}$;
(ii) knowing the matrices $O_{\mu,h}$, the projection matrices $\Omega_{\mu,h}$ are easily calculated;
(iii) from the matrices $O_{\mu,h}$ and $\Omega_{\mu,h}$, one can form the residuals $r_{\mu,h}(\cdot)$ using the system measurements;
(iv) the active path is recovered from the system measurements by testing the residuals $r_{\mu,h}(\cdot)$, and it corresponds to the one whose residual equals zero.
Theorem 1 implicitly says that the observability matrices $O_{\mu,h}$ do not share the same null space. Section 3.2 will highlight the conditions that guarantee this implicit assumption.
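Steps (i)-(iv) can be sketched on a deliberately simple case. The code below assumes a scalar first-order system with y = x, so that each mode is a single coefficient. For that special case a left null-space basis of the path observability matrix can be written by hand (rows of the form (..., -a, 1, ...)), and the residual reduces to the vector of one-step prediction errors. The mode coefficients, input sequence, and function names are hypothetical.

```python
from itertools import product

# Hypothetical scalar modes: y(k+1) = A[mode] * y(k) + B * u(k), with y = x.
A = {1: 0.5, 2: -0.3, 3: 0.9}
B = 1.0


def simulate(y0, inputs, mode_seq):
    """Generate noise-free outputs for a given mode sequence."""
    y = [y0]
    for u, mode in zip(inputs, mode_seq):
        y.append(A[mode] * y[-1] + B * u)
    return y


def residual(path, y_win, u_win):
    """Parity residual of a path on a window of len(path) + 1 output samples.

    For this scalar model the rows (..., -a, 1, ...) annihilate the path's
    observability matrix, so the residual is the vector of one-step errors.
    """
    return [y_win[j + 1] - A[m] * y_win[j] - B * u_win[j]
            for j, m in enumerate(path)]


def detect_path(y_win, u_win, h, tol=1e-9):
    """Steps (i)-(iv): enumerate paths, form residuals, keep the zero ones."""
    return [path for path in product(sorted(A), repeat=h)
            if max(abs(r) for r in residual(path, y_win, u_win)) < tol]


true_modes = [1, 1, 2, 2, 3]
u = [1.0, 0.5, -1.0, 2.0, 0.3]
y = simulate(1.0, u, true_modes)

h = 2
# Slide the window [k - h, k] over the record and collect the detected paths.
detected = [detect_path(y[k - h:k + 1], u[k - h:k], h) for k in range(h, len(y))]
```

In this noise-free run each window yields exactly one path with a zero residual. For matrix-valued modes the hand-written annihilator would have to be replaced by a numerically computed left null space of the observability matrix.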

On the number of paths
It is easy to see that the enumeration of all paths on a time window $[k-h, k]$ introduces a problem of combinatorial explosion related to the number of modes and the length of the observation window. Indeed, the number of residuals $r_{\mu,h}(\cdot)$, $\mu \in \Theta_h$, to be calculated is equal to $s^h$ and quickly grows with the length $h + 1$ of the observation window and the number $s$ of modes. The use of all paths on a time window is therefore awkward and computationally demanding.
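The growth of the path set is easy to check numerically. The short sketch below enumerates the paths with `itertools.product`; the function name `all_paths` and the chosen values of s and h are illustrative only.

```python
from itertools import product


def all_paths(num_modes: int, h: int):
    """Enumerate every path of length h over modes 1..num_modes."""
    return list(product(range(1, num_modes + 1), repeat=h))


paths = all_paths(3, 2)                      # three modes, paths of length 2
counts = {h: 3 ** h for h in (2, 5, 10)}     # s**h for s = 3
```

With three modes, a path length of 2 already gives nine residuals per instant, and a path length of 10 gives 59049.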
In practice, not all paths $\mu \in \Theta_h$ have to be considered at every moment. When at a time $k_0$ the active path on an observation window $[k_0-h, k_0]$ is identified, it is not necessary to test the $s^h$ residuals at the next instant $k_0+1$. Only the paths $\mu \in \Theta_h$ with infixes $\mu_{[k_0-h+1,\,k_0-1]}$ identical to the infix $\mu^*_{[k_0+1-h,\,k_0-1]}$ of the path $\mu^*$ recovered previously at $k_0$ are considered at the next instant $k_0+1$.
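One way to read this reduction is sketched below, under the assumption that a path is stored as a tuple ordered from the oldest to the most recent mode. The new window then drops the oldest entry of the previously recovered path and appends one unknown mode, so only s candidates remain instead of $s^h$. The function name `next_candidates` is hypothetical.

```python
def next_candidates(prev_path, modes):
    """Paths to test at k0 + 1 given the path recovered at k0.

    The new window drops the oldest mode of prev_path and appends one
    unknown mode, so the shared infix is prev_path[1:].
    """
    infix = tuple(prev_path[1:])
    return [infix + (m,) for m in modes]


candidates = next_candidates((1, 1, 2), modes=(1, 2, 3))
```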
Moreover, assuming that the minimum dwell time in a mode is greater than the length of the observation window, one can limit the number of generated paths by only considering paths that describe the mode sequence when the system remains in the same mode over the whole observation window, that is, $\mu = (i \cdot i \cdots i)$, $i \in \{1, 2, \ldots, s\}$. Nevertheless, the reduction of the number of residuals comes at the expense of a delay in the estimation of the switching time from one mode to another. The recognition of the active path cannot take place as long as the switching instant is in the observation window. Thus, when applying this reduction of the number of paths, a maximum delay equal to the length of the observation window exists.
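The trade-off between fewer residuals and a detection delay can be illustrated with an idealised sketch. It assumes that the residual test is perfect, so a constant path is recognised exactly when the window contains no switch. The function names and the mode sequence are hypothetical.

```python
def constant_paths(modes, h):
    """Paths where the system stays in one mode over the whole window."""
    return [(m,) * h for m in modes]


def constant_path_detection(mode_seq, h):
    """Mode detected on each window of h transitions, or None during a switch.

    Idealised: a constant path is taken to match exactly when every mode
    inside the window is identical.
    """
    out = []
    for k in range(h, len(mode_seq) + 1):
        window = mode_seq[k - h:k]
        out.append(window[0] if len(set(window)) == 1 else None)
    return out


paths = constant_paths((1, 2, 3), h=2)
detection = constant_path_detection([1, 1, 1, 2, 2, 2], h=2)
```

Only s residuals are evaluated per instant, and the `None` entry marks the window that still contains the switching instant, which is where the delay comes from.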
Prior knowledge of the process, such as "prohibited" switching sequences or minimal time between two consecutive switches, can also help to limit the number of generated residuals or paths to be considered.
In a practical implementation, the methodology should be to first compute a reduced set of residuals composed of the residuals linked to paths that describe the mode sequence when the system remains in the same mode on the time window.From this initial set of residuals, a reduced set of residuals can be considered at each time instant, depending on the previously recovered path.This operation considerably reduces the computing load.

Discernability of the modes
In what follows, we are interested in the conditions guaranteeing the discernability of the various paths enumerated on an observation window. These conditions ensure the uniqueness of the recovered active path $\mu^*$ during the path recognition process. Discernability guarantees that two different paths never induce the same system dynamics on a finite time window.

Definition 4. Two paths $\mu_1 \in \Theta_h$ and $\mu_2 \in \Theta_h$ are discernible on an observation window $[k-h, k]$ if their respective residuals $r_{\mu_1,h}(\cdot)$ and $r_{\mu_2,h}(\cdot)$ are not simultaneously null when one of the two paths is active on the considered observation window.
Path discernability conditions have also been investigated by other authors such as Babaali and Egerstedt [14], Hwang et al. [21], and Vidal et al. [16]. The difference here is that the study of the path discernability conditions is not performed independently of the active mode observer but also takes into account the characteristics of the mode observer, thanks to the analysis of the residuals $r_{\mu,h}(\cdot)$.
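In order to establish the discernability conditions of two different paths, let us consider two paths $\mu_1 \in \Theta_h$ and $\mu_2 \in \Theta_h$, and denote by $Y^1_{k-h,k}$ (resp., $Y^2_{k-h,k}$) the system output vector when the active path is $\mu_1$ (resp., $\mu_2$). We suppose that at an instant $k$, the active path on the observation window is the path $\mu_1$. This information being unknown, we have to analyze whether the path $\mu_1$ or the path $\mu_2$ is consistent with the system data. From (10), the expressions of the residuals $r_{\mu_1,h}(\cdot)$ and $r_{\mu_2,h}(\cdot)$ are given by

$$r_{\mu_1,h}(k) = \Omega_{\mu_1,h}\left(Y_{k-h,k} - T_{\mu_1,h} U_{k-h,k}\right), \qquad r_{\mu_2,h}(k) = \Omega_{\mu_2,h}\left(Y_{k-h,k} - T_{\mu_2,h} U_{k-h,k}\right). \tag{12}$$

Since $\mu_1$ is the active path on the observation window, $Y_{k-h,k} = Y^1_{k-h,k}$ and (12) can be written as

$$r_{\mu_1,h}(k) = \Omega_{\mu_1,h}\left(Y^1_{k-h,k} - T_{\mu_1,h} U_{k-h,k}\right), \qquad r_{\mu_2,h}(k) = \Omega_{\mu_2,h}\left(Y^1_{k-h,k} - T_{\mu_2,h} U_{k-h,k}\right),$$

and, by definition, one also has $r_{\mu_1,h}(k) = 0$. Adding and taking away $Y^2_{k-h,k}$ in the expression of $r_{\mu_2,h}(k)$ gives

$$r_{\mu_2,h}(k) = \Omega_{\mu_2,h}\left(Y^1_{k-h,k} - Y^2_{k-h,k}\right) + \Omega_{\mu_2,h}\left(Y^2_{k-h,k} - T_{\mu_2,h} U_{k-h,k}\right).$$

As by definition $\Omega_{\mu_2,h}\left(Y^2_{k-h,k} - T_{\mu_2,h} U_{k-h,k}\right) = 0$, one obtains

$$r_{\mu_2,h}(k) = \Omega_{\mu_2,h}\left(Y^1_{k-h,k} - Y^2_{k-h,k}\right). \tag{16}$$

Equation (16) clearly points out that the residual calculated for the path $\mu_2$ (nonactive path) directly depends on the difference between the system outputs when the mode sequence evolves according to the two paths $\mu_1$ and $\mu_2$, the system being excited by the same inputs in both cases. From (16), a necessary and sufficient condition for the discernability of the paths $\mu_1$ and $\mu_2$ is

$$Y^1_{k-h,k} - Y^2_{k-h,k} \notin \mathcal{N}_r\left(\Omega_{\mu_2,h}\right),$$

where $\mathcal{N}_r$ stands for the operator "right null space." According to (9), one has

$$Y^1_{k-h,k} - Y^2_{k-h,k} = \left(O_{\mu_1,h} - O_{\mu_2,h}\right) x(k-h) + \left(T_{\mu_1,h} - T_{\mu_2,h}\right) U_{k-h,k}, \tag{18}$$

where $x(k-h)$ is the value of the system state at the initial instant of the observation window. Since $\Omega_{\mu_2,h} O_{\mu_2,h} = 0$, one deduces from (18), after multiplication on the left by $\Omega_{\mu_2,h}$, that

$$\Omega_{\mu_2,h}\left(Y^1_{k-h,k} - Y^2_{k-h,k}\right) = \Omega_{\mu_2,h} O_{\mu_1,h}\, x(k-h) + \Omega_{\mu_2,h}\left(T_{\mu_1,h} - T_{\mu_2,h}\right) U_{k-h,k}.$$

If $Y^1_{k-h,k} - Y^2_{k-h,k}$ belongs to the right null space of $\Omega_{\mu_2,h}$, one has

$$\Omega_{\mu_2,h} O_{\mu_1,h}\, x(k-h) + \Omega_{\mu_2,h}\left(T_{\mu_1,h} - T_{\mu_2,h}\right) U_{k-h,k} = 0. \tag{20}$$

The relation (20) is satisfied "for almost every initial state" $x(k-h)$ (see Remark 1 for the explanation of this expression) if the following necessary and sufficient condition is satisfied:

$$\Omega_{\mu_2,h} O_{\mu_1,h} = 0 \quad \text{and} \quad \Omega_{\mu_2,h}\left(T_{\mu_1,h} - T_{\mu_2,h}\right) U_{k-h,k} = 0. \tag{21}$$

Therefore, the paths $\mu_1$ and $\mu_2$ are not discernible on a time window $[k-h, k]$ if the relations (21) are satisfied.

Theorem 2. Two paths $\mu_1$ and $\mu_2$ of a switching system are discernible on an observation window $[k-h, k]$, "for almost every initial state" $x(k-h)$, if

$$\Omega_{\mu_2,h} O_{\mu_1,h} \neq 0 \tag{22}$$

or

$$\Omega_{\mu_2,h}\left(T_{\mu_1,h} - T_{\mu_2,h}\right) U_{k-h,k} \neq 0. \tag{23}$$

The proof of this theorem directly comes from the preceding remarks.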
When the paths $\mu_1$ and $\mu_2$ are of the type $(i \cdot i \cdots i)$, $i \in \{1, 2, \ldots, s\}$, Theorem 2 is equivalent to the mode discernability conditions formulated in [11].

Remark 1. In Theorem 2, the expression "for almost every initial state" holds owing to the fact that the discernability of the paths cannot be ensured for every initial state $x(k-h)$. In fact, for certain particular values of $x(k-h)$, the relation (20) is always satisfied, whatever the input sequence $U_{k-h,k}$. For example, in the situation where $O_{\mu_1,h}$ has full rank, for $x(k-h) = (O_{\mu_1,h})^{\dagger}\left(\Phi - (T_{\mu_1,h} - T_{\mu_2,h})\, U_{k-h,k}\right)$, (20) is satisfied for every input sequence $U_{k-h,k}$, where $\Phi$ belongs to the right null space of $\Omega_{\mu_2,h}$ and $(O_{\mu_1,h})^{\dagger}$ is a pseudoinverse of $O_{\mu_1,h}$.
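The input-independent part of the discernability test, namely checking that the product $\Omega_{\mu_2,h} O_{\mu_1,h}$ is nonzero for every pair of distinct paths, can be sketched for a scalar first-order system with y = x. In that special case both matrices have a closed form that is written out by hand below. The mode coefficients and function names are hypothetical, and matrix-valued modes would need a numerical null-space computation instead.

```python
from itertools import product

A = {1: 0.5, 2: -0.3, 3: 0.9}   # hypothetical scalar state "matrices"


def observability(path):
    """O_mu for y = x: [1, a_1, a_2 a_1, ...] stacked as a column."""
    col = [1.0]
    for m in path:
        col.append(A[m] * col[-1])
    return col


def annihilator(path):
    """Rows spanning the left null space of observability(path)."""
    h = len(path)
    rows = []
    for j, m in enumerate(path):
        row = [0.0] * (h + 1)
        row[j], row[j + 1] = -A[m], 1.0
        rows.append(row)
    return rows


def omega_times_o(path2, path1):
    """Compute Omega_{mu2} O_{mu1} (a column with one entry per row)."""
    o1 = observability(path1)
    return [sum(w * o for w, o in zip(row, o1)) for row in annihilator(path2)]


paths = list(product(sorted(A), repeat=2))
# Pairs of distinct paths for which Omega_{mu2} O_{mu1} vanishes.
indiscernible = [(p1, p2) for p1 in paths for p2 in paths
                 if p1 != p2 and all(abs(v) < 1e-12 for v in omega_times_o(p2, p1))]
```

Here the list of indiscernible pairs is empty because the three coefficients are distinct and nonzero; two modes sharing the same coefficient would produce indiscernible pairs.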

Determination of the active mode in a noisy environment
In Section 3.1, the determination of the active mode at any moment was carried out within a deterministic framework, that is, there was no noise on the system measurement. Now, we assume the presence of a bounded noise on the output of the system described by (5). The only available information on the noise is its maximum magnitude. No assumption is made on the probability distribution of the measurement noise:

$$y(k) = C x(k) + n(k), \qquad |n(k)| \le \delta,$$

where $\delta$ is the bound of the measurement noise magnitude $n(\cdot)$.
In this situation, the residual $r_{\mu^*,h}(\cdot)$, defined by (11) and corresponding to the active path $\mu^*$ on the time window $[k-h, k]$, is no longer equal to zero. Indeed, using (10), the expression of the residual $r_{\mu^*,h}(\cdot)$ becomes

$$r_{\mu^*,h}(k) = \Omega_{\mu^*,h}\, N_{k-h,k},$$

where $N_{k-h,k}$ stacks the values taken by the measurement noise on the observation window. Using the bound of the measurement noise magnitude, we can define an interval residual $[r_{\mu^*,h}(k)]$ [22]:

$$[r_{\mu^*,h}(k)] = \left[\underline{r}_{\mu^*,h},\ \overline{r}_{\mu^*,h}\right],$$

where $\underline{r}_{\mu^*,h}$ and $\overline{r}_{\mu^*,h}$ depend on the bound $\delta$ of the measurement noise and are given by $\underline{r}_{\mu^*,h} = -|\Omega_{\mu^*,h}|\,\mathcal{U}\delta$ and $\overline{r}_{\mu^*,h} = |\Omega_{\mu^*,h}|\,\mathcal{U}\delta$, $\mathcal{U}$ being a column vector of length equal to the number of columns of $\Omega_{\mu^*,h}$, with all its elements equal to 1.
In an interval framework, the determination of the active path amounts to seeking the path that corresponds to an interval residual including the value zero. This test can be performed by calculating the sign of the product of the upper and lower bounds of each interval residual $[r_{\mu,h}(\cdot)]$. The interval residual associated with the active path $\mu^*$ is the one for which the sign of the product of its upper and lower bounds is negative.
Depending on the evolution of the dynamics of the various operating regimes, it can happen that more than one interval residual contains the value zero, this situation being linked to the path discernability and to the bound of the measurement noise magnitude. In this case, one refrains from making any decision on the active path: one can only enumerate the set of all possible active paths. However, considering successive time instants $k+1, k+2, \ldots$, the situation may be clarified.
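The zero-inclusion test and the "no decision" rule can be sketched as follows. The interval bounds are hypothetical numbers, the function names are illustrative, and the test uses `<= 0` rather than a strictly negative product so that an interval with a bound exactly at zero still counts as containing zero.

```python
def contains_zero(lower, upper):
    """True when every component interval [lower_i, upper_i] includes zero."""
    return all(lo * up <= 0 for lo, up in zip(lower, upper))


def decide(interval_residuals):
    """Return the active path, or None when zero or several paths are consistent."""
    consistent = [path for path, (lo, up) in interval_residuals.items()
                  if contains_zero(lo, up)]
    return consistent[0] if len(consistent) == 1 else None


# Hypothetical interval residuals for three constant paths at one instant.
clear = {
    (1, 1): ([-0.2, -0.1], [0.3, 0.4]),
    (2, 2): ([0.5, -0.1], [1.1, 0.4]),
    (3, 3): ([-2.0, -1.5], [-0.7, -0.2]),
}
# Same instant, but the interval of path (2, 2) now also straddles zero.
ambiguous = dict(clear)
ambiguous[(2, 2)] = ([-0.1, -0.1], [0.6, 0.4])

decision_clear = decide(clear)
decision_ambiguous = decide(ambiguous)
```

Returning `None` in the ambiguous case leaves room for a later consistency check over successive instants.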
Note that it is also conceivable to introduce some probabilistic modeling assumptions on the output noise and then refer to a statistical test like the CUSUM algorithm [23] to recover the active path from the analysis of the generated residuals.

IDENTIFICATION OF THE SWITCHING LAW
Once the recognition of the active mode at every moment is performed, the next step is to proceed to the identification of the parameters of the switching law described by (5).
The identification of the switching law aims at finding a complete partition of the regressors set into $s$ polyhedral regions such that $\mu_k = i$ if $\xi_k \in H_i$, $i \in \{1, 2, \ldots, s\}$. This problem amounts to separating $s$ sets of points by means of linear classifiers (hyperplanes). Depending on the active mode estimation process, the resulting $s$ sets of points may be linearly separable or not (due to noise or misclassification). In the literature, robust linear programming (RLP) [24] and support vector machines (SVM) [25] methods are employed.
We consider here another way to determine the parameters of the $s$ polyhedral regions. An interval approach is adopted. The interval representation allows one to look for a set of acceptable values for the switching law, this set being of a simple geometrical form. The computation of the set of all feasible separating hyperplanes is also useful for characterizing the model uncertainties.

Determination of the switching law parameters in an interval form
We assume that from model (5), one obtains a dataset $D = \{\xi_k^T,\ k = 1, \ldots, N\}$. After proceeding to the active mode recognition, the dataset $D$ can be partitioned into $s$ classes $C_i$, $i = 1, \ldots, s$, using the following classification rule: $\xi_k \in C_i$ if $\mu_k = i$. From the classes $C_i$, $i = 1, \ldots, s$, the determination of the parameters $H_i$, $i = 1, \ldots, s$, amounts to separating the $s$ classes using linear classifiers which are, in this case, hyperplanes. This can be done either by considering all the $s$ classes together at the same time (one-against-all and all-together approaches) or by considering them pairwise (one-against-one approach). Here, we adopt the one-against-one approach, which considers all possible pairs of classes. Let us consider two classes $C_i$ and $C_j$ with $i \neq j$. To separate $C_i$ and $C_j$, we need to compute a hyperplane parametrized by a vector $h_{ij} \in \mathbb{R}^{n_a+n_b+1}$.
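The classification rule and the one-against-one pairing can be sketched in a few lines. The regressor values and detected modes below are hypothetical, and instants where no decision could be taken on the active mode are represented by `None` and skipped.

```python
from collections import defaultdict
from itertools import combinations


def partition(regressors, detected_modes):
    """Group each regressor xi_k under the mode detected at instant k."""
    classes = defaultdict(list)
    for xi, mode in zip(regressors, detected_modes):
        if mode is not None:          # skip instants with no decision
            classes[mode].append(xi)
    return dict(classes)


# Hypothetical regressors (y_{k-1}, u_{k-1}) and detected modes.
xis = [(0.1, 1.0), (0.4, -0.5), (-0.3, 0.2), (0.9, 0.0), (-1.2, 0.7)]
modes = [1, 1, 2, None, 3]

classes = partition(xis, modes)
pairs = list(combinations(sorted(classes), 2))   # one-against-one problems
```

With three modes this yields three pairwise separation problems, one per vector $h_{12}$, $h_{13}$, and $h_{23}$.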
Using the system description (5), one can write the relation (30) for any data $\xi_k$ belonging to $C_i$ or $C_j$:

$$\nu_k\, h_{ij}\, \varphi_k^T > 0, \qquad k \in I,$$

where $\nu_k = \operatorname{sign}(h_{ij}\varphi_k^T)$ or, alternatively, $\nu_k = +1$ on one of the two classes and $\nu_k = -1$ on the other, and $I$ is a set containing the time instants at which the mode $i$ or the mode $j$ was detected during the mode recognition process. The constant $N_{ij}$ is the sum of the cardinality of $C_i$ and the cardinality of $C_j$. Considering (30) for all the $N_{ij}$ data $\xi_k$, one obtains a set of inequalities that can be expressed in the form of a linear matrix inequality (32).
The resolution of the LMI (32) gives a domain of acceptable solutions to which the parameters $h_{ij}$ belong. Generally, solving (32) leads to a "complex" domain, that is, a domain described with a huge number of vertices. To reduce this complexity, one can look for a simpler polytopic form describing a reduced domain of acceptable solutions. Here, we look for a zonotope. For example, the first graph of Figure 1 represents the projection in $\mathbb{R}^2$ of the domain found for the dataset in Table 1 with $\varphi_k = (y_{k-1}\ \ u_{k-1}\ \ 1)$, $h_{ij} = (1\ \ \alpha\ \ \beta)$, $\alpha \in \mathbb{R}$, and $\beta \in \mathbb{R}$. This domain is depicted in the plane $\{\alpha, \beta\}$ on the graph in Figure 1(a) and corresponds to the set of inequalities (33) obtained from (32). All the points belonging to this domain are particular acceptable solutions. The symbol "o" highlights one of those acceptable solutions and may be, for example, the one resulting from the implementation of an interior-point algorithm. The graph in Figure 1(b) presents a suboptimal solution (grey area) that simplifies the description of the domain found in the form of independent inequalities with respect to $\alpha$ and $\beta$ (34).

The determination of a zonotope characterizing the set of acceptable solutions is equivalent to the determination of the parameters $h_{ij}$ in an interval form. For that, many optimization criteria can be chosen. For example, one can force the widths of the intervals to be maximal while respecting the system constraints. The parameters $h_{ij}$ are described in an interval form by

$$h_{ij} = h_{ij0} + \lambda \otimes r_{h_{ij}}, \qquad \|\lambda\|_\infty \le 1, \qquad r_{h_{ij}} > 0, \tag{35}$$

where $h_{ij0}$ is the vector containing the centres of the searched intervals, $r_{h_{ij}}$ represents the half widths of the intervals, and the variables $\lambda_{(\cdot)}$ are bounded normalized variables that allow all the values inside an interval to be taken into account. The operator $\otimes$ performs a componentwise product of two vectors. We recall that, for any vector $e \in \mathbb{R}^n$, one has $\|e\|_\infty = \max_{1 \le i \le n} |e_i|$, $e_i$ being the $i$th component of the vector $e$. The inequality holding on the vector $r_{h_{ij}}$ is a componentwise inequality.
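Using the interval form of $h_{ij}$ (35), (30) can be rewritten as

$$\nu_k \left(h_{ij0} + \lambda \otimes r_{h_{ij}}\right) \varphi_k^T > 0 \quad \text{for all } \|\lambda\|_\infty \le 1, \qquad k \in I, \qquad r_{h_{ij}} > 0.$$

Hence, taking the worst case over $\lambda$, one has

$$\nu_k\, h_{ij0}\, \varphi_k^T - r_{h_{ij}} |\varphi_k|^T > 0, \qquad k \in I, \qquad r_{h_{ij}} > 0. \tag{37}$$

Finally, to find $h_{ij0}$ and $r_{h_{ij}}$, we have to look for intervals with maximal half widths while respecting the constraints (37). A natural choice can be to maximize the volume of the zonotope. This leads to the constrained optimization problem (38):

$$\max_{h_{ij0},\, r_{h_{ij}}} \ \prod_{m} r_{h_{ij},m} \quad \text{subject to (37)}, \tag{38}$$

where $r_{h_{ij},m}$ is the $m$th component of the vector $r_{h_{ij}}$.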
One can then use classical optimization algorithms [26] for the resolution of (38). The constrained optimization problem (38) has to be solved for all pairs of classes $\{C_i, C_j\}$, $i \neq j$ and $i, j \in \{1, 2, \ldots, s\}$.
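Before running an optimizer, it can be useful to check whether a candidate interval vector is admissible. The sketch below assumes that the constraints take the usual worst-case form for a box: the nominal margin of the centre minus the sum of the half widths weighted by the absolute regressor components must stay positive for every data point. It does not solve (38); it only tests feasibility and evaluates the volume criterion. The data, the centre, and the function names are hypothetical.

```python
from math import prod


def worst_case_margin(h0, r, phi, nu):
    """Smallest value of nu * h . phi over the box |h - h0| <= r (componentwise)."""
    nominal = nu * sum(c * p for c, p in zip(h0, phi))
    spread = sum(w * abs(p) for w, p in zip(r, phi))
    return nominal - spread


def box_is_feasible(h0, r, data):
    """True when every h in the box classifies every labelled regressor correctly."""
    return all(worst_case_margin(h0, r, phi, nu) > 0 for phi, nu in data)


def box_volume(r):
    """Volume of the axis-aligned box with half widths r."""
    return prod(2.0 * w for w in r)


# Hypothetical labelled regressors phi_k = (y_{k-1}, u_{k-1}, 1) with nu_k = +/-1.
data = [((1.0, 0.5, 1.0), 1), ((2.0, -0.5, 1.0), 1),
        ((-1.0, 0.5, 1.0), -1), ((-2.0, -1.0, 1.0), -1)]
h0 = (1.0, 0.0, 0.0)

small_ok = box_is_feasible(h0, (0.1, 0.1, 0.1), data)
too_wide = box_is_feasible(h0, (0.6, 0.6, 0.6), data)
```

An optimizer for (38) would search over the centre and the half widths to make `box_volume` as large as possible while `box_is_feasible` remains true.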
Remark 2. The aim of the method presented in this section is to propose a simpler description of a geometrical domain represented by a set of inequalities. Hence, the method is in some way independent of the linear separability of the classes, and it will work, whether the classes are linearly separable or not, as long as the initial geometrical domain (32) exists. Moreover, the inner zonotopic approximation introduces some conservatism, but this is not a major drawback: it is the price to pay to obtain a very simple geometrical description of the initial domain (32).

EXAMPLE
We present here an academic example of a switching system. The simulated system is characterized by three modes, and the matrices of the models describing the different modes are given in (39). The modes 1, 2, and 3 of the system are represented by second-order models with $K_1 = -1.464$, $K_2 = 2.002$, and $K_3 = -0.514$ as respective gains and the couples $(-0.211;\ 0.521)$, $(0.691;\ -0.310)$, and $(0.153;\ 0.410)$ as respective pairs of poles. The switching law is characterized by (40), with $h_{12} = (1\ \ 0.51\ \ 0)$, $h_{13} = (0\ \ 1\ \ 0)$, and $h_{23} = (1\ \ {-0.29}\ \ 0)$. Figure 2 shows the input $u(\cdot)$, the output $y(\cdot)$, the state $x(\cdot)$, and the mode sequence $\mu_{(\cdot)}$. The vertical dashed lines on the third graph of Figure 2 mark the time instants at which switches occur. The fourth graph plots the mode sequence described by the mode selection variable $\mu_{(\cdot)}$. For instance, on the time windows $[1, 8]$ and $[9, 17]$, the system is, respectively, in the modes 1 and 2.
As $\Omega_{\mu_i,h} O_{\mu_j,h} \neq 0$ for $\mu_i, \mu_j \in \Theta_2$, $\mu_i \neq \mu_j$, $\Theta_2$ being the set of all paths of length 2, the condition (22) of Theorem 2 is respected. Condition (23) is tested at every moment. If it is not satisfied, no decision is taken concerning the recognition of the active path.
In order to perform the determination of the active path at every moment from the system input and output signals, we consider an observation window of length 3. The set $\Theta_2$ of all paths of length 2 on the observation window corresponds to the set of the nine paths in Table 2.
Figure 3 presents the evolution of the calculated residuals. The different graphs on the figure show the residuals associated with the paths of Table 2. Only one of the nine residuals equals zero at each instant, the index $(i \cdot j)$ of this residual corresponding to the active path on the considered time window. For example, from time $k = 1$ to $k = 6$, the residual $r_{(1 \cdot 1),h}(\cdot)$ (first row and first column of Figure 3) equals zero, meaning that the path $(1 \cdot 1)$ is the active one on the observation window. Hence, the active mode on the time window $[1, 6]$ is the mode 1. At $k = 7$, only the residual $r_{(1 \cdot 2),h}(\cdot)$ equals zero, meaning that the path $(1 \cdot 2)$ becomes the active one. From there, and taking into account the length of the observation window, the occurrence of a switch at $k = 9$ is highlighted. Figure 4 depicts the mode sequence (first graph) and its estimation obtained by analyzing the residuals (second graph). The figure shows that the mode sequence is exactly reconstructed.
As explained in Section 3.2, in order to reduce the number of residuals to be analyzed during the mode recognition process, one can consider only the paths describing the mode sequence when the system remains in the same mode all over the duration of the observation window.In this case, only the paths (1 • 1), (2 • 2), and (3 • 3) have to be considered.
For the system described by matrices (39), we proceed to the recognition of the active path in a noisy environment where the system output is subject to the effect of a bounded noise.In this situation, to make the analysis simpler, we only consider the three paths (1 • 1), (2 • 2), and (3 • 3).The active mode recognition is performed by analysing interval residuals as presented in Section 3.4.On Figure 5, the three interval residuals are shown in dashed lines.One can notice that only one of the three interval residuals includes at any moment the value zero, this residual being associated with the active path on the observation window.
The second and the third graphs of Figure 6 illustrate the results of the active path detection by analyzing the interval residuals. The second graph shows the modes detected while testing the membership of the value zero to the interval residuals. Although the modes are rather well detected, there are situations where it was impossible to provide an estimate of $\mu_{(\cdot)}$ because more than one interval residual, or none of the three interval residuals, contains the value zero. On the second graph of Figure 6, the points with a Y-coordinate equal to zero mark this kind of situation, which is due to the presence of noise and to the fact that not all the possible paths on the observation window are considered in the analysis. The third graph of Figure 6 is obtained by testing the coherence in the succession of the detected active paths at consecutive moments. This is equivalent to the path reduction method presented in Section 3.2 using the infix of the detected active path. One can notice a perfect reconstruction of the mode sequence.
Once the active mode recognition is performed, we can proceed to the identification of the parameters of the switching law defined by (40). The dataset is assumed to be representative enough of the system's various operating regimes. The constrained optimization problem (38) is solved by using an iterative algorithm. The results are presented in Table 3.
In Table 3, the centres of the intervals found are indicated by h (•) 0 , the half widths by r h(•), and the real values are represented by h (•) .From Table 3, the intervals found are (41) While analyzing the estimated values in Table 3, one can see that the estimated intervals for the switching law parameters
2 are not discernible on a time window [k − h, k] if the relations (21) are satisfied.
Theorem 2. Two paths μ 1 and μ 2 of a switching system are discernible on an observation window [k − h, k], "for almost every initial state" x(k − h), if

Table 2: Set of all paths of length 2.