Differential Neural Networks for Identification and Filtering in Nonlinear Dynamic Games

This paper deals with the problem of identifying and filtering a class of continuous-time nonlinear dynamic games (nonlinear differential games) subject to additive and undesired deterministic perturbations. Moreover, the mathematical model of this class is completely unknown with the exception of the control actions of each player, and even though the deterministic noises are known, their power (or their effect) is not. Therefore, two differential neural networks are designed in order to obtain a feedback (perfect state) information pattern for the mentioned class of games. In this way, the stability conditions for two state identification errors and for a filtering error are established, the upper bounds of these errors are obtained, and two new learning laws for each neural network are suggested. Finally, an illustrative example shows the applicability of this approach.


Introduction
1.1. Preliminaries and Motivation. Nowadays, a research field that has been widely developed is the design of controllers for a group of systems involved in a conflicting interaction, that is to say, when the objective of each implicated system differs and when the known information about this interaction may be distinct for every system (e.g., [1]).
More formally, this special class of system groups can be represented by means of the dynamic noncooperative game theory, where the decision making process (or interaction) is called a game and each involved system in it is called a player [2].
From the viewpoint of control theory, a dynamic game is controlled by obtaining its equilibrium solution, and in order to do so, a (mathematical) model of this dynamic game is needed. Thus, several publications about dynamic games (and particularly about continuous-time dynamic games) are based on the complete knowledge of the model that describes their dynamics (see, e.g., [3, 4]). Nevertheless, having a model (or even a partial model) of a continuous-time dynamic game is not always possible.
On the other hand, the equilibrium solution of a dynamic game is also based on the information structure that every player has or, in other words, on the available information that each player can use in its control strategy. For example, one can obtain an open-loop Nash equilibrium solution by using the maximum principle technique or a feedback Nash equilibrium solution by utilizing the dynamic programming method (see [2]).
According to the above, the aim of an identification process in terms of a dynamic game should be both the modeling of such a game and the obtaining of its information structure, that is, the guarantee that the control strategy of every player gives an equilibrium solution for the game despite its dynamic uncertainties. In this way, the works of [5, 6] obtain feedback control strategies for differential games modeled through norm-bounded uncertainties; the studies of [7-9] achieve equilibrium solutions for several classes of differential games under a multimodel approach; the analyses of [10, 11] present adaptive algorithms for determining equilibrium solutions without complete knowledge of the game dynamics; and the work of [12] proposes obtaining a suboptimal equilibrium solution where a differential game is approximated by a fuzzy model.

Mathematical Problems in Engineering
However, an identification process could be deficient if there exist undesired perturbations in the game dynamics, that is to say, if the obtained information structure of the game is corrupted by (deterministic) noises. Thereby, some deterministic filtering works have been published in order to solve this issue. For example, [13] describes the concept of adaptive noise canceling for the estimation of signals corrupted by additive perturbations; [14] introduces a finite-horizon robust H∞ filtering method that provides a guaranteed bound for the estimation error in the presence of both parameter uncertainty and a known input signal; [15] presents the filtering of the states of a time-varying uncertain linear system with deterministic disturbances of limited power; and [16] derives an optimal filtering formula for linear time-varying discrete systems with unknown inputs.
Therefore, according to these preliminary works, the motivation of this paper is to solve the problem of identifying and filtering a class of nonlinear differential games with additive deterministic perturbations, where the mathematical model of this class and the effect (or power) of the noises are completely unknown.

1.2. Main Contribution.
Since the introduction of continuous-time recurrent neural networks (see [17]), the self-named differential neural networks have proved to be an excellent tool in the identification, state estimation, and control of several systems and of the aforementioned continuous-time dynamic games.
For example, in [18], differential neural networks are used for the identification of dynamical systems; references [19-21] design differential neural network observers for adaptive state estimation; the works of [22, 23] propose neural network controllers for several applications; and [24] shows a compendium of differential neural networks for identification, state estimation, and control of nonlinear systems. Also, [25] treats the state estimation problem for affine nonlinear differential games using a differential neural network observer, and in [26], a nearly optimal Nash equilibrium for classes of deterministic and stochastic nonlinear differential games is obtained using differential neural networks.
Moreover, speaking of recurrent neural networks and deterministic filtering, the work of [27] develops a recurrent neural network for robust optimal filter design in an H∞ approach, and in [28], algorithms are presented in order to obtain adaptive filtering in nonlinear dynamic systems approximated by neural networks.
Nevertheless, the idea of using differential neural networks for identification and filtering of a class of continuous-time nonlinear dynamic games is a new approach that, as far as the authors know, has not been treated before.
Hence, the main contribution of this paper is the proof that it is possible to identify and to filter the states of a certain class of nonlinear differential games through the design of two differential neural networks. Moreover, this filtered identification process generates a feedback (perfect state) information pattern for the mentioned class of games.
More specifically, although the structure of this class is known, its mathematical model is not; that is to say, the only available information about the nonlinear differential game is the control actions of each player. So, by using only this available information, a first differential neural network will be designed for the identification of the nonlinear dynamic game with the undesired perturbations, and, similarly, a second differential neural network will identify the effect of these additive noises on the dynamics of the nonlinear differential game; recall that the perturbations themselves are known but their power is not.
According to the above, one of these two differential neural networks performs the complete identification process of the class of nonlinear differential games, and, notably, the filtering (or canceling) of the undesired perturbations is achieved by subtracting the state estimates of the two differential neural networks.
Finally, it is important to emphasize that these two differential neural networks have the structure of multilayer perceptrons (see [23, 24, 29]) and that, by means of Lyapunov's second method of stability, the learning laws for their synaptic weights are derived.

Class of Nonlinear Differential Games
Consider the following continuous-time nonlinear dynamic game given by

ẋ_t = f(x_t) + ∑_{i=1}^{N} g_i(x_t) u_{i,t} + Γ ξ_t,    (1)

where t ∈ [0, ∞); the index i = 1, 2, 3, . . ., N denotes the number of players; x_t ∈ R^n is the state vector of the game; u_{i,t} ∈ U_adm denotes the admissible control action vector of each player; the mappings f : R^n → R^n and g_i : R^n → U_adm are unknown nonlinear functions; ξ_t denotes a known deterministic perturbation vector; and Γ is an unknown constant matrix of adequate dimensions.
Similarly, consider now the following set of cost functions (or performance indexes) associated with each player and given by

J_i := ∫_0^∞ h_i(x_t, u_{i,t}) dt,    i = 1, 2, 3, . . ., N,    (2)

where h_i : R^n × U_adm → R is well-defined for the i-th player. Moreover, the information structure of each player (denoted by η_{i,t}) has a standard feedback (perfect state) pattern; that is to say,

η_{i,t} = {x_t},    t ∈ [0, ∞),    (3)

and a permissible control strategy (or control policy) of the i-th player is defined by the set of functions γ_i(η_{i,t}) satisfying

u_{i,t} = γ_i(η_{i,t}).    (4)

Nevertheless, the class of nonlinear differential games (1)-(4) is not completely described if the following assumptions are not fulfilled.
where  ∈ R is a known constant.

Problem Statement
Let the class of continuous-time nonlinear dynamic games (1)-(4) be such that Assumptions 1 to 4 are fulfilled. Then, if one makes the following change of variables:

ẏ_t = f(y_t) + ∑_{i=1}^{N} g_i(y_t) u_{i,t} + Γ ξ_t,    (10)

ż_t = Γ ξ_t,    (11)

it is clear that the expected or uncorrupted state vector of the class of nonlinear differential games (1)-(4) can be defined as

x̄_t := y_t − z_t.    (12)

Thereby, and in view of the fact that f(⋅), g_i(⋅), and Γ are unknown, the problem tackled in this paper is to obtain a feedback (perfect state) information pattern η_{i,t} = {x̂_t}, given that x̂_t ∈ R^n satisfies the following filtering (or noise canceling) equation:

x̂_t := ŷ_t − ẑ_t,    (13)

where ŷ_t, ẑ_t ∈ R^n are the state estimates of differential equations (10) and (11), respectively.
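The noise-canceling equation (13) is a plain subtraction of the two state estimates. A minimal sketch, with purely illustrative vectors (the function name and values below are assumptions, not taken from the paper):

```python
import numpy as np

def filtered_estimate(y_hat: np.ndarray, z_hat: np.ndarray) -> np.ndarray:
    # noise-canceling equation (13): the perturbation-dynamics estimate
    # z_hat is subtracted from the corrupted-state estimate y_hat
    return y_hat - z_hat

# illustrative estimates from the two (hypothetical) identifiers
y_hat = np.array([1.5, -0.2])
z_hat = np.array([0.5, 0.1])
x_hat = filtered_estimate(y_hat, z_hat)
```

This is the entire filtering step; all of the difficulty lies in producing ŷ_t and ẑ_t, which is what the two differential neural networks are for.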

Differential Neural Networks Design
In order to solve the problem described above, consider a first differential neural network given by the following equation:

dŷ_t/dt = W_{1,t} σ(V_{1,t} ŷ_t) + ∑_{i=1}^{N} W_{2i,t} φ_i(V_{2i,t} ŷ_t) u_{i,t},    (14)

where t ∈ [0, ∞); i = 1, 2, 3, . . ., N denotes the number of players; ŷ_t ∈ R^n is the state vector of the neural network; the matrices W_{1,t}, V_{1,t}, W_{2i,t}, and V_{2i,t} are synaptic weights of the neural network; and σ and φ_i are activation functions of the neural network.
According to [29], the differential neural network (14) is classified as a multilayer perceptron, and its structure was initially taken from [23, 24]. Also, this differential neural network only uses sigmoid activation functions; that is, σ and φ_i have a diagonal structure whose elements are sigmoid functions shaped by known positive constants that manipulate the geometry of the sigmoid. Thus, in view of the fact that σ and φ_i are sigmoid activation functions, they are bounded and they satisfy the sector conditions (17)-(20), whose bounds are known constants of adequate dimensions.

Nevertheless, there are some design conditions that this first differential neural network needs to satisfy.

Assumption 6. According to Assumption 2, the approximation or residual error of (14) corresponding to the unidentified dynamics of (10), given by (21), where W_{1,0}, V_{1,0}, W_{2i,0}, and V_{2i,0} are initial synaptic weights (when t = 0), is bounded and satisfies the inequality (22).

Remark 7. The constants in (22) are not known a priori because they depend on the performance of the differential neural network (14); that is to say, the residual error (21) will depend on the number of neurons used in (14) and on its parameters, and therefore these constants will depend on this too (see [24]).
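The right-hand side of (14) can be sketched numerically as follows. This is a hedged illustration, not the paper's implementation: the sigmoid parameters, all weight matrices, the control actions, and the single Euler step are illustrative assumptions.

```python
import numpy as np

def sigmoid(v):
    # diagonal sigmoid activation; the geometry constants of (15)-(16)
    # are fixed to 1/0 here purely for illustration
    return 1.0 / (1.0 + np.exp(-v))

def identifier_rhs(y_hat, W1, V1, W2_list, V2_list, u_list):
    # multilayer-perceptron right-hand side of identifier (14):
    # W1*sigma(V1*y) is the drift part; each W2_i*phi_i(V2_i*y)*u_i
    # injects the known control action of player i
    dy = W1 @ sigmoid(V1 @ y_hat)
    for W2, V2, u in zip(W2_list, V2_list, u_list):
        dy += W2 @ (sigmoid(V2 @ y_hat) * u)
    return dy

# one explicit Euler integration step of the identifier (N = 2 players)
n, dt = 2, 1e-3
y_hat = np.zeros(n)
W1, V1 = -np.eye(n), np.eye(n)
W2_list = [0.1 * np.eye(n), 0.1 * np.eye(n)]
V2_list = [np.eye(n), np.eye(n)]
u_list = [np.ones(n), -np.ones(n)]  # control actions of the two players
y_hat = y_hat + dt * identifier_rhs(y_hat, W1, V1, W2_list, V2_list, u_list)
```

In the paper the weights are not fixed as above but evolve under the learning law (38); the sketch only shows how the network state would be propagated once those weights are given.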
On the other hand, consider now a second differential neural network given by the following equation:

dẑ_t/dt = W_{3,t} α(ẑ_t) + W_{4,t} γ(ẑ_t) ξ_t,    (27)

where t ∈ [0, ∞); ẑ_t ∈ R^n is the state vector; the matrices W_{3,t} and W_{4,t} are synaptic weights; and α and γ are sigmoid activation functions.
Then, similar to σ and φ_i, the activation functions α(⋅) and γ(⋅) satisfy inequalities analogous to (17)-(20), whose bounds Λ_α, Λ_β, Λ_γ, and Λ_δ are known constants of adequate dimensions. Thereby, it is easy to confirm that if the activation functions coincide and N = 1, then the differential neural networks (14) and (27) coincide. Hence, two new assumptions (corresponding to Assumptions 6 and 8) must be satisfied for this second differential neural network.
Assumption 9. According (again) to Assumption 2, the approximation or residual error of (27) corresponding to the unidentified dynamics of (11), given by (30), where W_{3,0} and W_{4,0} are initial synaptic weights (when t = 0), is bounded and satisfies the inequality (31), whose constant matrix and scalar bounds are known and positive.
Remark 10. Similar to Remark 7, the constants in (31) are not known a priori because they depend on the performance of the differential neural network (27).
Assumption 11. If there exist values of Λ_α, Λ_γ, Λ_δ, W_{3,0}, and W_{4,0} such that (32) holds, and values of the remaining bounds such that (33) holds, then they provide a solution 0 < P = P^T ∈ R^{n×n} to the algebraic Riccati equation (23), with the matrices defined in (34)-(36).

Remark 12. Notice that the equalities (32) and (33) were chosen in order to guarantee the same solution of the algebraic Riccati equation (23) in Assumptions 8 and 11. Also, notice that the first term of the right-hand side of (26) will always be greater than the first term of the right-hand side of (36).
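The second identifier (27) can be propagated in the same way as the first one, now driven by the known perturbation signal ξ_t. Again a hedged sketch under illustrative assumptions (weights, dimensions, the perturbation sample, and the diagonal form of γ are not values from the paper):

```python
import numpy as np

def sig(v):
    # sigmoid activation with unit geometry constants (illustrative)
    return 1.0 / (1.0 + np.exp(-v))

def second_identifier_rhs(z_hat, W3, W4, xi):
    # right-hand side of identifier (27): W3*alpha(z) models the drift,
    # W4*gamma(z)*xi injects the known deterministic perturbation;
    # gamma is taken as a diagonal sigmoid matrix here
    return W3 @ sig(z_hat) + W4 @ (np.diag(sig(z_hat)) @ xi)

# one explicit Euler step
n, dt = 2, 1e-3
z_hat = np.zeros(n)
W3 = -np.eye(n)
W4 = 0.2 * np.eye(n)
xi = np.array([1.0, 0.0])  # sample of the known deterministic perturbation
z_hat = z_hat + dt * second_identifier_rhs(z_hat, W3, W4, xi)
```

As with the first network, the paper adapts W_{3,t} and W_{4,t} online via the learning law (59); fixed weights are used above only to make the propagation step concrete.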

Main Result on Identification and Filtering
According to the above, the main result on identification and filtering for the class of nonlinear differential games (1)-(4) deals with both the development of an adaptive learning law for the synaptic weights of the differential neural networks (14) and (27) and the inference of a maximum value of identification error for the dynamics (10) and (11). Moreover, the establishment of a maximum value of filtering error between the uncorrupted states and the identified ones is obtained; namely, an error is defined by

Δ̄_t := x̂_t − x̄_t,    (37)

where x̂_t ∈ R^n is given by the noise canceling equation (13) and x̄_t ∈ R^n is given by the expected or uncorrupted state vector (12). More formally, the main result is described in the following three theorems.

Theorem 13. Let the class of continuous-time nonlinear dynamic games (1)-(4) be such that Assumptions 1 to 4 are fulfilled. Also, let the differential neural network (14) be such that Assumptions 6 and 8 are satisfied. If the synaptic weights of (14) are adjusted with the following learning law:
given in (38), where Δ_t denotes the identification error defined in (39), then it is possible to obtain the maximum value of identification error in average sense described in (40).

Proof. Taking into account the residual error (21), the differential equation (10) can be expressed as (41). Then, by substituting (14) and (41) into the derivative of (39) with respect to t and by adding and subtracting the terms involving the initial synaptic weights, one obtains (42), where W̃_{1,t} := W_{1,t} − W_{1,0} and W̃_{2,t} := W_{2,t} − W_{2,0}. Now, let the Lyapunov (energetic) candidate function (43) be such that the inequalities (8) are fulfilled (see Assumption 3), and let (44) be the derivative of (43) with respect to t. Then, by substituting (42) into the second term of the right-hand side of (44) and by adding and subtracting a quadratic term in Δ_t, one may get (45). Next, the first five terms of the right-hand side of (45) are analyzed with the following inequality:

X^T Y + Y^T X ≤ X^T Λ X + Y^T Λ^{-1} Y,    (46)

which is valid for any pair of matrices X, Y ∈ R^{Ω×Γ} and for any constant matrix 0 < Λ ∈ R^{Ω×Ω}, where Ω and Γ are positive integers (see [24]). The following is then obtained.
(i) Using (46) and (17) in the first term of the right-hand side of (45), the bound (47) follows.
(ii) Substituting (18) into the second term of the right-hand side of (45) and using (46) and (19), the bound (48) follows.
(iii) Using (46) and (17) in the third term of the right-hand side of (45), the bound (49) follows.
(iv) Substituting (18) into the fourth term of the right-hand side of (45) and using (46) and (19), the bound (50) follows.
(v) Using (46) and (22) in the fifth term of the right-hand side of (45), the bound (51) follows.
These bounds yield (52), where the algebraic Riccati equation in the first term of the right-hand side of (52) is described in (23) and in (24)-(26), and the remaining terms are collected in (53). Thereby, by equating (53) to zero and by, respectively, solving for Ẇ_{1,t}, V̇_{1,t}, Ẇ_{2,t}, and V̇_{2,t}, the learning law given by (38) is obtained. Now, by making the choice (55) and by solving the algebraic Riccati equation (23), one may get (56). Thus, by integrating both sides of (56) on the time interval [0, T], (57) is obtained, and by dividing (57) by T, it is easy to verify (58). Finally, by calculating the upper limit as T → ∞, the maximum value of identification error in average sense is the one described in (40). This means that the identification error is bounded between zero and (40), and, therefore, Δ_t is stable in the sense of Lyapunov.
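The quantity bounded at the end of the proof, the identification error "in average sense", is lim sup_{T→∞} of the time average of the squared error. In a simulation one monitors the running time average over a finite horizon. A minimal sketch, with a synthetic decaying error trajectory (purely illustrative, not data from the paper):

```python
import numpy as np

def average_squared_error(delta_traj, dt, T):
    # Riemann-sum approximation of (1/T) * integral_0^T ||Delta_t||^2 dt
    return np.sum(np.sum(delta_traj ** 2, axis=1)) * dt / T

T, dt = 25.0, 1e-2
t = np.arange(0.0, T, dt)
# synthetic error whose squared norm is e^{-2t}, mimicking a decaying
# identification error over the 25-second horizon used in Example 16
delta = np.stack([np.exp(-t), np.zeros_like(t)], axis=1)
avg = average_squared_error(delta, dt, T)
```

For this synthetic signal the exact average is (1 − e^{−50})/50 ≈ 0.02; the theorems assert that, for the real identifiers, such averages stay below the bounds (40) and (61) as T grows.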
Theorem 14. Let the class of continuous-time nonlinear dynamic games (1)-(4) be such that Assumptions 1 to 4 are fulfilled. Also, let the differential neural network (27) be such that Assumptions 9 and 11 are satisfied. If the synaptic weights of (27) are adjusted with the learning law (59), where (60) denotes the identification error and the two gain matrices in (59) are known symmetric and positive-definite constant matrices, then it is possible to obtain the maximum value of identification error in average sense described in (61).

Proof. Taking into account the residual error (30), the differential equation (11) can be expressed as (62). Then, by substituting (27) and (62) into the derivative of (60) with respect to t, and by adding and subtracting the terms W_{3,0}α(ẑ_t) and W_{4,0}γ(ẑ_t)ξ_t, it is easy to confirm (63), where W̃_{3,t} := W_{3,t} − W_{3,0} and W̃_{4,t} := W_{4,t} − W_{4,0}. Now, let the Lyapunov (energetic) candidate function (64) be such that the inequalities (8) are fulfilled (see Assumption 3), and let (65) be the derivative of (64) with respect to t. Then, by substituting (63) into the second term of the right-hand side of (65) and by adding and subtracting a quadratic term in the identification error, one may get (66). Next, by analyzing the first three terms of the right-hand side of (66) with the inequality (46), the following is obtained.
(i) Using (46) and (28) in the first term of the right-hand side of (66), the bound (67) follows.
(ii) Using (46) and (28) in the second term of the right-hand side of (66), the bound (68) follows.
(iii) Using (46) and (31) in the third term of the right-hand side of (66), the bound (69) follows.
So, by substituting (67)-(69) into the right-hand side of (66) and by adding and subtracting an additional quadratic term, inequality (65) can be expressed as (70), where the algebraic Riccati equation in the first term of the right-hand side of (70) is described in (23) and in (34)-(36), and the remaining terms are collected in (71) and (72). Thereby, by equating (71) and (72) to zero (Ψ_3 = Ψ_4 = 0) and by, respectively, solving for Ẇ_{3,t} and Ẇ_{4,t}, the learning law given by (59) is obtained. Now, by making the choice (73) and by solving the algebraic Riccati equation (23), one may get (74). Thus, by integrating both sides of (74) on the time interval [0, T], (75) is obtained, and by dividing (75) by T, it is easy to verify (76). Finally, by calculating the upper limit as T → ∞, the maximum value of identification error in average sense is the one described in (61).
Theorem 15. Let the class of continuous-time nonlinear dynamic games (1)-(4) be such that Assumptions 1 to 4 are fulfilled. Also, let the differential neural networks (14) and (27) be such that Assumptions 6 and 8 and Assumptions 9 and 11 are satisfied. If the synaptic weights of (14) and (27) are, respectively, adjusted with the learning laws (38) and (59), then it is possible to obtain the maximum value of filtering error in average sense described in (77)-(78), where the filtering error Δ̄_t is given by (37).
Proof. Consider the Lyapunov (energetic) candidate function (79), where P is the solution of the algebraic Riccati equation (23) with the matrices defined by (24)-(26) or (34)-(36). Then, by calculating the derivative of (79) with respect to t and by considering equations (37), (13), and (12), it can be verified that (80) holds, and, taking into account (42), (63), and the proofs of Theorems 13 and 14, one may get (81). Thus, by integrating both sides of (81) on the time interval [0, T], (82) is obtained, and by dividing (82) by T, it is easy to confirm (83). By calculating the upper limit as T → ∞, equation (83) can be expressed as (84), and, finally, it is clear that (i) if the first residual bound exceeds the second, the maximum value of filtering error in average sense is the one described in (77); (ii) otherwise, the maximum value of filtering error in average sense is the one described in (78), given that the left-hand side of the inequality (84) cannot be negative.
Hence, the theorem is proved.

Illustrative Example
Example 16. Consider a 2-player nonlinear differential game given by (85), subject to the change of variables (10) and (11) (N = 2), where the control actions of each player are given by (86), the undesired deterministic perturbations are given by (87), and H(t) denotes the Heaviside step function or unit step signal. Then, under Assumptions 1-4, 6, 8, 9, and 11, it is possible to obtain a filtered feedback (perfect state) information pattern η_{i,t} = {x̂_t}.
Remark 17. As mentioned before, the functions f(⋅) and g_i(⋅) as well as the matrix Γ are unknown, but in order to construct an instance of (85) and its simulation, the values of these "unknown" parameters are shown in the Appendix of this paper.
Thereby, consider now a first differential neural network of the form (14), denoted by (88). By proposing the values described in Assumption 8 as in (89)-(91) and by choosing the gain parameters as 100 and the remaining values in (92), the solution of the algebraic Riccati equation (23) results in the matrix (92). Then, by applying the learning law (38) described in Theorem 13, the value of the identification error in average sense (40) on a time period of 25 seconds is 0.03145, as stated in (93).

Remark 18. Even though 0.03145 is the value obtained on the time interval [0, 25], it is not the global minimum value that the bound can take, since t does not tend to infinity. So, in order to find this global minimum value, it would be required to simulate an arbitrarily large time.
On the other hand, let a second differential neural network given by

dẑ_t/dt = W_{3,t} α(ẑ_t) + W_{4,t} γ(ẑ_t) ξ_t    (94)

be such that the equalities (32) and (33) are fulfilled; that is, the parameter choices (95)-(96) hold. By choosing the two gain parameters as 100000 and 1, the solution of the algebraic Riccati equation (23) results in the matrix (92), and by applying the learning law (59) described in Theorem 14, the value of the identification error in average sense (61) on a time period of 25 seconds is the one given in (97).

Remark 19. As in Remark 18, in order to find the global minimum value of this error bound, it would be required to simulate an arbitrarily large time.
Finally, taking into account the identification errors (93) and (97), the value of the filtering error in average sense (77)-(78) on a time period of 25 seconds is the one given in (98). The simulation of this example was made using the MATLAB and Simulink platforms, and its results are shown in Figures 1-4.
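The overall filtering mechanism of the example can be sketched end to end under strong simplifying assumptions: both identifiers are taken to be exact (ŷ_t = y_t, ẑ_t = z_t), so the sketch only illustrates how the subtraction (13) recovers the uncorrupted state. The dynamics, Γ, and the step perturbation below are illustrative, not the (unknown) values of the paper.

```python
import numpy as np

def heaviside(t):
    # Heaviside step function H(t), as used in the perturbations (87)
    return 1.0 if t >= 0.0 else 0.0

dt, steps = 1e-3, 5000                      # 5-second horizon
x = np.array([0.5, -0.5])                   # uncorrupted state
z = np.zeros(2)                             # perturbation component, z' = Gamma*xi
Gamma = np.array([[0.3, 0.0], [0.0, 0.2]])  # illustrative perturbation gain
errs = []
for k in range(steps):
    t = k * dt
    xi = np.array([heaviside(t - 1.0), np.sin(t)])  # illustrative known perturbations
    x = x + dt * (-x + 0.1 * np.tanh(x))            # illustrative game dynamics
    z = z + dt * (Gamma @ xi)
    y = x + z                                       # corrupted state
    x_hat = y - z                                   # noise-canceling equation (13)
    errs.append(np.linalg.norm(x_hat - x))
```

With exact identifiers the residual error stays at floating-point level; in the paper the estimates ŷ_t and ẑ_t carry the residual errors (21) and (30), which is why the filtering error is bounded only in the average sense of Theorem 15.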
As is seen in Figure 1, the differential neural network (88) can perform the identification of the continuous-time nonlinear dynamic game (85), where y_{1,t} and y_{2,t} denote the state variables with undesired perturbations and ŷ_{1,t} and ŷ_{2,t} indicate their state estimates.
On the other hand, according to Figure 2, the differential neural network (94) identifies the dynamics of the additive deterministic noises or, in other words, the dynamics of ż_t = Γξ_t. Thus, z_{1,t} and z_{2,t} represent the state variables of the above differential equation and ẑ_{1,t} and ẑ_{2,t} are their state estimates.
In this way, Figure 3 shows the performance of the filtering process (13) in the nonlinear differential game (85); that is to say, it shows the comparison between the expected or uncorrupted state variables x̄_{1,t} and x̄_{2,t} and the filtered state estimates x̂_{1,t} and x̂_{2,t}. Finally, Figure 4 also exhibits the described filtering process but compares the real state variables x_{1,t} and x_{2,t} with the filtered state estimates x̂_{1,t} and x̂_{2,t}.

Results Analysis and Discussion
Although the differential neural networks (14) and (27) can perform the identification of the differential equations (10) and (11), it is important to remember that their performance depends on the number of neurons used and on the choice of all the constant values (or free parameters) described in Assumptions 8 and 11. Therefore, the values of the error bounds (at a fixed time) can change according to this fact.
In other words, due to the fact that the differential neural networks are an approximation of a dynamic system (or game), there will always be a residual error that depends on this approximation. Thus, in the particular case shown in Example 16, the design of (88) was made using only eight neurons: four for the layer of perceptrons without any relation to the players and two for the layer of perceptrons of each player. Similarly, for the design of (94) (which is subject to the equalities (32)-(33)), six neurons were used.
On the other hand, it is important to mention that (14) and (27) will operate properly only if Assumptions 1-4, 6, 8, 9, and 11 are satisfied; that is to say, there is no guarantee that these differential neural networks perform good identification and filtering if, for example, the class of nonlinear differential games (1)-(4) has a stochastic nature.
Finally, although this paper presents a new approach for identification and filtering of nonlinear dynamic games, it should be emphasized that there exist other techniques that might solve the problem treated here, for example, those in the publications cited in Section 1.

Conclusions and Future Work
According to the results of this paper, the differential neural networks (14) and (27) solve the problem of identifying and filtering the class of nonlinear differential games (1)-(4).
Thereby, the proposed learning laws (38) and (59) guarantee the maximum values of identification error in average sense (40) and (61) and, consequently, the maximum value of filtering error (77)-(78).
Nevertheless, these errors depend on both the number of neurons used in the differential neural networks (14) and (27) and the values proposed in their design conditions.
On the other hand, the simulation results of the illustrative example show the effectiveness of (14) and (27) and verify the applicability of Theorems 13, 14, and 15.
Finally, speaking of future work in this research field, one can analyze and discuss the use of (14) and (27) for obtaining equilibrium solutions in the class of nonlinear differential games (1)-(4), that is to say, the use of differential neural networks for controlling nonlinear differential games.

Figure 3: Comparison between the uncorrupted states x̄_t and the filtered estimates x̂_t for the 2-player nonlinear differential game (85).

Figure 4: Comparison between the real states x_t and the filtered estimates x̂_t for the 2-player nonlinear differential game (85).