Neural Net Gains Estimation Based on an Equivalent Model

1 Centro de Investigación en Computación, Instituto Politécnico Nacional (CIC-IPN), Avenida Juan de Dios Bátiz, Esq. Miguel Othón de Mendizábal, Col. Nueva Industrial Vallejo, Delegación Gustavo A. Madero, 07738 Ciudad de México, DF, Mexico
2 Centro de Investigación en Ciencia Aplicada y Tecnología Avanzada U. Legaria, Instituto Politécnico Nacional (CICATA-Legaria-IPN), Calzada Legaria 649, Col. Irrigación, Delegación Miguel Hidalgo, 11500 Ciudad de México, DF, Mexico


Introduction
There are different techniques for modelling a system and identifying the characteristics that make the model a good approximation. When the physical behaviour of the real process is considered, these models usually lead to complex nonlinear systems that are difficult to analyse. They are therefore simplified through relations between their input and output signals, obtaining a Black-Box (BB) description [1].
A system viewed as a BB gives no access to its internal properties but only to its input and output responses, regardless of the internal parameters that evolve dynamically. However, in many cases human experience gives the original answer or selection (a fair process) of the new parameters through intuition, expressed as if-then (fuzzy logic) inferences. Such experience provides tolerable approximations in combination with other theories, for example, Lyapunov, Sliding Modes, or Intelligent Systems.
Artificial Neural Nets (ANN), viewed as mathematical models inspired by the operation of biological neurons and applied to complex systems [2,3], generate a fast answer. Nevertheless, computing devices differ from the human brain: their methods are faster because they rely on robust programming instead of chemical reactions under natural conditions. In addition, the human brain is capable of self-programming and adaptation without requiring new programming code, adjusting its required energy levels based on natural instincts and using intuition as a powerful tool.
An artificial neural model (Figure 1), based on the biological neuron principle and implemented as a computational model, is helpful in prediction and classification problems, pattern recognition, signal processing, estimation, and control [4]. When more than one neuron is connected, we obtain a MISO (Multiple Inputs and a Single Output) neural net; similar to human adaptability, the model adjusts its functional parameters through learning processes to produce different outputs depending on the stimuli. When analysing a system, its characteristics help us determine the best method according to our requirements, improving the convergence rate. In this paper, we compare a hybrid model and an analytical model. The first combines a fuzzy estimation with the Kalman filter description; the second is optimal with systematic evaluation, considering the expected value obtained from the previous information. Both are based on an Equivalent Artificial Neural Net (EANN).

Equivalent Neural Net
In the biological sense, for neural-specific tasks, the inputs of the neuron cell must satisfy adequate properties to generate a firing action in the neuron soma and obtain an accurate output at the neuron axon; otherwise, the biological system operates with the minimal energy required to avoid losing connection with other neurons.
In an ANN, we identify three principal sections: input, hidden, and output layers. The input layer is the interaction between the input signals and the first block of weights; its result becomes the input to the following layer. At the very first stage, the designer selects the weights intuitively and adjusts the subsequent ones while looking for the desired response [5]. The hidden layers have sets of inputs and outputs at different stages, related through the weights. The output layer represents the convolution or binary sum of the last block of weights and its respective data. As an illustration, Figure 2 depicts a typical ANN with an input layer, two hidden layers, and an output layer.
The EANN is a simplified representation of an ANN whose task is to obtain the parameter vector that allows the system to reach the desired reference signal, without paying initial attention to the internal layers and focusing on the estimation procedure. It considers the total input-output signal relation, trying to reduce unnecessary delays while always keeping the weight interconnection form, to achieve the desired response.
As shown in Figure 2, the input data are denoted by the set $\{x_i : i = \overline{1,n},\ n \in \mathbb{Z}^{+},\ x_i \in \mathbb{R}\}$ and the output data by $\{y \in \mathbb{R}\}$, where $n$ represents the number of input elements. In addition, the weights or parameters are considered as $\{a_j^{l} : j = \overline{1,m},\ j, m \in \mathbb{Z}^{+}\}_{l=\overline{1,N}}$, where $j$ indicates the specific parameter number in a layer $l$. The admitted layers are within a set of functions $\{\{f_{i,j}^{l},\ i = \overline{1,m},\ m \in \mathbb{Z}^{+}\}_{j=\overline{1,m}}\}_{l=\overline{1,N}}$, interconnecting directionally from an original parameter $i$ in layer $l$ to a target parameter $j$ in layer $(l+1)$. Each weight from the input and output layers requires an activation function, and all the hidden layers have proper activation functions connected to other weights to achieve different and specific requirements for each output stage. This description corresponds to Figure 3, where the traditional ANN connections are replaced by simple flow-diagram lines containing activation functions described as $\{\{f_j^{l},\ j = \overline{1,m};\ j, m \in \mathbb{Z}^{+}\}_{l=\overline{1,N}}\}$, where $j$ represents the function number and $l$ represents the layer this function leaves.
Figure 3 presents the activation functions from the first hidden layer (layer I) operating with an accumulative energy convolved with an input, in agreement with (1). A set of pairs, one per activation function, represents the activation limits for the functions of layer I. These limits denote the minimum and maximum energy required to excite a neuron for a specific weight of layer I, known as fire limits. In the following hidden layer (II), the activation functions require that the set of inputs satisfy the same requirements considered for the previous results, as stated in (2). In (2), the binary operator " * " represents the composition of the involved terms without indicating a particular operation.
The equivalent weights sequence allows each input to include the structure of the previous parameters in the final description. Each layer takes part in the following activation function due to the interaction between the new weights and the previously composed output signal. Figure 4 shows the EANN model in simplified form.
According to [6], each neuron output has a function whose parameters are the inputs and weights for the following layer. Equation (3) expresses, in a recursive form, the influence of the previously mentioned parameters, weights, and inputs on the following layer, taking the weighted sum of the inputs, $\sum_{i=1}^{n} a_i x_i$, as the core neuron function, together with a proportional constant adjusting the previous layer; therefore, the model converges to the neural net development. At the final layer, the nested composition of the layer functions applied to the input represents the neural net response. For computational applications, this response has the effect of an activation function, usually the sigmoid function.
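As a minimal sketch of the recursive composition described above, the following Python fragment propagates an input vector through successive layers, each computing a weighted sum of its inputs followed by a sigmoid activation. The function names (`sigmoid`, `layer_output`, `forward`) and the weight values are illustrative assumptions, not taken from the paper; the proportional constant and the fire limits are not modelled.

```python
import math

def sigmoid(s: float) -> float:
    """Logistic activation, a common choice for the output stage."""
    return 1.0 / (1.0 + math.exp(-s))

def layer_output(weights: list[list[float]], inputs: list[float]) -> list[float]:
    """One layer: each neuron applies the sigmoid to its weighted input sum."""
    return [sigmoid(sum(a * x for a, x in zip(row, inputs))) for row in weights]

def forward(layers: list[list[list[float]]], inputs: list[float]) -> list[float]:
    """Compose the layers recursively: each output feeds the next layer."""
    signal = inputs
    for weights in layers:
        signal = layer_output(weights, signal)
    return signal

# Hypothetical MISO net: 2 inputs -> 2 hidden neurons -> 1 output.
layers = [
    [[0.5, -0.3], [0.8, 0.1]],  # hidden-layer weights (illustrative values)
    [[1.0, -1.0]],              # output-layer weights (illustrative values)
]
y = forward(layers, [1.0, 2.0])
```

Because every layer only consumes the output of the previous one, the whole chain can be read as a single input-output map, which is the view the equivalent model takes.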
A sophisticated ANN can integrate more than one EANN, since the EANN description allows using its recursive characteristics. In addition, implementing EANNs makes it possible to restrict the number of iterations necessary to reach a reference, which is a remarkable feature in systems where time delays are considerable.

Equivalent Neural Net Using ARMA Description
An ARMA(1, 1) (Autoregressive Moving Average) model is a tool for obtaining the parameter matrix of a reference system viewed as a MISO BB; its primary structure is specified by (4) and (5), indexed by the time evolution. This model has observable and internal states, an input signal, gains, and an internal gain. The measurable state (6), in explicit form, is a function of its immediate past, the internal gain, and the inputs.
In [7], the internal state is described using the traditional Kalman filter (KF), even though the internal gain and the filter gain are still unknown. The complexity of the filter increases because, after the identification, the internal gain depends on the error, which is applied in (4) to obtain the approximation of the observable signal in (5), represented in discrete form in (7). By applying (7) in (4), including the still unknown internal state, we obtain (8). The internal state from (6) allows obtaining, in (9), the internal lagged state as a function of the measurable state and the output perturbations. Considering (9) in (8), we determine the output in (10). Equation (11) represents a recursive form of (10), describing the reference system with an innovation process.
In agreement with [8], the lagged gain, with (12), satisfies an approximation in which its product with the innovation δŵ is close to the difference between ŷ and Â applied to the lagged signal. The hybrid filter (13) considers the fuzzy parameter estimation, the gain description, and the lagged signal. With the innovation process and the reference system bounded by the same general Membership Function (MF) [8,9], it is possible to estimate the explicit matrix parameters and the gain using the inference mechanisms, considering the functional results and the noise properties, respectively.
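To make the black-box view concrete, the sketch below simulates a scalar first-order system and recovers its internal gain from the observable output alone. The model form, x_{k+1} = a x_k + b w_k with y_k = x_k, is an assumption made for illustration, since the paper's equations (4) to (6) are not reproduced here. The least-squares ratio is a simple stand-in estimator, not the Kalman or fuzzy mechanism of the hybrid filter, and all names and values are hypothetical.

```python
import random

def simulate(a: float, b: float, steps: int, seed: int = 0) -> list[float]:
    """Simulate x_{k+1} = a*x_k + b*w_k and observe y_k = x_k (assumed form)."""
    rng = random.Random(seed)
    x, outputs = 0.0, []
    for _ in range(steps):
        outputs.append(x)
        x = a * x + b * rng.gauss(0.0, 1.0)  # w_k: zero-mean Gaussian noise
    return outputs

def estimate_internal_gain(y: list[float]) -> float:
    """Least-squares estimate of a from pairs of consecutive outputs."""
    num = sum(y[k] * y[k - 1] for k in range(1, len(y)))
    den = sum(y[k - 1] ** 2 for k in range(1, len(y)))
    return num / den

y = simulate(a=0.7, b=1.0, steps=5000)
a_hat = estimate_internal_gain(y)  # expected to land close to 0.7
```

Only the output sequence enters the estimator, which mirrors the idea that the internal gain must be inferred from measurable signals.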

Fuzzy Gains and Estimation Properties
In the fuzzy sense, [10] presented the parameters obtained by a controller considering a fuzzy function vector for nonlinear systems. The MIMO system was first given a linear representation formed by a collection of MISO systems with the same inputs, reducing and simplifying its analysis.
In [11], the hybrid combination required the identification filter to adjust its parameters automatically using fuzzy logic. This adjustment needs the selection of the best values with respect to the inference, minimizing the convergence error by using heuristic techniques or the Least Squares Method (LSM).
The first step in the fuzzy estimation determines the reasoning levels in accordance with the proposed MF, identified through the statistical properties of the reference signal. Triangular, sinusoidal, impulsive, or Gaussian functions, among others, can define the ranges contained in the reference signal classification.
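The membership functions mentioned above can be sketched as follows. The triangular and Gaussian shapes are standard definitions; the three level names and their breakpoints are hypothetical values chosen for illustration, whereas in practice they would come from the statistical properties of the reference signal.

```python
import math

def triangular(v: float, left: float, peak: float, right: float) -> float:
    """Triangular MF: 0 outside (left, right), 1 at the peak."""
    if v <= left or v >= right:
        return 0.0
    if v <= peak:
        return (v - left) / (peak - left)
    return (right - v) / (right - peak)

def gaussian(v: float, mean: float, sigma: float) -> float:
    """Gaussian MF centred at mean with spread sigma."""
    return math.exp(-0.5 * ((v - mean) / sigma) ** 2)

def classify(v: float, levels: dict) -> str:
    """Return the level whose triangular MF gives the highest degree."""
    return max(levels, key=lambda name: triangular(v, *levels[name]))

# Hypothetical reasoning levels for a signal normalised to [-1, 1].
levels = {
    "low": (-1.5, -1.0, 0.0),
    "medium": (-1.0, 0.0, 1.0),
    "high": (0.0, 1.0, 1.5),
}
level = classify(0.8, levels)
```

For the value 0.8 the "high" level has degree 0.8 and "medium" has degree 0.2, so the signal sample is assigned to "high".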
A set of fuzzy (if-then) rules forms a Fuzzy Rule Base (FRB) that interprets which requirements and process conditions are needed. Beforehand, it is necessary to select and introduce the best values into the Knowledge Base (KB) according to the MF, updating the parameters according to the reference model, limited by the filter error criteria.
The hybrid filtering process is updated by using the fuzzy logic connectors in the fuzzy stage, considering the desired signal and the region level with respect to the functional error, reducing the inference operational levels and indicators in the MF, and selecting the values of the parameters Â from the KB. Each fuzzy filtering rule finds specific matrix parameters in each evolution [9,12].
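A rule base of this kind can be sketched as a lookup keyed by the fuzzy levels of the reference signal and of the functional error. The sketch below assumes the "and" connector is the minimum of the two membership degrees and that the rule with the largest firing strength selects the parameter, with no defuzzification step. The level names, degrees, and KB values are hypothetical.

```python
def select_parameter(mu_reference: dict, mu_error: dict, knowledge_base: dict):
    """Return the KB value of the rule with the largest firing strength.

    Each rule reads "if reference is R and error is E then use KB[(R, E)]";
    the "and" connector is taken as the minimum of the two degrees.
    """
    best_value, best_strength = None, -1.0
    for (ref_level, err_level), value in knowledge_base.items():
        strength = min(mu_reference.get(ref_level, 0.0),
                       mu_error.get(err_level, 0.0))
        if strength > best_strength:
            best_value, best_strength = value, strength
    return best_value, best_strength

# Hypothetical knowledge base: (reference level, error level) -> parameter.
kb = {
    ("low", "small"): 0.2,
    ("low", "large"): 0.4,
    ("high", "small"): 0.6,
    ("high", "large"): 0.9,
}
mu_y = {"low": 0.3, "high": 0.7}      # degrees for the reference signal
mu_j = {"small": 0.8, "large": 0.2}   # degrees for the functional error
a_selected, strength = select_parameter(mu_y, mu_j, kb)
```

Here the rule ("high", "small") fires with strength min(0.7, 0.8) = 0.7, so the parameter 0.6 is selected.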
In the same sense, the hybrid filter considers the basic principles of a conventional Kalman digital filter using the Mean Least Square Criterion (MLSC), described as $J_k = \langle e_k, e_k^{T} \rangle^{1/2}$ and, in agreement with [5], expressed in its recursive form (14). According to [9], (14) presents the adequate element describing the optimal matrix parameters.
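One way to evaluate such a criterion recursively is sketched below. It assumes that the inner product is interpreted as the sample second moment of a scalar identification error, updated as a running mean with weights (k-1)/k and 1/k; this is an illustrative reading, not necessarily the exact form of (14).

```python
import math

def functional_error(errors: list[float]) -> list[float]:
    """Running root-mean-square of the error, updated sample by sample."""
    mean_sq = 0.0
    history = []
    for k, e in enumerate(errors, start=1):
        # Recursive mean of e^2: no past samples need to be stored.
        mean_sq = ((k - 1) / k) * mean_sq + (e * e) / k
        history.append(math.sqrt(mean_sq))
    return history

history = functional_error([3.0, 4.0])
```

For the errors 3 and 4 the second value is sqrt((9 + 16) / 2) = sqrt(12.5), the same result a batch computation would give.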

Optimum Coefficient
For an ANN, determining an optimal coefficient vector requires minimizing the error, with the primary objective that it converges to zero. One inconvenience is how long this optimal convergence takes. A control for a recurrent NN, described in [13], became optimal by adding an extra coefficient to compensate for the error within a small bound, in an unknown necessary learning time.
Considering the fact that the last stage of a hybrid filter corresponds to the equivalent neural net from Figure 3, it is possible to determine the optimum parameters for the neural weights obtaining the best output approximation to the reference signal by an analytical process.
Based on BB concepts, the input signals $\{x_{i,k}\}_{i=\overline{1,n}}$, represented by the matrix $X_{k\,[n \times 1]}$, and the output of the system $y_k$ are the known quantities. In this sense, we need a synthesis process to calculate the values of the matrix $A_{[1 \times n]}$ representing the weights in the neural layer.
Having $y_k = A X_k$ as an ARMA model, and considering that the process has stochastic properties, we use the mathematical expectation in the probabilistic sense to obtain information about the process, so that $\hat{A} := E\{y_k X_k^{T}\}\,[E\{X_k X_k^{T}\}]^{+}$, where the symbols $T$ and $+$ represent the transpose and pseudoinverse operators, respectively.
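The expected-value estimate above can be illustrated with sample averages. The sketch below assumes a noise-free linear map from an input vector to a scalar output, replaces the expectations by means over the samples, and uses an ordinary linear solve in place of the pseudoinverse, which is valid only when the input second-moment matrix is nonsingular. All names and the weight values are hypothetical.

```python
import random

def solve(matrix: list[list[float]], rhs: list[float]) -> list[float]:
    """Solve matrix @ z = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def estimate_weights(inputs: list[list[float]], outputs: list[float]) -> list[float]:
    """Sample version of A_hat = E{y x^T} [E{x x^T}]^(-1)."""
    n, count = len(inputs[0]), len(inputs)
    r_xx = [[sum(x[i] * x[j] for x in inputs) / count for j in range(n)]
            for i in range(n)]
    r_yx = [sum(y * x[i] for x, y in zip(inputs, outputs)) / count
            for i in range(n)]
    # r_xx is symmetric, so A r_xx = r_yx is solved as r_xx A^T = r_yx^T.
    return solve(r_xx, r_yx)

rng = random.Random(1)
true_weights = [2.0, -1.0, 0.5]  # illustrative reference parameters
inputs = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
outputs = [sum(a * x for a, x in zip(true_weights, xs)) for xs in inputs]
a_hat = estimate_weights(inputs, outputs)
```

With noise-free data the estimate matches the reference weights up to rounding; with noisy data it would converge to them only on average.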
If $y_k$ is the reference signal that helps us obtain the parameters, then we apply these values to find the output $y_{\mathrm{iden},k}$; their comparison gives the identification error $e_k := y_k - y_{\mathrm{iden},k}$ and its functional error $J_k := \langle e_k, e_k^{T} \rangle$, which tends to zero because the values are considered optimal.
To demonstrate this, from Figure 4, the output is observed as in (15). In addition, seeing the reference or target signal as defined in (16), we have the form (17). The input is considered stochastic, formed in the distribution sense by components with finite variance $\sigma^{2} < \infty$; the parameters are represented by $A$ and the output signal by $y_k$. The BB system scheme then allows estimating the parameter set through its time evolution in a probabilistic sense, as in (18). Because the weights are constant for an instant of execution time, and considering the properties of the mathematical expectation, it is possible to obtain the matrix estimation, denoted Â, as in (19). For a discrete system (19) with countably infinite elements, the mathematical expectation takes the form (20). By replacing $\hat{A}_{[1 \times n]}$ in (17), we obtain a new output state, which we call identification and denote $y_{\mathrm{iden},k}$; it represents the output including the effects of the estimated weight values.
The difference between the identified signal and the reference signal gives the identification error (21). In order to express (20) recursively, its first and second terms are replaced by two auxiliary terms, defined in (22) and (23), respectively. Considering (22) and (23) in (20), (24) and its delayed form (25) are obtained for stable conditions. Developing (22) in a recursive manner gives (26), and considering stationary conditions, the delayed form of (22) is (27). Rewriting (26) in terms of (27), we have (28), whose block diagram representation is shown in Figure 5. Expanding (28) and ordering with respect to the delayed term, we have (29). Now, applying (29) in (24), we have the estimation (30). Remembering that (25) in stationary conditions is the delayed estimation and applying it in (30) yields (31). Using (31) in (24), we obtain the parameter vector in recursive form (32), in which the auxiliary terms are weighted by $(k-1)/k$ and $1/k$. The block diagram representing the parameter Â using (31) is in Figure 6.
As (31) includes (23) in its description, it is necessary to build its recursive form in a way similar to the derivation of (28); we then have (33), whose block diagram representation is shown in Figure 7. Finally, replacing (32) in (17), the identified output is given by (34). Figure 8 represents the interaction between the inputs and the resulting error, which has better convergence due to the null error, determined for an instant by the best parameter values.
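The recursive idea can be sketched for a single input. The fragment below assumes that the two expectations are replaced by running means updated with weights (k-1)/k and 1/k, and that the parameter estimate is their ratio at each step; it is a scalar illustration under those assumptions, not a reproduction of (32) and (33). The names and sample values are hypothetical.

```python
def recursive_estimate(inputs: list[float], outputs: list[float]) -> list[float]:
    """Update running means of y*x and x*x, returning the estimate per step."""
    p = q = 0.0
    estimates = []
    for k, (x, y) in enumerate(zip(inputs, outputs), start=1):
        p = ((k - 1) / k) * p + (y * x) / k  # running mean of y_k * x_k
        q = ((k - 1) / k) * q + (x * x) / k  # running mean of x_k * x_k
        estimates.append(p / q if q > 0.0 else 0.0)
    return estimates

xs = [1.0, -2.0, 0.5, 3.0]
ys = [1.5 * x for x in xs]  # noise-free reference with weight 1.5
estimates = recursive_estimate(xs, ys)
```

Each step needs only the previous means and the newest sample, so no feedback from the functional error is required.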

Hybrid Mechanism of Inference
Figure 9 provides the block diagram of a hybrid filter that combines fuzzy inferences with the EANN ARMA model description, instead of the logical block from Figure 8, to determine the adequate matrix parameters. The reference model considered is a BB giving the reference signal. The distribution curve of this signal denotes the intervals where the MFs must lie; then, the degree of membership obtained by Mamdani fuzzy inference gives access to the Knowledge Base (KB), determining the parameters of the model, achieving convergence, and minimizing the error in a distribution sense.

Results
The simulation compares both methods, giving a better idea of how they approximate the reference signal. The reference model output considered nonstationary conditions, noise sequences bounded by a distribution function, and, on average, constant expected value and variance. The signal is periodic with smooth random perturbations.
The first part of the simulation considered the hybrid filter, applying inferences to obtain (13) as the identified output signal. Figure 10 shows the fuzzy inference process, where it is possible to identify the functional error given by (14), useful for estimating the coefficients of the ANN. The distribution curves defined the MFs, which have different operational levels represented through three and seven MFs, corresponding to the reference signal and the functional error, respectively.

These MFs result from the associated inference mechanisms that select the parameters Â and the gain through the MFs and the KBs, affecting the final identified output ŷ. As an example, Figure 11 presents a three-dimensional KB integrated by three sets: gain, reference signal, and functional error. This KB helps us determine the gain through the reference and operational error, considering our expertise. The KB for Â has a similar structure.
The analytical method uses the block diagram presented in Figure 8. It has a delay in execution time due to the time-state operations, but it involves fewer process stages because it does not require feedback from the functional error.
Our objective was to determine the internal parameters; Figure 12 compares the reference signal parameters with those estimated by both methods. The polar representation shows the components of the parameters, and none of them leaves the unit circle.
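A check of this kind can be sketched in a few lines. The parameter values below are illustrative only and are not the estimates reported in Figure 12; the helper names are hypothetical.

```python
import cmath

def inside_unit_circle(parameters) -> bool:
    """True when every parameter has magnitude strictly below one."""
    return all(abs(p) < 1.0 for p in parameters)

def to_polar(parameters) -> list:
    """Convert each parameter to (magnitude, angle in radians)."""
    return [cmath.polar(p) for p in parameters]

params = [0.5, -0.8, 0.3 + 0.4j]  # illustrative values only
polar = to_polar(params)
stable = inside_unit_circle(params)
```

Parameters whose magnitudes stay below one correspond to a bounded response of the identified model, which is the property the polar plot makes visible.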
When the estimation is applied to the hybrid system, the response is as shown in Figure 13, which follows the tendency of the reference. The analytical method provides the response in Figure 14. The previous graphics, Figures 12-14, were obtained considering a reference system with variable parameters and random noise. To better identify how the approximations converge to the reference, Figure 15 presents a graphic segment showing both approximations to a reference system response more clearly, also with variable parameters but without random noise.
Based on Figure 15, Figure 16 compares the convergence of both methods in terms of the functional error (14). In this case, the reference is a constant value near zero because the estimations are considered optimal.

Conclusion
An Equivalent Artificial Neural Net (EANN) was considered, describing its parameters through a Black Box (BB) analysis using two different approximations: hybrid and analytical techniques.
For the fuzzy estimation, the best option was to consider the error properties and, in this method, the response signal was adjusted according to the reference.The fuzzy evaluation allowed the description of the coefficients and gain which affect the Kalman filter, improving the identification process according to the Multiple Inputs and Single Output (MISO) model changes with perturbations.The parameter and gain selection, using an intelligent system with classification levels, allowed selecting into the KBs the best coefficients that positively affected the filter evolution.This method does not have an exact approximation, but it is good enough on average, as shown in Figure 12, and in distribution (Figures 13 and 15) if we consider that its response converged to a particular region different from zero.
The second method used the analytical approximation, converging at almost all points to the system parameters and the reference (Figures 12 and 14), so that the expected result was a minimum functional error through time. We considered that the null error corresponded to the low energy limit, which is not zero in the neurons in order to avoid the total loss of connection. This method had a better approximation to the reference but achieved the minimum error only in the countably infinite limit. Even though this estimation does not consider the error feedback as the first method does, its response remains adequate when external perturbations affect the system.
A sophisticated ANN could be represented by the integration of more than one EANN due to the fact that its description allows it to consider more than one layer because it has recursive characteristics.In addition to this, the implementation of EANNs gives the possibility of having more control on the number of necessary iterations to reach a reference; this is relevant to systems where restrictions in time delays are considered essential.
Globally, both methods presented good approximations, as shown in Figure 16, with unique characteristics identifying differences between the hybrid and analytical methods.

Figure 3: Simple description of an Artificial Neural Net through activation functions.

Figure 8: Block diagram of the analytical process.

Figure 16: Convergence of the functional error. Comparison between the hybrid (magenta) and analytical (blue) evaluations.