Neural Net Gains Estimation Based on an Equivalent Model

A model of an Equivalent Artificial Neural Net (EANN) describes a set of gains, viewed as the parameters of a layer; this construction is a reproducible process, applicable to any neuron in a neural net (NN). The EANN helps to estimate the NN gains or parameters, and we propose two methods to determine them. The first combines fuzzy inference with the traditional Kalman filter, obtaining the equivalent model and estimating, in a fuzzy sense, the gains matrix A and the proper gain K of the traditional filter identification. The second develops a direct estimation in state space, describing an EANN through the expected value and a recursive description of the gains estimation. Finally, both descriptions are compared: the analytical method describes the neural-net coefficients in a direct form, whereas the hybrid technique requires selecting from the Knowledge Base (KB) the factors based on the functional error and on a reference signal built with the past information of the system.


Introduction
There are different techniques for modelling a system and identifying the characteristics that make the model an excellent approximation. When the physical behaviour of the real process is considered, these models usually lead to complex nonlinear systems, which are difficult to analyse; they are therefore simplified through relations between their input and output signals, obtaining a Black-Box (BB) description [1].
The system evolution viewed as a BB gives no access to its internal properties but only to the input and output responses, without paying attention to the internal parameters, which have a dynamical evolution. However, in many cases human experience provides the original answer or selection (a fair process) through intuition, expressed as if-then (fuzzy logic) inferences that select the new parameters. Combined with other theories, for example, Lyapunov stability, Sliding Modes, or Intelligent Systems, these experiences provide better, tolerable approximations.
Artificial Neural Nets (ANN), viewed as mathematical models applied to complex systems [2,3] and inspired by the operation of biological neurons, generate a fast answer. Nevertheless, computing devices and the human brain have different connotations: the artificial methods are faster because they rely on robust programming instead of chemical reactions in natural conditions. In addition, the human brain is capable of self-programming and adaptability without requiring new program code, adjusting its required energy levels on the basis of natural instincts and using intuition as a great tool. An artificial neural model (Figure 1), based on the biological neuron principle and implemented as a computational model, is helpful in prediction and classification problems, pattern recognition, signal processing, estimation, and control [4]. When more than one neuron is connected, we obtain a MISO (Multiple Inputs and a Single Output) neural net and, similarly to human adaptability, the model adjusts its functional parameters through learning processes, producing different outputs depending on the stimuli. When analysing a system, its characteristics help us determine the best method according to our requirements, improving the convergence rate. In this paper we compare a hybrid model and an analytical model: the first combines a fuzzy estimation with the Kalman filter description, and the second is an optimal, systematic evaluation based on the expected value obtained from past information. Both are based on an Equivalent Artificial Neural Net (EANN).
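As a toy illustration of the MISO neuron model just described, a single neuron can be written as a weighted sum passed through an activation function. The function name, the bias term, and the sigmoid choice are assumptions of this sketch, not the paper's exact formulation:

```python
import math

def neuron(inputs, weights, bias=0.0):
    # accumulate the weighted input "energy" reaching the soma
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # a sigmoid activation squashes the energy into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))
```

Connecting several such neurons to the same set of inputs yields the MISO net discussed above, where learning amounts to adjusting the `weights` vectors.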

Equivalent Neural Net
In the biological sense, for neuron-specific tasks the inputs of the neuron cell must satisfy adequate properties to generate a firing action in the neuron soma and thus obtain an accurate output at the neuron axon; otherwise, the biological system operates with the minimal energy required, which keeps it from losing its connection with other neurons.
In an ANN, we identify three principal sections: input, hidden, and output layers. The input layer is the interaction between the input signals and the first block of weights; its result becomes the input to the following layer. At the very first stage, the designer selects the weights intuitively and adjusts the following ones while looking for the desired response [5]. The hidden layers have a set of inputs and outputs in different stages related through the weights. The output layer represents the convolution or binary sum of the last block of weights and its respective data. In an illustrative way, Figure 2 depicts an example of a typical ANN with two hidden layers, plus input and output layers.
The EANN is a simplified representation of an ANN whose task is to obtain the parameter vector that allows the system to reach the desired reference signal, without paying initial attention to the internal layers and focusing instead on the estimation procedure. It considers the total input-output signal relation, trying to reduce unnecessary delays while always keeping the weight interconnection form, so that the desired response is achieved.
As shown in Figure 2, the input data are denoted by a set described as {x_i : i = 1, …, n, n ∈ Z⁺, x_i ∈ R}, and the output data are denoted by {y ∈ R}, where n represents the number of input elements. In addition, the weights or parameters are considered as {w_{l,j} : j = 1, …, m_l, m_l ∈ Z⁺}_{l=1,…,L}, where j indicates the specific parameter number in a layer l. The admitted layers are within a set of functions {{f_{l,j} : j = 1, …, m_l, m_l ∈ Z⁺}_{l=1,…,L}}, interconnecting directionally from an original parameter in layer l to a target parameter in layer (l + 1). Each weight of the input and output layers requires an activation function, and all the hidden layers have proper activation functions connected to other weights in order to meet the different and specific requirements of each output stage. This description corresponds to Figure 3, where the traditional ANN connections are now simple flow-diagram lines containing activation functions described as {{f_{l,k} : k = 1, …, q_l, q_l ∈ Z⁺}_{l=1,…,L}}, where k represents the function number and l the layer the function leaves. Figure 3 presents the activation functions of the first hidden layer (l = I) operating with an accumulative energy e_I convolved with an input x_i in agreement with the following equation:

f_I(x_i) = w_I x_i,  e_min,i ≤ e_I ≤ e_max,i.  (1)

The set of pairs {(e_min,i, e_max,i) : i = 1, …, n} represents the activation limit functions for f_I. These limits denote the minimum and maximum energy required to excite a neuron for a specific weight w_I, known as fire limits. In the following hidden layer (II), f_II requires that the set of inputs satisfy the same conditions imposed on the previous results; that is,

f_II = w_II * f_I(x_i)  within the fire limits of layer II, and 0 in other cases.  (2)

In (2), the binary operator "*" represents the composition of the involved terms without indicating a particular operation.
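A minimal sketch of the fire-limit behaviour just described: the weighted input passes only when the accumulated energy lies within the neuron's limits, and the neuron stays silent otherwise. The names `e_min`, `e_max`, and the zero output outside the limits are illustrative assumptions:

```python
def activate(x, w, e_min, e_max):
    # accumulative energy reaching the neuron for this layer
    e = sum(wi * xi for wi, xi in zip(w, x))
    # fire only when the energy lies within the fire limits
    return e if e_min <= e <= e_max else 0.0
```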
The equivalent weights sequence allows each input to include the structure of the previous parameters in the final description. Each layer takes part in the following activation function through the interaction between the new weights and the previously composed output signal.

Figure 3: Simple description of an Artificial Neural Net through activation functions.

Figure 4 shows the EANN model in simplified form. According to [6], each neuron output has a function whose parameters are the inputs and weights for the following layer. Equation (3) expresses the influence of the previously mentioned parameters, weights, and inputs on the following layer in a recursive form, where φ describes the operation ∑_{i=1}^{n} w_i x_i as the core neuron function:

y_{l+1} = c_l φ(w_l, y_l),  (3)

where c_l is a proportional constant adjusting the previous layer; therefore, the model converges to the neural-net development. At the final layer we have the convolution y = ((f_I ∘ f_II) ∘ ⋯ ∘ f_L), which represents the neural-net response. For computational applications, this reaction has the effect of an activation function, usually the sigmoid function.
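The recursive layer composition behind the equivalent net can be sketched as a fold over per-layer maps, collapsing the whole chain into one input-output function. The helper names and the toy layers below are assumptions for illustration only:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def equivalent_net(layers):
    """Fold a chain of per-layer maps into a single input-output function,
    mirroring the composition y = ((f_I . f_II) . ...)."""
    def eann(x):
        for f in layers:
            x = f(x)  # each layer consumes the previous composed output
        return x
    return eann

# two toy affine layers followed by the usual sigmoid output stage
net = equivalent_net([lambda x: 2.0 * x, lambda x: x - 1.0, sigmoid])
```

The design point is that `net` hides the internal layers entirely, exactly as the EANN does: only the total input-output relation survives.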
A sophisticated ANN considers the integration of more than one EANN, since its description allows using the recursive characteristics. In addition, the implementation of EANNs makes it possible to restrict the number of iterations needed to reach a reference, a remarkable feature in systems where time delays are considerable.

Equivalent Neural Net Using ARMA Description
An ARMA(1, 1) (Autoregressive Moving Average) model is a tool used for obtaining the parameter matrix of a reference system viewed as a MISO BB; its primary structure is specified by (4) and (5), with t being the time evolution:

x_t = a x_{t−1} + B u_t,  (4)

y_t = C x_t + v_t,  (5)

where x_t is the internal state, y_t the observable state, u_t the input vector, B and C the gains, a the internal gain, and v_t an output perturbation.

Computational Intelligence and Neuroscience
This model has observable (y_t) and internal (x_t) states, an input signal (u_t), gains (B, C), and an internal gain (a). The measurable state (6) in explicit form is a function of its immediate past, the internal gain, and the inputs {u_i}_{i=1,…,n}:

y_t = a y_{t−1} + ∑_{i=1}^{n} b_i u_{i,t} + v_t.  (6)

In [7] the internal state is described using the traditional Kalman filter (KF), even though the internal gain a and the gain K are still unknown. The complexity of the filter increases because, after the identification, the internal gain depends on the error, which is applied in (4) to obtain the observable-signal approximation of (5), represented in discrete form in (7). Applying (7) in (4), including the still unknown internal state, we obtain (8). The internal state from (6) allows obtaining in (9) the lagged internal state as a function of the measurable state and the output perturbations. Considering (9) in (8), we determine the output in (10). Equation (11) represents a recursive form of (10), describing the reference system with an innovation process. In agreement with [8], the gain K_{t−1} given by (12) corresponds approximately to the innovation ŷ_t − ŷ_{t−1}. The hybrid filter (13) considers the fuzzy parameter estimation, the gain description, and the lagged signal. With the innovation process and the reference system bounded by the same general Membership Function (MF) [8,9], it is possible to estimate the explicit matrix parameters and the gain using the inference mechanisms, considering the functional results and the noise properties, respectively.
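Under a state-space reading of (4) and (5), a scalar noise-free ARMA(1,1) recursion can be simulated as follows; the symbols `a`, `b`, `c` mirror the internal gain and the two gains, and dropping the perturbation term is an assumption of this sketch:

```python
def arma11(a, b, c, u, x0=0.0):
    """Simulate x_t = a*x_{t-1} + b*u_t (internal state) and
    y_t = c*x_t (observable state) over an input sequence u."""
    x, ys = x0, []
    for ut in u:
        x = a * x + b * ut   # internal-state recursion, eq. (4)
        ys.append(c * x)     # observable output, eq. (5)
    return ys
```

Feeding an impulse input shows the geometric decay governed by the internal gain `a`.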

Fuzzy Gains and Estimation Properties
In the fuzzy sense, [10] presented the parameters obtained by a controller, considering a fuzzy function vector for nonlinear systems. The MIMO system is first reduced to a linear representation formed by a collection of MISO systems with the same inputs, reducing and simplifying its analysis.
In [11], the hybrid combination required that the identification filter adjust the parameters automatically using fuzzy logic. This adjustment needs the selection of the best values with respect to the inference, minimizing the error convergence by using heuristic techniques or the Least Squares Method (LSM).
The first step in the fuzzy estimation determines the reasoning levels in accordance with the proposed MFs, identified through the statistical properties of the reference signal. These could be triangular, sinusoidal, impulsive, or Gaussian functions, among others, defining the ranges contained in the reference-signal classification.
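Two of the MF shapes mentioned can be sketched with their conventional parameterisations; these are generic textbook forms assumed here, not necessarily the exact functions used in the paper:

```python
import math

def triangular(x, a, b, c):
    """Triangular MF with support [a, c] and peak (membership 1) at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def gaussian(x, mean, sigma):
    """Gaussian MF centred at `mean` with spread `sigma`."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)
```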
A set of fuzzy (if-then) rules forms a Fuzzy Rule Base (FRB) to interpret which requirements and process conditions are needed. Beforehand, it is necessary to select and introduce the best values into the Knowledge Base (KB) according to the MFs, updating the parameters according to the reference model limited by the filter error criteria.
Using the fuzzy logic connectors in the fuzzy stage, considering the desired signal y_t and its region level, reducing the inference operational levels and indicators in the MFs, and selecting the parameter values Â from the KB update the hybrid filtering process. Each fuzzy filtering rule finds specific matrix parameters at each evolution [9,12].
In the same sense, the hybrid filter considers the basic principles of a conventional digital Kalman filter using the Mean Least Square Criterion (MLSC), described as J_t = ⟨e_t, e_t⟩^(1/2) and, in agreement with [5], expressed in its recursive form (14). According to [9], (14) presents the adequate element describing the optimal matrix parameters.
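One common recursive form of a mean-of-squared-errors functional, matching the ((t−1)/t)·previous + (1/t)·new pattern used later in the derivation, can be sketched as follows (the function name is illustrative):

```python
def mlsc(errors):
    """Running mean of squared errors via the recursion
    J_t = ((t-1)/t) * J_{t-1} + (1/t) * e_t**2."""
    J = 0.0
    for t, e in enumerate(errors, start=1):
        J = ((t - 1) / t) * J + (e * e) / t
    return J
```

This incremental form avoids storing the whole error history, which is why recursive identification filters favour it.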

Optimum Coefficient
For an ANN, determining an optimal coefficient vector requires minimizing the error, with the primary objective that the convergence tend to zero. One inconvenience is how long this optimal convergence takes. The control for a recurrent NN described in [13] was made optimal by adding an extra coefficient that compensates for the error within a small bound over an unknown but necessary learning time.
Considering that the last stage of a hybrid filter corresponds to the equivalent neural net of Figure 3, it is possible to determine the optimum parameters for the neural weights, obtaining the best output approximation to the reference signal through an analytical process.
Based on BB concepts, the input signals {x_i}_{i=1,…,n}, represented by the matrix X_[n×1], and the output y of the system are the known quantities. In this sense, we need a synthesis process to calculate the matrix values A_[1×n] representing the weights of the neural layer.
Having y_t = A X_t as the model and considering the stochastic properties of the process, we use the mathematical expectation in the probabilistic sense to obtain the parameter information. If y_t is the reference signal that helps us obtain the parameters, then we apply these values to find the identified output y_iden,t; their comparison gives the identification error e_t := y_t − y_iden,t, and its functional error J_t := ⟨e_t, e_t⟩ tends to zero because the values considered are optimums.
To demonstrate this, from Figure 4 the output is observed as (17). In addition, the reference or target signal is defined as (18). Considering X_t a stochastic input formed in the distribution sense by {x_i(t) ⊆ N(μ, σ² < ∞)}_{i=1,…,n}, the parameters represented by A, and the output signal by y_t, the BB system scheme allows estimating the parameter set through its time evolution in a probabilistic sense. Because the weights are constant for an instant of execution time, and considering the properties of the mathematical expectation, it is possible to obtain the matrix estimation Â, indicating that this new array is the estimate of the matrix:

Â_[1×n] = E{y_t X_tᵀ} (E{X_t X_tᵀ})⁻¹.  (19)

For a discrete system (19) with enumerably infinite elements, the mathematical expectation takes the form of a time average (20). By replacing Â_[1×n] in (17), we obtain a new output state which we call the identification, symbolically described as y_iden,t; it represents the output including the effects of the estimated weight values.
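Assuming mutually uncorrelated inputs so that the estimate decouples per channel, the expectation-based weight estimate can be approximated with sample means; this is an illustrative simplification of the matrix form, with the function name an assumption:

```python
def estimate_weights(X, y):
    """Per-channel estimate w_i = E{x_i * y} / E{x_i**2}, with the
    expectations replaced by sample means over T observations
    (inputs assumed mutually uncorrelated)."""
    T, n = len(y), len(X[0])
    w = []
    for i in range(n):
        num = sum(X[t][i] * y[t] for t in range(T)) / T  # E{x_i y}
        den = sum(X[t][i] ** 2 for t in range(T)) / T    # E{x_i^2}
        w.append(num / den)
    return w
```

For a noiseless system y = 2x, the estimate recovers the weight exactly; with noise, it converges in the mean as T grows.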
The difference between the identified signal and the reference signal gives the identification error (21). In order to express (20) recursively, its first and second terms are replaced with P_t and Q_t, respectively, defined in (22) and (23). Considering (22) and (23) in (20), (24) and its delayed form (25) for stable conditions are obtained. Developing (22) in a recursive manner gives (26), and considering stationary conditions its delayed form is (27). Rewriting (26) in terms of (27), we have (28), whose block-diagram representation is shown in Figure 5:

P_t = ((t − 1)/t) P_{t−1} + (1/t) y_t X_tᵀ.  (28)

Expanding (28) and ordering with respect to P_{t−1}, we have (29). Applying (29) in (24), we have the estimation (30). Remembering that (25) in stationary conditions is the delayed estimation and applying it in (30) yields (31). Using (31) in (24), we obtain the parameter vector in recursive form (32); the block diagram representing the parameter Â_t using (31) is given in Figure 6:

Â_t = ((t − 1)/t) Â_{t−1} Q_{t−1} Q_t⁻¹ + (1/t) y_t X_tᵀ Q_t⁻¹,  (32)

where Q_t = ((t − 1)/t) Q_{t−1} + (1/t) X_t X_tᵀ. As (31) includes (23) in its description, it is necessary to build its recursive form in a manner similar to the obtainment of (28); we then have (33), whose block-diagram representation is shown in Figure 7. Finally, replacing (32) in (17), the identified output is

y_iden,t = Â_t X_t.  (34)

Figure 8 represents the interaction between the inputs and the resulting error, which has better convergence because the null error is determined, at each instant, by the best parameter values. Figure 9 provides the block diagram of a hybrid filter that combines fuzzy inferences with the EANN ARMA model description, replacing the logical block of Figure 8, to determine the adequate matrix parameters. The reference model considered is a BB giving the reference signal y_t. The distribution curve of this signal denotes the intervals where the MFs must lie; then the degree of membership, obtained by Mamdani fuzzy inferences, gives access to the Knowledge Base (KB), determining the parameters of the model, producing the convergence, and minimizing the error in a distribution sense.
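The running-average recursions of the form P_t = ((t−1)/t) P_{t−1} + (1/t)(·) can be sketched in the scalar case as follows; the function and variable names are illustrative assumptions:

```python
def recursive_weight(pairs):
    """Scalar running estimate: update P_t = ((t-1)/t)*P_{t-1} + (x_t*y_t)/t
    and Q_t = ((t-1)/t)*Q_{t-1} + (x_t*x_t)/t, then take P_t / Q_t as the
    current weight estimate at each step."""
    P = Q = 0.0
    history = []
    for t, (x, y) in enumerate(pairs, start=1):
        P = ((t - 1) / t) * P + (x * y) / t
        Q = ((t - 1) / t) * Q + (x * x) / t
        history.append(P / Q)
    return history
```

For a noiseless relation y = 2x the estimate is exact at every step, showing why the recursion converges with no error feedback.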

Results
The performed simulation compares both methods, giving a better idea of how they approximate the reference signal. The reference model output considered nonstationary conditions and noise sequences bounded by a distribution function, with constant expected value and variance on average. The signal variations follow a periodic signal with smooth random perturbations.
The first part of the simulation considered the hybrid filter, applying inferences and obtaining (13) as the identified signal output. Figure 10 shows the fuzzy inference process, where it is possible to identify the functional error given by (14), useful for estimating the coefficients of the ANN. The distribution curves defined the MFs, having different operational levels represented through three and seven MFs. These MFs result from the proper inference mechanisms associated with selecting the parameters Â and the gain K through the MFs and the KBs, affecting the final identified output. As an example, Figure 11 presents a three-dimensional KB integrated by the sets gain {K}, reference signal {y}, and functional error {J}. This KB helps us determine the gain K through the reference and the operational error, considering our expertise. The KB for Â has a similar structure.
The analytical method uses the block diagram presented in Figure 8, having a delay in execution time due to the time-state operations but involving fewer process stages, because it does not require feedback from the functional error.
Our objective was to determine the internal parameters; Figure 12 compares the reference-signal parameters with those estimated by both methods. The polar representation allows observing the components of the parameters, where it is possible to see that none of them leaves the unit circle.
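The unit-circle check behind Figure 12 can be expressed directly; this is an illustrative helper under the usual stability reading (all parameter magnitudes below one), not the paper's code:

```python
def inside_unit_circle(params):
    """True when every (possibly complex) parameter has magnitude
    strictly below one, i.e. it stays inside the unit circle."""
    return all(abs(p) < 1.0 for p in params)
```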
When the estimation is applied in the hybrid system, the response is as shown in Figure 13, which follows the tendency of the reference. The analytical method provides the response in Figure 14. The previous graphics, Figures 12-14, were obtained considering a reference system with variable parameters and random noise. In order to identify better how the approximations converge to the reference, Figure 15 presents a graphic segment showing more clearly both approximations to the response of a reference system, also with variable parameters but without random noise.
Building on Figure 15, Figure 16 compares the convergence of both methods in terms of the functional error (14). In this case, the reference is near zero, a constant value, because the estimations are considered optimum.

Conclusion
An Equivalent Artificial Neural Net (EANN) was considered, describing its parameters through a Black Box (BB) analysis using two different approximations: hybrid and analytical techniques.
For the fuzzy estimation, the best option was to consider the error properties; with this method, the response signal was adjusted according to the reference. The fuzzy evaluation allowed describing the coefficients and the gain which affect the Kalman filter, improving the identification process according to the changes of the Multiple Inputs and Single Output (MISO) model with perturbations. The parameter and gain selection, using an intelligent system with classification levels, allowed selecting from the KBs the best coefficients, which positively affected the filter evolution. This method does not give an exact approximation, but it is good enough on average, as shown in Figure 12, and in distribution (Figures 13 and 15), considering that its response converged to a particular region different from zero.
The second method used the analytical approximation, converging at almost all points to the system parameters and the reference (Figures 12 and 14), so that the expected result was a minimum functional error through time. We considered that the null error corresponds to the low-energy limit, which is not zero in neurons, so as to avoid the total loss of connection. This method gave a better approximation to the reference but achieved the minimum error only in the enumerably infinite limit. Even though this estimation does not consider the error feedback as the first method does, its response remains adequate when external perturbations affect the system.
A sophisticated ANN could be represented by the integration of more than one EANN, since its description, being recursive, allows considering more than one layer. In addition, the implementation of EANNs gives the possibility of more control over the number of iterations needed to reach a reference; this is relevant for systems where restrictions on time delays are essential.
Globally, both methods presented good approximations, as shown in Figure 16, each with unique characteristics that identify the differences between the hybrid and the analytical approach.