Feedforward Nonlinear Control Using Neural Gas Network

Nonlinear systems control is a main issue in control theory. Many developed applications suffer from a mathematical foundation that is not as general as the theory of linear systems. This paper proposes a control strategy for nonlinear systems with unknown dynamics by means of a set of local linear models obtained by a supervised neural gas network. The proposed approach takes advantage of the neural gas feature by which the algorithm yields a very robust clustering procedure. The direct model of the plant constitutes a piecewise linear approximation of the nonlinear system, and each neuron represents a local linear model for which a linear controller is designed. The neural gas model works as an observer and a controller at the same time. State feedback control is implemented by estimating the state variables from the local transfer function provided by the local linear model. The gradient vectors obtained by the supervised neural gas algorithm provide a robust procedure for feedforward nonlinear control, that is, assuming the absence of disturbances.


Introduction
Although some physical systems can be approximated by a linear model, almost all real plants actually behave nonlinearly. A wide understanding of the behavior of nonlinear processes is available, but it is sometimes difficult to choose the appropriate control method. Lyapunov theory is a classic method for nonlinear system control. If there is a positive definite continuous function whose derivative is negative definite under the suitable conditions of the control design, then asymptotic stability is guaranteed. However, the drawback of this method is that obtaining the Lyapunov function is difficult. This problem is even worse when dealing with unknown plants that are not defined mathematically. Therefore, it is usually not easy to guarantee the stability of a complex nonlinear system [1]. However, if the local linear system corresponding to an equilibrium point is controllable, then sufficient conditions can be stated for local stability [2].
The Hartman-Grobman theorem states that the behavior of a nonlinear system in the neighborhood of a hyperbolic equilibrium point can be approximated by its linearized model. Systems theory is based on many mathematical procedures about stability, controllability, and observability regarding linear systems. The stability and, to a great extent, the dynamic response of a linear system can be described in terms of the eigenvalues of the system matrix in state space design or the poles of the transfer function. No such method exists for nonlinear systems. For this reason, industrial control processes are still usually designed using linear control theory. After linearization, the typical approach is to design a linear controller such as a PID with fixed parameters.
The classical approach to obtaining local linear models is the Recursive Least Squares (RLS) method. However, this method sometimes yields unfavorable results due to the intrinsic nonlinearities of the process to be controlled. The problem is to establish the different operating points of a nonlinear system. The proposed algorithm can establish each operating point as a cluster centre of the neural gas network. For this reason, artificial intelligence techniques can improve the control performance.

Complexity
Research into identification and control of nonlinear systems by means of neural networks (NN) began over two decades ago [3]. One of the major advantages of control by NN is that precise knowledge of the plant, such as a mathematical model, is not needed. Initially, control applications using NN were based on a trial-and-error approach. Research efforts have improved the control algorithms, and several journals have published special issues with a strong mathematical foundation [4]. Many applications are based on a combination of feedforward and recurrent NN. Recurrency, also known as dynamic backpropagation, is necessary due to the dependency of the output on the previous values of the same output, which are also functions of the weights [5]. Zhang and Wang [6] proposed a pole assignment control using recurrent NN.
The typical design procedure is first to carry out the system identification in order to model the plant and then to obtain the controller. Traditional methods rely heavily on models extracted from physical principles, whereas approaches based on NN theory usually create black-box models as function approximators using data obtained from the plant. Knowledge about the mathematical model of the plant or any other physical principle is not necessary.
Neural gas (NG) is an unsupervised prototype-based method [7] in which the prototype vectors are the weights and carry out a partition of the training data space. It combines cooperation and competition, which helps the algorithm avoid local minima. In addition, batch NG allows fast training so that convergence is achieved in a small number of epochs [8]. Supervised versions of NG have also been developed, especially for classification [9,10]. The algorithm is very robust for clustering tasks and has also proven robust for obtaining direct models of plants [11].
After years of work on identification and control of dynamical systems by means of NN, there is agreement among researchers that linear identifiers and controllers should be used as a first attempt, as stated in Chen and Narendra [2]. If a set of local linear models corresponding to several equilibrium points can approximate a nonlinear system with sufficient accuracy, then linear controllers can be designed for each model and the global control becomes control by switched linear models.
This divide-and-conquer approach is applied in this work. The resulting model is a set of local linear maps. Each neuron of the NG model corresponds to one local model. These local models are obtained after NG training. In this way, a direct model of the plant is obtained. After obtaining this NG model, the design of the local linear controller is simpler than that of the global nonlinear controller. Local linear mapping using another prototype-based algorithm, SOM, was successfully tested at the NASA facilities [12].
This paper aims to apply the robust modeling capability of NG to control a nonlinear plant such as a typical robot manipulator.
Section 2 presents the learning rules of the considered NG algorithm, Sections 3 and 4 explain the model of the plant and the control strategy, respectively, and Section 5 tests the proposed technique.

Neural Gas Approach
The unsupervised version of the NG algorithm is based on the energy cost function (1) according to the Euclidean metric. The notation used for the squared Euclidean distance is given in (2). Moreover, a neighborhood function (3) is needed to implement the algorithm. The rank function $k(v, w_i) \in \{0, \ldots, N-1\}$ represents the rank distance between prototype $w_i$ and data vector $v$. The minimum distance takes the value 0 and the rank for the maximum distance is equal to $N-1$, where $N$ is the number of neurons or prototypes and $\sigma(t)$ is the neighborhood radius:

$$E_{NG} = \sum_{i=1}^{N} \sum_{v} h_{NG}(k(v, w_i))\, d(v, w_i), \tag{1}$$

$$d(v, w_i) = \|v - w_i\|^2 = (v - w_i)^T (v - w_i), \tag{2}$$

$$h_{NG}(k(v, w_i)) = \exp\left(-\frac{k(v, w_i)}{\sigma(t)}\right). \tag{3}$$

The neighborhood radius $\sigma(t)$ is usually chosen to decrease exponentially according to (4). The decrease goes from an initial positive value, $\sigma_{t_0}$, to a smaller final positive value, $\sigma_{t_{\max}}$:

$$\sigma(t) = \sigma_{t_0} \left(\frac{\sigma_{t_{\max}}}{\sigma_{t_0}}\right)^{t/t_{\max}}, \tag{4}$$

where $t$ is the epoch step, $t_{\max}$ is the maximum number of epochs, and $\sigma_{t_0}$ was chosen as half the number of map units ($\sigma_{t_0} = N/2$), as in Arnonkijpanich et al. [13]. In addition, $\sigma_{t_{\max}} = 0.0001$ in order to minimize the quantization error at the end of the training.

The learning rule of the batch version is obtained in Cottrell et al. [8]. The batch algorithm can be obtained by means of Newton's method using the Jacobian and Hessian matrices, $J$ and $H$, respectively, of the cost function $E_{NG}$. The adaptation of the prototype $w_i$ is formulated accordingly based on this method:

$$\Delta w_i = -J(w_i)\, H^{-1}(w_i). \tag{5}$$

Kernel function $h_{NG}$ can be considered locally constant [8]. In this way, the Jacobian and Hessian matrices are

$$J(w_i) = -2 \sum_{v} h_{NG}(k(v, w_i))\,(v - w_i), \qquad H(w_i) = 2 \sum_{v} h_{NG}(k(v, w_i)). \tag{6}$$

Substituting (6) into (5), the increment can be obtained:

$$\Delta w_i = \frac{\sum_{v} h_{NG}(k(v, w_i))\,(v - w_i)}{\sum_{v} h_{NG}(k(v, w_i))}. \tag{7}$$

Finally, the updating rule for each prototype vector appears in (8):

$$w_i = \frac{\sum_{v} h_{NG}(k(v, w_i))\, v}{\sum_{v} h_{NG}(k(v, w_i))}. \tag{8}$$

2.1. Supervised Learning. Supervised learning with NG is possible by means of local linear mapping over each Voronoi region defined by prototype vector $w_i$. A constant $y_i$ and a vector $\nabla f_i$ with the same dimension as $w_i$ are assigned to each neuron $i$. The goal is to approximate the function $y = f(v)$ from $\mathbb{R}^D$ to $\mathbb{R}$, where $D$ is the number of training variables, that is, the dimension of data vector $v$. The training thus becomes supervised, and the dataset contains input-output pairs of data vector $v$ and variable $y$ as the objective function.
The estimation is carried out by

$$\hat{y}(v) = y_{i^*} + \nabla f_{i^*}^{T}\,(v - w_{i^*}), \tag{9}$$

where $\hat{y}(v)$ is the estimated output value, $y_i$ is the reference value learned for $w_i$, $\nabla f_i$ is the gradient of the approximated function obtained in the $i$th Voronoi region defined by $w_i$, and $i^*$ is the neuron $i$ whose $w_i$ is closest to data vector $v$, that is, the best matching unit (BMU). The asterisk superscript denotes the winning neuron for input data vector $v$.
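The local linear estimate at the BMU can be illustrated with a minimal sketch. The function name `ng_predict`, the array layout, and the toy numbers are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def ng_predict(v, prototypes, y_ref, grads):
    """Local linear estimate at the best matching unit (BMU).

    prototypes: (N, D) array of prototype vectors w_i
    y_ref:      (N,)   array of reference values y_i
    grads:      (N, D) array of local gradients, one per neuron
    """
    v = np.asarray(v, dtype=float)
    # Squared Euclidean distance from v to every prototype.
    dist = np.sum((prototypes - v) ** 2, axis=1)
    bmu = int(np.argmin(dist))  # index i* of the winning neuron
    # First-order expansion around the winning prototype.
    y_hat = y_ref[bmu] + grads[bmu] @ (v - prototypes[bmu])
    return y_hat, bmu

# Toy model with two neurons in a 2-D input space (made-up numbers).
W = np.array([[0.0, 0.0], [1.0, 1.0]])
Y = np.array([0.5, 2.0])
G = np.array([[1.0, 0.0], [0.0, 2.0]])
y_hat, bmu = ng_predict([0.9, 1.2], W, Y, G)
# The second neuron wins here, so y_hat is 2.0 + 2.0 * 0.2, about 2.4.
```

Only the winning neuron contributes to the estimate, which is what makes the global model piecewise linear.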
The probability distribution of the input data is represented by prototype vectors $w_i$, which are previously updated according to the typical rule of the unsupervised version of the algorithm [14] using (8). Each prototype vector $w_i$ can be considered as the centroid of the $i$th Voronoi region. After unsupervised training, $N$ regions are well defined by these vectors. At this point, local models will be created in each of these regions so that $N$ local models will represent the whole data distribution.
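The unsupervised stage that places the prototypes can be sketched as follows. This is a minimal batch NG loop under stated assumptions: the function name `batch_neural_gas`, the random initialisation on data points, the normalisation of the epoch counter so that the last epoch uses the final radius exactly, and the guard for neurons with vanishing weight are all illustrative choices rather than details confirmed by the paper.

```python
import numpy as np

def batch_neural_gas(data, n_neurons, n_epochs=50, sigma_end=1e-4, seed=0):
    """Batch neural gas: returns an (n_neurons, D) array of prototypes."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Assumption: initialise prototypes on randomly chosen training vectors.
    w = data[rng.choice(len(data), size=n_neurons, replace=False)].copy()
    sigma_start = n_neurons / 2.0  # initial radius, half the number of neurons
    for t in range(n_epochs):
        # Exponentially decreasing neighbourhood radius.
        frac = t / max(n_epochs - 1, 1)
        sigma = sigma_start * (sigma_end / sigma_start) ** frac
        # Squared distances, shape (n_samples, n_neurons).
        d = ((data[:, None, :] - w[None, :, :]) ** 2).sum(axis=2)
        # Rank of each prototype per sample: 0 for the closest one.
        ranks = np.argsort(np.argsort(d, axis=1), axis=1)
        h = np.exp(-ranks / sigma)
        # Batch update: neighbourhood-weighted mean of the data.
        denom = h.sum(axis=0)
        upd = denom > 1e-12  # leave a neuron unchanged if its weights vanish
        w[upd] = (h.T @ data)[upd] / denom[upd, None]
    return w
```

With a very small final radius the last update reduces to a plain centroid computation over each Voronoi region, which matches the interpretation of each prototype as a centroid.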
The energy cost function of the supervised version of the algorithm is based on the mean square error of the output variable estimation averaged over each Voronoi region [15] according to (10). Prototypes $w_i$ are already obtained in (8), whereas the adaptation rules of $y_i$ and $\nabla f_i$ are calculated considering Newton's method for energy cost (10). The learning rules for $y_i$ and $\nabla f_i$ are given in (11) and (12), respectively.
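As an illustration of how Newton's method leads to such learning rules, the following sketch assumes a neighborhood-weighted squared-error cost. The cost form, the error symbol $e_i(v)$, and the resulting steps are assumptions for illustration and may differ from the expressions (10) to (12) of the original paper. Here $h_{NG}$ denotes the neighborhood function, held constant during differentiation, $w_i$ the prototype, $y_i$ the reference value, and $\nabla f_i$ the local gradient of neuron $i$.

```latex
\begin{align*}
% Assumed cost: neighborhood-weighted squared estimation error
E &= \sum_{i=1}^{N} \sum_{v} h_{NG}(k(v, w_i))\, e_i(v)^2,
  \qquad e_i(v) = y - y_i - \nabla f_i^{T}(v - w_i) \\
% Newton step for the reference value (first and second derivatives w.r.t. y_i)
\frac{\partial E}{\partial y_i} &= -2 \sum_{v} h_{NG}\, e_i(v),
  \qquad \frac{\partial^2 E}{\partial y_i^2} = 2 \sum_{v} h_{NG}
  \;\Rightarrow\;
  \Delta y_i = \frac{\sum_{v} h_{NG}\, e_i(v)}{\sum_{v} h_{NG}} \\
% Newton step for the local gradient (full Hessian)
\Delta \nabla f_i &= \Big[ \sum_{v} h_{NG}\, (v - w_i)(v - w_i)^{T} \Big]^{-1}
  \sum_{v} h_{NG}\, e_i(v)\,(v - w_i)
\end{align*}
```

Under this assumed cost, the step for $y_i$ is a weighted mean of the residuals and the step for $\nabla f_i$ is a weighted least squares correction over the neighborhood of neuron $i$.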

Plant Model
After NG training, the plant is modeled as a set of linear systems whose output $y$ depends on the previous values of both output $y$ and input $u$. The Nonlinear Autoregressive-Moving Average (NARMA) model has proven effective for nonlinear identification [16,17] and can be expressed as in (13), where $y_k$ is the system output at the sampling instant $k$, $u_k$ is the system input at instant $k$, and $d$ is the system delay. Considering a zero-delay system and substituting (13) into (9) yields (14). Hereafter, the gradients will be denoted as coefficients $a_j$ and $b_j$.
The remaining constant terms are gathered to form variable $\varepsilon$, as given in (15). Denoting the polynomials with the backward shift operator gives (16), which is reminiscent of an ARMAX model. Here, $\varepsilon$ is not only a zero-mean independent identically distributed white noise process but also a known disturbance calculated according to (15), and it depends on the input and output values since it is obtained from BMU $i^*$. The internal noise of the system can be included in $\varepsilon$.
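The regrouping of one neuron's local linear map into autoregressive coefficients, input coefficients, and a constant disturbance term can be sketched as below. The regressor layout (n past outputs followed by n past inputs), the function name `neuron_to_arx`, and the variable names are assumptions for illustration; the paper's exact indexing is not recoverable from the text.

```python
import numpy as np

def neuron_to_arx(w, y_ref, grad, n):
    """Split one neuron's local linear map into (a, b, eps).

    Assumed regressor: v_k = [y_{k-1}, ..., y_{k-n}, u_{k-1}, ..., u_{k-n}].
    The local estimate y_ref + grad @ (v - w) is regrouped as
    a @ y_past + b @ u_past + eps.
    """
    w = np.asarray(w, dtype=float)
    grad = np.asarray(grad, dtype=float)
    a = grad[:n]             # coefficients acting on past outputs
    b = grad[n:]             # coefficients acting on past inputs
    eps = y_ref - grad @ w   # constant offset: the known disturbance term
    return a, b, eps

# Made-up neuron with n = 2 delayed samples.
a, b, eps = neuron_to_arx(w=[0.3, 0.2, 1.0, 0.8], y_ref=0.35,
                          grad=[1.4, -0.5, 0.02, 0.01], n=2)
```

Because the offset depends on the winning neuron, the disturbance term changes whenever the BMU switches, which is why it is known to the controller and can be compensated.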

Using the $z$-transform, $Y(z)$ is the system output and $U(z)$ is the system input where the controller must be connected.

Local Linear Control by State Feedback
If the system is linear (locally), then the superposition theorem can be applied, which leaves the linear transfer function between the system output and the control input as given in (17). Define the auxiliary variable in (18) and choose the relationship between the state variables given in (19). Transfer function (17) can then be expressed in control canonical form for linear state space design as in (20). Assuming that the system is controllable, the purpose of the control by state feedback via pole placement is to assign a set of pole locations for the closed-loop system that will correspond to satisfactory dynamic response in terms of rise time, settling time, and overshoot of the transient response. The control law is a linear combination of the state variables $x_i$, which are estimated in (19) by way of local transfer function (17).
The characteristic polynomial of the closed-loop system, which depends on the system matrix, the input matrix, and the gain vector $K$, is given in (21), whereas the characteristic polynomial of the desired pole locations is given in (22). For an $n$th-order system, the gain vector $K = [K_1\ K_2\ \cdots\ K_n]$ for state feedback is obtained by matching coefficients in (21) and (22), as in (23), forcing the closed-loop poles to be placed at the desired locations. It is possible that there are enough degrees of freedom to choose arbitrarily any desired root location by selecting the proper values $K_i$. This is an inexact procedure that may require some iteration by the designer. The solution of the local linear model lies in finding the gain matrix or the regulator coefficients that implement the state feedback control. The stability condition for linear discrete-time systems is that all the eigenvalues must be inside the unit circle. Obviously, this criterion is not valid for nonlinear systems, but there is a region inside the stable linear area where the asymptotic stability of the switched linear systems is achieved [18]. Thus, not only can a desired dynamic response be designed, but stability criteria will also be met. In this work, this stability region was found by trial and error with different eigenvalues.
The proposed control strategy scheme is shown in Figure 1. Gain vector $K$ is calculated to fulfill the dynamics according to (22) depending on the local linear model defined by the current winning neuron $i^*$ or BMU. State variables $x_i$ are also obtained by the local linear model of the NG in (19). Tracking of the setpoint reference is possible using the inverse static gain of the feedback loop. In addition, since disturbance $\varepsilon$ is known (it is included in the model), it can be compensated as $-\varepsilon / B_{i^*}(z^{-1})$. The transfer function of the prefilter has been chosen as $(1 - p_{\text{prefilter}})^n / (z - p_{\text{prefilter}})^n$ and determines the switching rate of the local linear models. Although the pole assignment method does not affect the zeros of the plant, the prefilter can optionally be designed to cancel dominant zeros located inside the unit circle.
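The coefficient-matching step can be illustrated for a local model written in control canonical form. This is a sketch under stated assumptions: the local model is taken as y_k = a_1 y_{k-1} + ... + a_n y_{k-n} + b_1 u_{k-1} + ... + b_n u_{k-n}, the matrix names `F`, `G`, `H` and both function names are hypothetical, and the coefficient values are made up.

```python
import numpy as np

def canonical_form(a, b):
    """Control canonical form (F, G, H) of B(z)/A(z).

    Assumes A(z) = z^n - a_1 z^(n-1) - ... - a_n and
    B(z) = b_1 z^(n-1) + ... + b_n.
    """
    n = len(a)
    F = np.zeros((n, n))
    F[0, :] = a                  # top row carries the denominator coefficients
    F[1:, :-1] = np.eye(n - 1)   # shift structure below the top row
    G = np.zeros(n)
    G[0] = 1.0
    H = np.asarray(b, dtype=float)
    return F, G, H

def place_poles_canonical(a, desired_poles):
    """Gain K for u = -K x obtained by matching polynomial coefficients."""
    # Desired polynomial z^n + d_1 z^(n-1) + ... + d_n.
    d = np.real(np.poly(desired_poles))[1:]
    # Closed-loop top row is a - K, so matching requires a - K = -d.
    return np.asarray(a, dtype=float) + d

# Fourth-order example with all desired eigenvalues equal to 0.5.
a_local = [1.5, -0.7, 0.1, 0.05]
K = place_poles_canonical(a_local, [0.5] * 4)
```

In canonical form the gain follows directly from the difference between the open-loop and desired coefficients, so the computation can be repeated cheaply each time the winning neuron changes.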
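The effect of the prefilter pole on how fast the reference changes can be seen with a short sketch. It implements the prefilter as a cascade of identical first-order stages with unit static gain; the function name `prefilter_step` and the chosen order and pole values are assumptions for illustration.

```python
import numpy as np

def prefilter_step(p, order, n_samples):
    """Unit-step response of (1 - p)^order / (z - p)^order."""
    signal = np.ones(n_samples)  # unit step applied at k = 0
    for _ in range(order):
        out = np.zeros(n_samples)
        for k in range(1, n_samples):
            # One stage (1 - p)/(z - p): out[k] = p*out[k-1] + (1-p)*in[k-1]
            out[k] = p * out[k - 1] + (1.0 - p) * signal[k - 1]
        signal = out
    return signal

fast = prefilter_step(0.4, order=4, n_samples=200)
slow = prefilter_step(0.7, order=4, n_samples=200)
```

A pole closer to 1 slows the filtered reference, which in turn lowers the rate at which the operating point moves from one local model to the next.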

Experimental Testing
The aim is to control the typical robot arm problem depicted in Figure 2. Hagan et al. [19] focused on controlling this plant with a dynamic backpropagation algorithm using a Model Reference Control architecture. Obviously, the proposed control is not based on the mathematical model of the plant, but the plant is described by a well-known second-order nonlinear differential equation, where $m = 1$ kg, the moment of inertia is 1 kg m$^2$, [...] to Hagan et al. [19], where the viscous friction coefficient is 2. Plant input $u$ is a uniformly distributed random signal with −4 and 8.1 as minimum and maximum values of amplitude, respectively. The pulse width must be carefully selected in order to model the transient and steady states correctly. Thus, the pulse width is equal to 14 seconds. Since NG is a vector quantization algorithm where the neurons are updated according to the probability distribution function of the training data, it is important to obtain a uniform distribution of output value $\theta$ in the training data. After plant simulation, it was observed that the present system output $y_k$ did not depend on the present system input $u_k$ [...] 49 neurons to obtain the direct model of the plant. Fewer neurons cause a similar effect to that mentioned above for $n = 2$, and using more neurons does not improve the control. Figures 3 and 4 show the results for testing data after training for $n = 4$. Obviously, the sampling time must meet the criteria of the Nyquist-Shannon theorem. The considered sampling time was 0.1 seconds.
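Generating open-loop training data of this kind can be sketched as follows. The plant equation used here, theta'' = -10 sin(theta) - friction * theta' + u, is an assumption based on the robot arm form used by Hagan et al. with the friction coefficient set to 1; the function names, the RK4 integrator, and the zero initial state are likewise illustrative choices.

```python
import numpy as np

def simulate_arm(u, dt=0.1, friction=1.0, substeps=10):
    """Sampled angle of the assumed arm model for an input sequence u."""
    def deriv(state, torque):
        theta, omega = state
        return np.array([omega,
                         -10.0 * np.sin(theta) - friction * omega + torque])

    h = dt / substeps
    state = np.zeros(2)              # start at rest, theta = 0
    theta = np.empty(len(u))
    for k, torque in enumerate(u):
        theta[k] = state[0]          # sample the output, then apply u[k]
        for _ in range(substeps):    # classical RK4, input held constant
            k1 = deriv(state, torque)
            k2 = deriv(state + 0.5 * h * k1, torque)
            k3 = deriv(state + 0.5 * h * k2, torque)
            k4 = deriv(state + h * k3, torque)
            state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return theta

def random_pulses(n_pulses, width_s=14.0, dt=0.1, low=-4.0, high=8.1, seed=0):
    """Piecewise-constant input with uniformly distributed amplitudes."""
    rng = np.random.default_rng(seed)
    levels = rng.uniform(low, high, size=n_pulses)
    return np.repeat(levels, int(round(width_s / dt)))

u_train = random_pulses(20)
theta_train = simulate_arm(u_train)
```

Under the assumed model, a constant input u settles where 10 sin(theta) = u, so the amplitude range of the pulses fixes the range of angles covered by the training data.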
Eigenvalues $\lambda_i$ in (22) provide asymptotic stability and suitable dynamic response. We consider that all the eigenvalues are equal; that is, the desired characteristic polynomial is $(z - \lambda)^4$. In order to tune $\lambda$, the control system was tested for different eigenvalues and prefilter poles. Steps of amplitude 0.9 at the reference setpoint were used since $\theta$ values close to 0.9 represent the worst working zone of the system, with high nonlinearity. Obviously, $\theta$ values over 0.9 are even worse, but we refer to the training data range. Eigenvalue $\lambda$ was incremented from 0.2 to 0.9 in steps of 0.1. Figures 5 and 6 show the results. If the prefilter has a wide bandwidth ($p_{\text{prefilter}}$ is low), then lower values of $\lambda$ produce instability. In Figure 5, lower $\lambda$ values yield a considerable effect of the modeled disturbance $\varepsilon$ and the overshoot is high, whereas there are some rebounds between two BMU linear models for higher $\lambda$ values. The lower $\lambda$, the lower the rise time (the wider the bandwidth). The optimum $\lambda$ range is [0.5, 0.6] for $p_{\text{prefilter}} = 0.4$. Thus, when using switched linear models there is a stability region of $\lambda$ within the global stability area of linear systems theory (the unit circle), depending on the switching rate of the NG local linear models [18]. Here the switching rate is determined by the prefilter transfer function. If $p_{\text{prefilter}}$ is increased, then good results are obtained for $\lambda = 0.2$ (see Figure 6).
Once the parameters had been determined, the NG approach was tested to track a constant reference for control of position in Figure 7 and a variable reference such as a sinusoidal signal to check control of velocity in Figure 8. The linear estimation output $\theta_L$ is not the NG estimation $\hat{\theta}$ in (9); it is calculated by means of (20), adding the value of known disturbance $\varepsilon$. The worst control occurs when the reference value is close to 0.9. To illustrate this problem, PI control with fixed parameters was compared to the NG approach. Obviously, PID control would achieve a fast and stable response due to derivative action, but at the cost of an unrealizable control action, and the control system would be affected by noisy signals. The plant is linearized in the neighborhood of $\theta_0$ using the Laplace transform, and the linearized model is denoted with the subscript $|_{\theta_0}$. A suitable PI design regarding settling time and smoothness of response is $\mathrm{PI}(s) = K_p \cdot (s + K_i/K_p)/s = 0.3 \cdot (s + 10)/s$ considering $\theta_0 = 0.5$. Figure 9 shows that the system controlled by this PI with fixed parameters becomes more oscillatory when considering $\theta_0 = 0.9$ because the two complex poles of the plant change their location, which changes the closed-loop poles. The two complex closed-loop poles are more dominant and, therefore, the system oscillates more. In this design the trade-off is between the settling time and the oscillatory component of the response. The tuning consists of changing $K_p$ while keeping the location of the zero constant at −10; that is, $K_i/K_p = 10$. Figure 10 shows the influence of the adjustment of $K_p$ on the control design, with the NG approach displayed for comparison.
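The sensitivity of a fixed PI design to the linearization point can be checked numerically. The sketch below assumes the linearized plant 1 / (s^2 + friction * s + 10 cos(theta0)), which follows from the assumed arm model theta'' = -10 sin(theta) - friction * theta' + u with friction equal to 1; this plant form and both function names are illustrative assumptions, while the PI parameters are those quoted in the text.

```python
import numpy as np

def pi_closed_loop_poles(theta0, kp=0.3, zero=10.0, friction=1.0):
    """Closed-loop poles of the assumed linearised arm under PI control.

    Plant:      1 / (s^2 + friction*s + 10*cos(theta0))   (assumption)
    Controller: kp * (s + zero) / s
    Characteristic equation:
        s*(s^2 + friction*s + 10*cos(theta0)) + kp*(s + zero) = 0
    """
    coeffs = [1.0, friction, 10.0 * np.cos(theta0) + kp, kp * zero]
    return np.roots(coeffs)

def damping_ratio(poles):
    """Smallest damping ratio among the complex poles."""
    cplx = poles[np.abs(poles.imag) > 1e-9]
    return float(np.min(-cplx.real / np.abs(cplx)))

zeta_05 = damping_ratio(pi_closed_loop_poles(0.5))
zeta_09 = damping_ratio(pi_closed_loop_poles(0.9))
```

Under these assumptions the complex pair is less damped at the higher linearization point, which agrees qualitatively with the more oscillatory response described for Figure 9.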
The control strategy described above is valid for feedforward control, that is, assuming the absence of disturbances. Under these conditions, the algorithm promises very good performance. However, disturbance rejection can be achieved by adding an extra state variable $x_I$ such that $\dot{x}_I(t) = e(t)$, where $e$ is the tracking error, and following the steps described from (21) to (23). Figure 11 shows the rejection of a constant output disturbance of amplitude 0.05.
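The idea of adding an integral state can be sketched in discrete time. This is an illustrative version under stated assumptions: the integrator is taken as x_I[k+1] = x_I[k] + e[k], the gain is computed with Ackermann's formula rather than by coefficient matching, and the second-order plant matrices and function names are made up for the example.

```python
import numpy as np

def ackermann(F, G, poles):
    """State-feedback gain K placing eig(F - G K) at the given poles."""
    n = F.shape[0]
    ctrb = np.column_stack([np.linalg.matrix_power(F, i) @ G for i in range(n)])
    coeffs = np.real(np.poly(poles))  # desired characteristic polynomial
    alpha = sum(c * np.linalg.matrix_power(F, n - i)
                for i, c in enumerate(coeffs))
    e_n = np.zeros(n)
    e_n[-1] = 1.0
    return e_n @ np.linalg.solve(ctrb, alpha)

def augment_with_integrator(F, G, H):
    """Append the state x_I[k+1] = x_I[k] + (r - H x[k])."""
    n = F.shape[0]
    Fa = np.zeros((n + 1, n + 1))
    Fa[:n, :n] = F
    Fa[n, :n] = -H
    Fa[n, n] = 1.0
    return Fa, np.append(G, 0.0)

def simulate_rejection(F, G, H, Ka, disturbance, n_steps=200):
    """Final output for zero reference and a constant output disturbance."""
    n = F.shape[0]
    x, x_i, y = np.zeros(n), 0.0, 0.0
    for _ in range(n_steps):
        y = H @ x + disturbance          # measured output
        u = -(Ka[:n] @ x + Ka[n] * x_i)  # feedback on the augmented state
        x_i = x_i + (0.0 - y)            # integrate the tracking error
        x = F @ x + G * u
    return y

F = np.array([[1.5, -0.7], [1.0, 0.0]])  # made-up local model
G = np.array([1.0, 0.0])
H = np.array([0.2, 0.1])
Fa, Ga = augment_with_integrator(F, G, H)
Ka = ackermann(Fa, Ga, [0.5, 0.4, 0.3])
y_final = simulate_rejection(F, G, H, Ka, disturbance=0.05)
```

With a stable augmented loop, the integrator state can only be constant when the tracking error is zero, so a constant output disturbance is removed in steady state.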

Conclusions
In this paper, a supervised version of the neural gas (NG) algorithm is proposed to control nonlinear systems whose dynamic mathematical models are unknown. The identification of the plant is achieved with the NG model. In comparison to other types of neural networks, the formation of the NG model is a robust procedure since there are neither problems of local minima nor overfitting. The training data must be carefully selected in order to model the transient and steady states correctly. The NG algorithm tends to model the steady states quite well. Obviously, the transient must be correctly modeled in order to control the plant. In this respect, the number of delayed samples $n$ and the number of neurons $N$ are key parameters. There must be a sufficient number of neurons, but the control is not improved if the number is too large.
The trained NG network produces a set of piecewise local linear models. Each of these is represented by a neuron.
The global controller is a set of linear controllers which are obtained by state feedback via pole assignment. This control does not affect zeros, but if these are inside the unit circle, then they can be cancelled by the poles of the prefilter.
Eigenvalues $\lambda$ inside the unit circle do not guarantee asymptotic stability because the plant to be controlled is nonlinear. Therefore, stability corresponds to a region inside the unit circle [18]. This set of $\lambda$ values was assigned by trial and error. The worst performance occurs for the highest setpoint values, where the nonlinearities arise. The proposed approach provides a smoother and faster response than the typical PI with fixed parameters.
To conclude, the NG algorithm provides a robust procedure not only for clustering tasks, but also for feedforward nonlinear control using the gradient vectors obtained by the supervised version. These gradient vectors constitute the poles and zeros of the local transfer function of the plant. The computational complexity is linear regarding the number of samples, neurons, and variables because of the efficient implementation in batch procedure.

Figure panel titles: step response with $p_{\text{prefilter}} = 0.7$; step response without prefilter.

Figure 1: Control strategy by state feedback and local linear models.

Figure 2: Plant used to test the proposed control.

[...] $l = 1$ m, the viscous friction coefficient is 1 kg m$^2$/s, and $g = 10$ m/s$^2$. The viscous friction coefficient is important regarding stability and dynamic response. If it is zero, then the system is unstable in open loop and the necessary training data cannot be obtained by open-loop simulation of the plant. In the present work, the system is simulated in open loop to acquire the training data considering a friction coefficient of 1 in order to propose a plant with a more oscillatory response in comparison [...]

Figure legend: controlled output, eigenvalue, control action $U$, linear estimation output $\theta_L$.

Figure 9: PI control at two different linearization points.

Figure 10: NG in comparison to PI.
Figure 11: Step response and output disturbance rejection.