An Output-Recurrent-Neural-Network-Based Iterative Learning Control for Unknown Nonlinear Dynamic Plants

We present a design method for an iterative learning control (ILC) system based on an output recurrent neural network (ORNN). Two ORNNs are employed in the learning control structure. The first, called the output recurrent neural controller (ORNC), serves as the iterative learning controller that achieves the learning control objective. To guarantee convergence of the learning error, some information about the plant sensitivity is required to design a suitable adaptive law for the ORNC. Hence, a second ORNN, called the output recurrent neural identifier (ORNI), is used as an identifier to provide this information. All weights of the ORNC and ORNI are tuned during the control iterations and the identification processes, respectively, in order to achieve the desired learning performance. The adaptive laws for the weights of the ORNC and ORNI, and the analysis of the learning performance, are derived via a Lyapunov-like analysis. It is shown that the identification error converges asymptotically to zero and that the repetitive output tracking error converges asymptotically to zero except for the initial resetting error.


Introduction
Iterative learning control (ILC) has become one of the most effective control strategies for repeated tracking control of nonlinear plants. An ILC system improves control performance through a self-tuning process; traditional PID-type ILC algorithms apply to linear plants or affine nonlinear plants whose nonlinearities satisfy a global Lipschitz continuity condition [1][2][3]. Recently, ILC strategies combined with other control methodologies, such as observer-based iterative learning control [4], adaptive iterative learning control [5], robust iterative learning control [6], and adaptive robust iterative learning control [7], have been widely studied in order to extend ILC to more general classes of nonlinear systems. However, these learning controllers require more and more theoretical restrictions. Among ILC algorithms, PID-type schemes remain attractive to engineers since they are simple and effective for real implementations and industrial applications. A main problem of PID-type ILC algorithms is that the sufficient condition guaranteeing learning stability and convergence depends on the plant's input/output coupling function (matrix). In general, it is hard to design the learning gain if the nonlinear dynamic plant is highly nonlinear and unknown. To obtain the input/output coupling function (matrix), ILC schemes using a neural or fuzzy system to solve the learning-gain implementation problem can be found in [8, 9]. There, a neural network or a fuzzy system approximates the inverse of the plant's input/output coupling function (matrix); this inverse is claimed to be an optimal choice of the learning gain from a convergence-condition point of view. Since the nonlinear system is assumed unknown, offline adaptive mechanisms are applied to update the network parameters so as to approximate the ideal optimal learning gain.
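To make the learning-gain condition concrete, consider the simplest scalar case: if the plant's input/output coupling is a constant gain b (a hypothetical stand-in for the coupling function above), a P-type update u_{j+1}(t) = u_j(t) + gamma*e_j(t+1) scales the tracking error by (1 − gamma*b) each iteration. The following sketch (our illustration, not from the paper) shows why choosing the learning gain gamma requires knowledge of b:

```python
import numpy as np

def ilc_error_sequence(b, gamma, e0=1.0, n_iters=10):
    """For a plant whose input/output coupling is a scalar gain b
    (y_j(t+1) = b * u_j(t)), the P-type update u_{j+1} = u_j + gamma * e_j
    contracts the error by the factor (1 - gamma*b) per iteration, so
    convergence requires |1 - gamma*b| < 1, a condition that cannot be
    checked without knowing b."""
    errs = [e0]
    for _ in range(n_iters):
        errs.append((1.0 - gamma * b) * errs[-1])
    return errs
```

With b = 0.5, the gain gamma = 1.0 gives the contraction factor 0.5 (convergent), while gamma = 5.0 gives the factor −1.5 (divergent), illustrating the sensitivity of PID-type ILC to the unknown coupling gain.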
Actually, for the control of unknown nonlinear systems, neural-network-based controllers have become an important strategy over the past two decades. Multilayer neural networks, recurrent neural networks, and dynamic neural networks [10][11][12][13][14][15][16] have been used for the design of adaptive controllers. On the other hand, fuzzy logic systems, fuzzy neural networks, recurrent fuzzy neural networks, and dynamic fuzzy neural networks have also been popular tools for the design of adaptive controllers [17][18][19][20][21][22]. These concepts have likewise been applied to the design of adaptive iterative learning control of nonlinear plants [23][24][25]. However, few ILC works have been developed for general unknown nonlinear plants, especially nonaffine nonlinear plants. To the best of the authors' knowledge, a real-time recurrent network (RTRN) was developed in [26] for real-time learning control of general unknown nonlinear plants. Unfortunately, its learning algorithm depends on the generalized inverse of the weight matrix of the RTRN; if this generalized inverse does not exist, the learning control scheme is not implementable.
In this paper, we consider the design of an iterative learning controller for a class of unknown nonlinear dynamic plants. Motivated by our previous work in [27], an improved version of an identifier-based iterative learning controller is proposed by using an output recurrent neural network (ORNN). Two ORNNs are used to design an ORNN-based iterative learning control system. The proposed ORNN-based ILC system includes an ORNN controller (ORNC) and an ORNN identifier (ORNI). The ORNC is used as an iterative learning controller to achieve the repetitive tracking control objective. The weights of the ORNC are tuned via adaptive laws determined by a Lyapunov-like analysis. In order to realize the adaptive laws and guarantee the convergence of the learning error, some information about the unknown plant sensitivity is required for the design of these adaptive laws. Hence, the ORNI is applied as an identifier to provide the required plant sensitivity information. In a similar way, the weights of the ORNI are tuned via adaptive laws determined by a Lyapunov-like analysis. Both the ORNC and the ORNI update their network weights along the control iterations and identification processes, respectively. This ORNN-based ILC system can execute a repetitive control task for a general nonlinear plant. It is shown that the identification error converges asymptotically to zero and that the repetitive output tracking error converges asymptotically to zero except for the initial resetting error.
This paper is organized as follows. The structure of the ORNN is introduced in Section 2. In Section 3, we present the design of the ORNC and ORNI for the ORNN-based ILC system; the adaptation laws are derived and the learning performance is guaranteed based on a Lyapunov-like analysis. To illustrate the effectiveness of the proposed ILC system, a numerical example is simulated in Section 4. Finally, conclusions are given in Section 5.
In the subsequent discussions, the following notations will be used in all the sections.
(i) |z| denotes the absolute value of a function z.
Figure 1: Structure of the ORNN.

The Output Recurrent Neural Network
In this paper, two ORNNs are used to design an iterative learning control system. The structure of the ORNN is shown in Figure 1; it comprises an input layer, a hidden layer, and an output layer.
(i) Layer 1 (Input Layer): Each node in this layer represents an input variable and simply transmits its input value to the next layer. For the ith input node, i = 1, . . ., n + 1, net_i^(1)(t) = x_i(t) and O_i^(1)(t) = net_i^(1)(t), where x_i(t), i = 1, . . ., n, represents the ith external input signal to the ith node of layer 1, and the (n+1)th input is the delayed network output, x_{n+1}(t) = D[O^(3)(t)], where D[O^(3)] denotes the delay of the ORNN output O^(3).
(ii) Layer 2 (Hidden Layer): Each node in this layer applies an activation function to the inputs coming from the input layer. A sigmoid function is adopted, so that the ℓth hidden node, ℓ = 1, . . ., M, computes net_ℓ^(2)(t) = Σ_{i=1}^{n+1} V_{iℓ} x_i^(2)(t) and O_ℓ^(2)(t) = f(net_ℓ^(2)(t)) = 1/(1 + e^{−net_ℓ^(2)(t)}), where x_i^(2) = O_i^(1), V_{iℓ} is the connective weight between the input layer and the hidden layer, and M is the number of neurons in the hidden layer.
(iii) Layer 3 (Output Layer): The node in this layer computes the overall output as the summation of all input signals from the hidden layer: O^(3)(t) = Σ_{ℓ=1}^{M} w_ℓ x_ℓ^(3)(t), where x_ℓ^(3) = O_ℓ^(2) and w_ℓ is the connective weight between the hidden layer and the output layer.
Let n denote the dimension of the input vector X = [x_1, . . ., x_n]^T ∈ R^{n×1} of the nonlinear function f(X), and let M denote the number of neurons in the hidden layer. The ORNN, which acts as an approximator of the nonlinear function f(X), can be described in matrix form as O^(3)(t) = W^T Γ(V^T X_a(t)), where W ∈ R^{M×1} and V ∈ R^{(n+1)×M} are the output-hidden weight matrix and hidden-input weight matrix, respectively, X ∈ R^{n×1} is the external input vector, X_a ≡ [X^T, D[O^(3)]]^T ∈ R^{(n+1)×1} is the augmented neural input vector, and D[O^(3)] denotes the delay of the ORNN output O^(3). The activation function vector Γ(·) collects the sigmoid outputs of the hidden layer, with f(net_ℓ^(2)) being its ℓth element.
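The three-layer structure above can be summarized in a short sketch. The following Python class is an illustrative implementation, not part of the paper; the weight initialization and class interface are our own assumptions. It computes O^(3)(t) = W^T Γ(V^T X_a(t)) with the delayed output fed back as the (n+1)th input:

```python
import numpy as np

class ORNN:
    """Sketch of the three-layer ORNN: n external inputs plus the delayed
    output D[O^(3)] feed a sigmoid hidden layer of M neurons, whose weighted
    sum gives the scalar output. Weight shapes follow the text:
    V in R^{(n+1) x M}, W in R^{M x 1}."""

    def __init__(self, n, M, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.V = rng.normal(scale=0.1, size=(n + 1, M))  # hidden-input weights
        self.W = rng.normal(scale=0.1, size=(M, 1))      # output-hidden weights
        self.prev_out = 0.0                              # D[O^(3)]: delayed output

    def forward(self, x):
        x_a = np.append(x, self.prev_out)    # augmented input X_a
        net2 = x_a @ self.V                  # layer-2 net inputs
        O2 = 1.0 / (1.0 + np.exp(-net2))     # sigmoid activations Gamma(.)
        out = float(O2 @ self.W)             # layer-3 summation
        self.prev_out = out                  # store for the next time step
        return out
```

Because the previous output is stored and re-fed, two calls with the same external input generally produce different outputs; this recurrence is what distinguishes the ORNN from a static feedforward approximator.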

Design of Output-Recurrent-Neural-Network-Based Iterative Learning Control System
In this paper, we consider an unknown nonlinear dynamic plant which can perform a given task repeatedly over a finite time sequence t ∈ {0, . . ., N}:

y_j(t + 1) = f(y_j(t), . . ., y_j(t − n + 1), u_j(t)), (5)

where j ∈ Z^+ denotes the index of the control iteration and t ∈ {0, . . ., N} denotes the time index. The signals y_j(t) ∈ R and u_j(t) ∈ R are the system output and input, respectively, f : R^{n+1} → R is an unknown continuous function, and n represents the output delay order. Given a specified desired trajectory y_d(t), t ∈ {0, . . ., N}, the control objective is to design an output-recurrent-neural-network-based iterative learning control system such that, when the control iteration number j is large enough, |y_d(t) − y_j(t)| converges to a small positive error tolerance bound for all t ∈ {0, . . ., N} even if there exists an initial resetting error. Here the initial resetting error means that y_d(0) ≠ y_j(0) for all j ≥ 1. To achieve the control objective, the iterative learning control system based on the ORNN design shown in Figure 2 is proposed. In this figure, D denotes a delay in the time domain and M denotes a memory in the control iteration domain.
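The repetitive structure of the control task, with a time-domain loop inside an iteration-domain loop and the previous trial held in memory, can be sketched as follows. This is an illustrative skeleton under our own interface assumptions; plant_step and ilc_update are hypothetical callables standing in for the plant dynamics and the learning update:

```python
import numpy as np

def run_ilc(plant_step, ilc_update, yd, n_iters, N):
    """Skeleton of the repetitive control task over the finite horizon
    t = 0..N-1: each trial replays the same reference yd, and the stored
    input and error of the previous trial (the iteration-domain memory M
    in Figure 2) are used to compute the next trial's input."""
    u = np.zeros(N)
    e = np.zeros(N)
    history = []
    for j in range(n_iters):
        y = 0.0                          # initial resetting at each trial
        for t in range(N):
            y = plant_step(y, u[t])      # y_j(t+1) = f(y_j(t), u_j(t))
            e[t] = yd[t] - y             # error at t+1, aligned with u_j(t)
        history.append(float(np.max(np.abs(e))))
        u = ilc_update(u, e)             # learn across iterations
    return history
```

For example, with the trivial plant y(t+1) = 0.5 u(t) and the P-type update u ← u + e, the maximum error is halved every trial, so the recorded history decays geometrically.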

Before we state the design steps of the proposed control structure, some assumptions on the unknown nonlinear system and the desired trajectories are given as follows.
(A1) The nonlinear dynamic plant is a relaxed system whose input u_j(t) and output y_j(t) are related by (5).

The design of the ORNN-based iterative learning control system is divided into two parts.

Part 1: Design of ORNC and Corresponding Adaptive Laws
Based on the assumptions on the nonlinear plant (5), we define a tracking error at the jth control iteration as e_c^j(t) = y_d(t) − y_j(t). It is noted that there exist bounded constants ε_c^j, j ∈ Z^+, such that the initial value of e_c^j(t) satisfies |e_c^j(0)| ≤ ε_c^j. The difference of e_c^j(t + 1) between two successive iterations can be computed as [28]

Δe_c^j(t + 1) = e_c^{j+1}(t + 1) − e_c^j(t + 1) = −y_u^j(t)[u_{j+1}(t) − u_j(t)], (8)

where y_u^j(t) denotes the plant sensitivity. The ORNN is used to design an ORNC in order to achieve the iterative learning control objective. Let n_c be the dimension of the external input vector X_c^j(t) = [r(t), y_j(t), u_{j−1}(t)]^T ∈ R^{n_c×1} and M_c the number of neurons in the hidden layer of the ORNC. The ORNC, which performs as an iterative learning controller, is described in matrix form as

u_j(t) = W_c^j(t)^T Γ(V_c^j(t)^T X_{ca}^j(t)), (9)

where W_c^j(t) ∈ R^{M_c×1} and V_c^j(t) ∈ R^{(n_c+1)×M_c} are the output-hidden weight matrix and hidden-input weight matrix to be tuned via suitable adaptive laws, respectively, and X_{ca}^j(t) ∈ R^{(n_c+1)×1} is the augmented neural input vector. For the sake of convenience, we define O_c^{(2),j}(t) ≡ Γ(V_c^j(t)^T X_{ca}^j(t)). Substituting (9) into (8) yields (10). For simplicity, we define ΔX_{ca}^j(t) ≡ X_{ca}^{j+1}(t) − X_{ca}^j(t). After adding and subtracting W_c^j(t)^T O_c^{(2),j+1}(t) in (10), we obtain (11). Applying the mean-value theorem to the second term on the right-hand side of (11) yields (12). Substituting (12) into (11) then gives (13). The adaptation algorithms for the weights W_c^{j+1}(t) and V_c^{j+1}(t) of the ORNC at the next, (j+1)th, control iteration, chosen to guarantee error convergence, are given in (14) and (15), where y_u,max(t) is defined in assumption (A2). Substituting the adaptation laws (14) and (15) into (13) yields (16).

Theorem 1. Consider the nonlinear plant (5) satisfying assumptions (A1)-(A3). The proposed ORNC (9) with adaptation laws (14) and (15) ensures asymptotic convergence of the tracking error as the control iteration number approaches infinity.
Proof. Let us choose a discrete-type Lyapunov function E_c^j(t + 1); the change of the Lyapunov function between successive iterations then follows. Taking norms on (16) yields the contraction bound for iteration j ≥ 1. This further implies that E_c^j(t + 1) > 0 and ΔE_c^j(t + 1) < 0 for all t ∈ {0, . . ., N} and j ≥ 1. Using the Lyapunov stability conditions E_c^j(t + 1) > 0, ΔE_c^j(t + 1) < 0 together with (7), the tracking error satisfies lim_{j→∞} |e_c^j(t)| = 0 for all t ∈ {1, . . ., N}. This proves Theorem 1.
Remark 2. If the plant sensitivity y_u^j(t) is completely known, so that sgn(y_u^j(t)) and y_u,max(t) are available, then the control objective can be achieved using the adaptation algorithms (14) and (15). However, the plant sensitivity y_u^j(t) is in general unknown or only partially known. In Part 2, we design an ORNN-based identifier (ORNI) to estimate the unknown plant sensitivity y_u^j(t) and provide its sign function and upper bounding function to the adaptation algorithms of the ORNC.
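Since the display equations (14)-(15) are not reproduced above, the following is only a generic gradient-style sketch of how an ORNC weight update could use the two quantities Remark 2 names, sgn(y_u^j(t)) and y_u,max(t); the exact adaptation laws of the paper may differ, and the step-size form here is our own assumption:

```python
import numpy as np

def ornc_weight_update(W, V, x_a, e_next, sign_yu, yu_max, eta=0.5):
    """Hypothetical ORNC adaptation step. The learning step is steered by the
    sign of the plant sensitivity and normalized by its upper bound yu_max,
    which is the role these two quantities play in the text. Shapes:
    W in R^{M x 1}, V in R^{(n_c+1) x M}, x_a the augmented input."""
    net2 = x_a @ V
    O2 = 1.0 / (1.0 + np.exp(-net2))           # sigmoid hidden outputs
    dO2 = O2 * (1.0 - O2)                      # sigmoid derivative
    step = eta * sign_yu * e_next / yu_max     # normalized learning step
    W_new = W + step * O2[:, None]             # output-layer weight update
    V_new = V + step * np.outer(x_a, W.ravel() * dO2)  # hidden-layer update
    return W_new, V_new
```

Note that a zero tracking error leaves the weights unchanged, as any sensible adaptation law should.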

Part 2: Design of ORNI and Corresponding Adaptive Laws
After each control iteration, the ORNI subsequently performs an identification process. The trained ORNI then provides the approximated plant sensitivity to the ORNC to start the next control iteration. We would like to emphasize that the ORNI identifies the nonlinear plant only after each control iteration. This concept is quite different from traditional identification-based control tasks [29] and is very important to the proposed ORNN-based ILC structure.
The ORNN structure is further applied to design an ORNI that identifies the nonlinear plant after the jth control iteration. The identification process is stated as follows. After each trial of controlling the nonlinear system, we collect the input/output data u_j(t) and y_j(t), t = 0, 1, . . ., N + 1, as the training data for the identifier. When discussing the identification, we omit the control iteration index j and introduce a new identification iteration index k ∈ Z^+ to represent the number of identification processes. That is, the notations for the training data u_j(t), y_j(t) and the ORNI output ŷ_{j,k}(t) are simplified as u(t), y(t), and ŷ_k(t), respectively. For the ORNI, let n_I be the dimension of the external input vector X_I(t) = [u(t), y(t)]^T ∈ R^{n_I×1} and M_I the number of neurons in the hidden layer of the ORNI. The ORNI, which performs as an iterative learning identifier for the nonlinear plant (5), is described in matrix form as

ŷ_k(t + 1) = W_I^k(t)^T Γ(V_I^k(t)^T X_{Ia}^k(t)), (21)

where W_I^k(t) ∈ R^{M_I×1} and V_I^k(t) ∈ R^{(n_I+1)×M_I} are the output-hidden weight matrix and hidden-input weight matrix to be tuned via suitable adaptive laws, respectively, and X_{Ia}^k(t) ∈ R^{(n_I+1)×1} is the augmented neural input vector. For the sake of convenience, we define O_I^{(2),k}(t) ≡ Γ(V_I^k(t)^T X_{Ia}^k(t)). Based on the assumptions on the nonlinear plant (5), we define an identification error at the kth identification process as e_I^k(t) = y(t) − ŷ_k(t). The difference of e_I^k(t) between two successive identification processes can be computed as (23). Substituting (21) into (23) yields (24). After adding and subtracting W_I^k(t)^T O_I^{(2),k+1}(t) in (24), we obtain (25). Applying the mean-value theorem to the second term on the right-hand side of (25), we can derive (26), where O_{I,ℓ}^{(2)'} = df(Z_{I,ℓ}(t))/dZ_{I,ℓ}(t) and Z_{I,ℓ}(t) takes a value between V_{I,ℓ}^{k+1}(t)^T X_{Ia}^{k+1}(t) and V_{I,ℓ}^k(t)^T X_{Ia}^k(t), ℓ = 1, . . ., M_I. Substituting (26) into (25) then gives (27). The adaptation algorithms for the weights W_I^{k+1}(t) and V_I^{k+1}(t) of the ORNI at the next, (k+1)th, identification process are given in (28). Substituting the adaptation laws (28) into (27) yields (29).

Theorem 3. Consider the nonlinear dynamic plant (5) satisfying assumptions (A1)-(A3). The proposed ORNI (21) with adaptation laws (28) ensures asymptotic convergence of the identification error as the number of identification processes approaches infinity.
Proof. Let us choose a discrete-type Lyapunov function E_I^k(t + 1); the change of the Lyapunov function between successive identification processes then follows. Taking norms on (29), we obtain the contraction bound for iteration k ≥ 1. This implies that E_I^k(t + 1) > 0 and ΔE_I^k(t + 1) < 0 for all t ∈ {0, . . ., N} and k ≥ 1, and hence the identification error satisfies lim_{k→∞} |e_I^k(t)| = 0 for all t ∈ {0, 1, . . ., N}. This proves Theorem 3.
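The off-line identification phase described above, replaying one trial's recorded data for k = 1, 2, . . . passes, can be sketched as follows. This is an illustrative stand-in trained by plain gradient descent rather than the paper's adaptation laws (28); the network size, step size, and initialization are our own choices:

```python
import numpy as np

def identify_trial(u_rec, y_rec, M=4, n_epochs=100, eta=0.1, seed=0):
    """Replay the recorded trial data (u(t), y(t)) -> y(t+1) for n_epochs
    identification passes, training a small sigmoid network that stands in
    for the ORNI. The delayed network output yhat_prev is fed back as an
    extra input, mimicking the recurrent input D[O^(3)] of the ORNI."""
    rng = np.random.default_rng(seed)
    V = rng.normal(scale=0.3, size=(3, M))   # inputs: u(t), y(t), yhat_prev
    W = rng.normal(scale=0.3, size=M)
    errs = []
    for k in range(n_epochs):
        yhat_prev, sq = 0.0, 0.0
        for t in range(len(u_rec) - 1):
            x = np.array([u_rec[t], y_rec[t], yhat_prev])
            O2 = 1.0 / (1.0 + np.exp(-(x @ V)))
            yhat = O2 @ W
            e = y_rec[t + 1] - yhat          # identification error e_I^k
            grad_W = e * O2                  # gradient w.r.t. output weights
            grad_V = e * np.outer(x, W * O2 * (1.0 - O2))
            W = W + eta * grad_W
            V = V + eta * grad_V
            yhat_prev, sq = yhat, sq + e * e
        errs.append(np.sqrt(sq / (len(u_rec) - 1)))
    return errs, (W, V)
```

The gradient through the recurrent feedback is truncated here (yhat_prev is treated as a constant input), a common simplification for a sketch of this kind.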
Remark 4. The ORNN is a promising tool for identification because it can approximate any well-behaved nonlinear function to any desired accuracy. This function approximation capability is applied here to estimate the unknown plant sensitivity. The plant sensitivity y_u^j(t) in (8) can be approximated by ŷ_u^j(t) ≡ ∂ŷ_j(t + 1)/∂u_j(t). Note that the index k of the identifier output ŷ_k(t) is dropped once the identification process stops. Applying the chain rule to (21) yields (34); also from (21), we have (35). Since the inputs to the ORNI are u_j(t), y_j(t), and D[O_I^{(3),j}(t)], the derivative of net_{I,ℓ}^{(2),j}(t) with respect to u_j(t) follows as in (36)-(37). From (34), (35), and (37), we obtain

ŷ_u^j(t) = Σ_{ℓ=1}^{M_I} W_{I,ℓ}^j(t) f'(net_{I,ℓ}^{(2),j}(t)) V_{I,1ℓ}^j(t), (38)

where 0 < f'(net_{I,ℓ}^{(2),j}(t)) ≤ 0.25 is the derivative of the sigmoid activation function. The sign function and upper bounding function of the plant sensitivity after finishing the identification process at the jth control iteration can then be obtained as in (40). It is noted that we do not need the exact plant sensitivity y_u^j(t) for the design of the adaptive law (14). Even though a certain approximation error may exist between y_u^j(t) and ŷ_u^j(t), the convergence of the learning error is still guaranteed, since only an upper bounding function is required. Also note that the value of sgn(ŷ_u^j(t)) (+1 or −1) can easily be determined from the identification result.
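A chain-rule sensitivity estimate of this form can be sketched directly. Assuming the identifier output is ŷ = W^T σ(V^T x) with input x = [u(t), y(t), D[O^(3)]] and the control input as the first entry of x, the derivative with respect to u is:

```python
import numpy as np

def plant_sensitivity(W, V, x):
    """Chain-rule estimate of the plant sensitivity from a trained identifier
    yhat = W^T sigma(V^T x), where the first entry of x is the control input:
        d yhat / d u = sum_l W_l * sigma'(net_l) * V[0, l],
    with sigma'(net) = sigma(net) * (1 - sigma(net)), bounded by 0.25."""
    net2 = x @ V
    O2 = 1.0 / (1.0 + np.exp(-net2))
    return float(np.sum(W * O2 * (1.0 - O2) * V[0, :]))
```

Because the expression is an exact derivative of the network output, it can be validated against a finite-difference quotient, and its sign and magnitude then serve as sgn(ŷ_u) and a basis for the upper bounding function.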

Simulation Example
In this section, we use the proposed ORNN-based ILC to iteratively control an unknown non-BIBO nonlinear dynamic plant [26, 29]. The difference equation of the nonlinear dynamic plant is given as

y_j(t + 1) = 0.2[y_j(t)]^2 + 0.2 y_j(t − 1) + 0.4 sin(0.5[y_j(t − 1) + y_j(t)]) cos(0.5[y_j(t − 1) + y_j(t)]) + 1.2 u_j(t),

where y_j(t) is the system output and u_j(t) is the control input. The reference model, driven by the bounded reference input r(t) = sin(2πt/25) + sin(2πt/10), generates the desired trajectory y_d(t). The control objective is to force y_j(t) to track y_d(t) as closely as possible over the finite time interval t ∈ {1, . . ., 200}, except at the initial point. The network weight adaptation for the ORNC and the ORNI is designed according to (14)-(15) and (28), respectively.
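One step of the simulation plant can be sketched as below; the nonlinear terms follow the difference equation above, while the gain b = 1.2 on the additive control input is our assumption for illustration:

```python
import numpy as np

def plant_step(y_t, y_tm1, u_t, b=1.2):
    """One step of the simulation plant:
    y(t+1) = 0.2 y(t)^2 + 0.2 y(t-1)
             + 0.4 sin(0.5 (y(t-1)+y(t))) cos(0.5 (y(t-1)+y(t))) + b u(t).
    The quadratic term 0.2 y(t)^2 makes the plant non-BIBO: large states can
    grow without bound, which is why bounded learning inputs matter."""
    s = 0.5 * (y_tm1 + y_t)
    return 0.2 * y_t**2 + 0.2 * y_tm1 + 0.4 * np.sin(s) * np.cos(s) + b * u_t
```

For small constant inputs the state settles near a small fixed point, whereas large states trigger the quadratic growth, consistent with the non-BIBO label in the text.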
In the ORNC, we set W_c^j(t) ∈ R^{2×1} and V_c^j(t) ∈ R^{4×2}; that is, only two hidden nodes in layer 2 are used to construct the ORNC. In a similar way, we let W_I^k(t) ∈ R^{2×1} and V_I^k(t) ∈ R^{3×2}, so that only two hidden nodes in layer 2 are used to set up the ORNI. For simplicity, all initial ORNC parameters are set to 0 at the first control iteration. In addition, the initial ORNI parameters are set to 0 at the first identification process, which begins after the first control iteration. We assume that the plant initial condition satisfies y_j(0) = 2 + randn, where randn is a generator of normally distributed random numbers with mean 0 and variance 1. To study the learning performance, we first show the maximum tracking error over t ∈ {1, . . ., 200}, max_t |e_c^j(t)|, with respect to the control iteration j in Figure 3(a). It is noted that |e_c^j(0)| is omitted in calculating the maximum tracking error since it is not controllable. The identification error at the 10th control iteration, |e_I^{10,k}(t)|, with respect to the identification process k is shown in Figure 3(b). According to the simulation results, the asymptotic convergence proved in Theorems 1 and 3 is clearly achieved. Since a reasonable tracking performance is already observed at the 10th control iteration, the desired output y_d(t) and the plant output y_10(t) at the 10th control iteration are shown in Figure 3(c) to demonstrate the control performance. Figure 3(d) compares the identification result ŷ_10(t) with the plant output y_10(t). The accurate identification enables the ORNI to provide the required information for the design of the ORNC. Finally, the bounded control input u_10(t) is plotted in Figure 3(e).

Conclusion
For controlling a repeatable nonaffine nonlinear dynamic plant, we have proposed an output-recurrent-neural-network-based iterative learning control system in this paper. The control structure consists of an ORNC used as an iterative learning controller and an ORNI used as an identifier.
The ORNC is the main controller used to achieve the repetitive control task. The ORNI is an auxiliary component that provides useful plant sensitivity information for the design of the ORNC's adaptive laws. All network weights of the ORNC and ORNI are tuned during the control iterations and identification processes, so no prior plant knowledge is required. The adaptive laws for the weights of the ORNC and ORNI, and the analysis of the learning performance, are derived via a Lyapunov-like analysis. We show that if the ORNI provides the knowledge of plant sensitivity to the ORNC, then the output tracking error converges asymptotically to zero except for the initial resetting error. We also show that the identification objective is achieved by the ORNI when the number of identification processes is large enough.

Figure 2: Block diagram of the ORNN-based ILC system.