NEURAL NETWORKS WITH MEMORY

ABSTRACT. This paper is divided into four parts. Part 1 contains a survey of three neural networks found in the literature which motivate this work. In Part 2 we model a neural network with a very general integral form of memory, prove a boundedness result, and obtain a first result on asymptotic stability of equilibrium points. The system is very general and we do not solve the stability problem. In the third section we show that the neural networks are very robust. The fourth section concerns simplification of the systems from the second part. Several asymptotic stability results are obtained for the simplified systems.


T.A. BURTON

The time lags of interest arise from the input capacitance C of the cell membranes, the transmembrane resistance R, and the finite impedance T_ij^{-1} between the output V_j and the cell body of cell i. These time lags are ignored by Hopfield, but are of fundamental interest to us in this work.
This system is

C_i(du_i/dt) = Σ_j T_ij V_j − u_i/R_i + I_i,  u_i = g_i^{-1}(V_i),  (1)

where all of these functions are evaluated at time t. Here, the input-output relation V_i = g_i(u_i) is a sigmoidal function with g_i(0) = 0, g_i'(u_i) > 0, and |g_i(u_i)| < 1. (We have written (d/du_i)(g_i(u_i)) = g_i'(u_i), and we will sometimes denote u_i' = du_i/dt; the context should clearly indicate the meaning.) The quantity T_ij g_j(u_j) represents the electrical current input to cell i due to the present potential of cell j, and T_ij is the synapse efficacy. He assumes linear summing. Both + and − signs can occur in T_ij. The constant I_i represents any other fixed input to neuron i; Hopfield, as well as other investigators, frequently takes I_i to be zero.
Unless a neuron is self-connected, T_ii = 0; Marcus and Westervelt [12], as well as others, treat systems with self-connections. If there are no self-connections, then it is impossible for the matrix (T_ij) to be positive or negative definite, a condition we later require. But we later point out that, mathematically, we may always regard a neuron as being self-connected owing to its own current potential. Hopfield [6; p. 3089] regards definiteness of (T_ij) as pathological.
If neuron i is connected to neuron j, then neuron j is connected to neuron i, and it is frequently assumed, particularly by Hopfield, that T_ij = T_ji; but this is not always done, and Hopfield discusses experiments in which that condition does not hold. There may even be skew-symmetric matrices [6; p. 3090].
An electrical simulation model is also given by Hopfield [6; p. 3089].
The derivation of (1) by Hopfield is clean, soundly motivated, and highly interesting, but perhaps his most interesting contribution is the construction of a Lyapunov function

E = −(1/2) Σ_i Σ_j T_ij V_i V_j + Σ_i (1/R_i) ∫_0^{V_i} g_i^{-1}(v) dv − Σ_i I_i V_i  (2)

whose derivative along a solution of (1) satisfies (for T_ij = T_ji)

dE/dt = −Σ_i C_i g_i'(u_i)(u_i')²  (3)

(here again g_i' = dg_i/du_i, while u_i' = du_i/dt), a useful and remarkable relation. At this point Hopfield [6; p. 3090] states (but does not prove) that E is bounded along a solution, so every solution converges to a point at which du_i/dt = 0 for every i. It is to be noted that E is not radially unbounded, so an independent proof of boundedness of solutions must be given. Such a proof is simple and will be supplied later. Now E is a very interesting Lyapunov function. A later calculation shows that ∂E/∂u_i = −C_i u_i' g_i'(u_i), so that (1) is actually

C_i u_i' = −(∂E/∂u_i)/g_i'(u_i),  (1)*

and that brings us to the second model of interest. Equation (1)* is almost a gradient system. This can also be inferred from the work of Cohen and Grossberg [2], which deals with a more general system than (1) and which they note [2; p. 818] is related to a gradient system.
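The monotonicity of E along solutions of (1) can be checked numerically. The following is a minimal sketch with hypothetical parameters (n = 2, g_i = tanh, R_i = C_i = 1, I_i = 0, symmetric T); for g = tanh the inner integral of E has the closed form ∫_0^V arctanh(v) dv = V arctanh(V) + (1/2) ln(1 − V²).

```python
import numpy as np

# Hypothetical two-neuron instance of (1) with g_i = tanh, R_i = C_i = 1,
# I_i = 0, and symmetric T (T_ij = T_ji), integrated by Euler's method.
T = np.array([[0.0, 2.0], [2.0, 0.0]])
R, C = 1.0, 1.0
I = np.zeros(2)

def g(u):
    return np.tanh(u)

def energy(u):
    # Hopfield's E of (2); for g = tanh the inner integral has the
    # closed form  int_0^V arctanh(v) dv = V*arctanh(V) + 0.5*ln(1-V^2).
    V = g(u)
    return (-0.5 * V @ T @ V
            + np.sum(V * np.arctanh(V) + 0.5 * np.log(1.0 - V ** 2)) / R
            - I @ V)

dt, steps = 0.002, 10_000                 # integrate to t = 20
u = np.array([1.0, -0.5])
E_vals = [energy(u)]
for _ in range(steps):
    u = u + dt * (T @ g(u) - u / R + I) / C
    E_vals.append(energy(u))
E_vals = np.array(E_vals)

print(u)                                  # an equilibrium of (1)
print(np.all(np.diff(E_vals) <= 1e-8))    # E non-increasing along the path
```

With symmetric T the computed E decreases at every Euler step (up to discretization tolerance) and the trajectory settles where du/dt = 0, as Hopfield asserts.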
Han, Sayeh, and Zhang [4] go a step further (as do many other investigators) and design a neural network based on an exact gradient system

X' = −grad V(X),  ' = d/dt,  (4)

where V(X) is a given Lyapunov function having minima at certain fixed points, called stored vectors, and ∂V/∂x_i is continuous. Then the derivative of V along (4) is dV/dt = −grad V(X) · grad V(X) = −‖grad V(X)‖², where ‖·‖ is the Euclidean length.
They also consider a concrete example of (4) of the form x_i' = −∂V(x)/∂x_i, i = 1, …, N, where x_i is the ith component of X, a_i^s is the ith component of the sth stored vector appearing in V, and M is the number of stored vectors. It is claimed that all solutions converge to the minima of V.
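As a sketch of such a design, take a hypothetical V with minima near two stored vectors; this V is an illustrative stand-in, not the specific function of [4]:

```python
import numpy as np

# Hypothetical Lyapunov function with minima near two stored vectors
# a^1 = (1, 1), a^2 = (-1, -1). This V is an illustrative stand-in,
# not the specific V of [4]:
#     V(X) = -sum_s exp(-||X - a^s||^2),    X' = -grad V(X).
stored = np.array([[1.0, 1.0], [-1.0, -1.0]])

def grad_V(X):
    diffs = X - stored                       # one row per stored vector
    w = np.exp(-np.sum(diffs ** 2, axis=1))  # Gaussian weights
    return 2.0 * (w[:, None] * diffs).sum(axis=0)

X = np.array([0.8, 1.2])                     # start near a^1
dt = 0.05
for _ in range(2000):                        # Euler descent X' = -grad V
    X = X - dt * grad_V(X)

print(X)   # settles at the minimum near a^1 = (1, 1)
```

Starting near a^1, the descent settles at the nearby minimum; a start near a^2 would be captured by the other basin.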
Questions of delay are not considered, but (4) is very well adapted to delays, as we will see.

The third model on which we wish to focus is that of Marcus and Westervelt ([11], [12]), who start with a streamlined version of (1) into which they introduce a delay and write

u_i'(t) = −u_i(t) + Σ_{j=1}^N T_ij f(u_j(t − τ)),  (8)

where f is sigmoidal with a maximum slope of β at zero and τ > 0 is a positive constant. The authors give a linear stability analysis of (8) both for τ > 0 and τ = 0, concluding that there are sustained oscillations in some cases. A nonlinear stability analysis is also given which yields a critical value of τ at which oscillations cease.
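A scalar caricature of (8) can be simulated with a constant initial history; the gain −4 and delay τ = 2 below are hypothetical choices meant only to exhibit the delay-induced oscillations the authors analyze:

```python
import numpy as np

# Scalar caricature of (8): u'(t) = -u(t) + T*tanh(u(t - tau)), with a
# hypothetical gain T = -4 (strong negative feedback). The equilibrium
# u = 0 is stable without delay but oscillates once tau passes critical.
def simulate(T_gain, tau, t_end, dt=0.01, u0=1.0):
    d = int(round(tau / dt))            # delay measured in steps
    steps = int(round(t_end / dt))
    u = np.empty(steps + d + 1)
    u[:d + 1] = u0                      # constant initial history
    for k in range(d, d + steps):
        u[k + 1] = u[k] + dt * (-u[k] + T_gain * np.tanh(u[k - d]))
    return u[d:]                        # samples on [0, t_end]

no_delay = simulate(-4.0, 0.0, 20.0)
delayed = simulate(-4.0, 2.0, 100.0)
tail = delayed[len(delayed) // 2:]      # discard the transient

print(no_delay[-1])                     # decays to 0 when tau = 0
print(tail.min(), tail.max())           # sustained oscillation when tau = 2
```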
But in actual neural networks of both biological and electrical type, the response tends to be based on an accumulation of charges (Hopfield's "short term average"), say through a capacitor, and the result is a delay term in the form of an integral, not a pointwise delay.
Indeed, if a Stieltjes integral is used, then the integral can represent various pointwise delays, as is noted by Langenhop [9], for example.Our work here concentrates on integral delays.
Remarks on literature. We have focused on the two papers of Hopfield, the paper by Han, Sayeh, and Zhang [4], and the papers of Marcus and Westervelt ([11], [12]) because they provide central motivation for this work. But the literature concerning the Hopfield model is enormous. Miller [13] has written a two-volume loose-leaf survey work on neural networks with an exhaustive bibliography and survey of investigators.

2. BOUNDEDNESS, STABILITY, AND DELAYS
Let us return now to the derivation of (1). Elementary circuit theory states that when I(t) is the current, the charge on the capacitor is given by ∫_0^t I(s) ds. This process, with the capacitor discharging when the charge reaches a certain level, is discussed in [3; p. 169].
For the neural network (1) the effect of the capacitor can be modeled by replacing T_ij g_j(u_j) by

∫_{−∞}^t a_ij(t − s) g_j(u_j(s)) ds

(which can also be written as ∫_0^∞ a_ij(s) g_j(u_j(t − s)) ds), where a natural form for a_ij(t) would be

a_ij(t) = 3T_ij(t − h)²/(h³C_i) for 0 ≤ t ≤ h,  a_ij(t) = 0 for t > h,

so that ∫_0^∞ a_ij(s) ds = T_ij/C_i.
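A quick numerical check, with arbitrary hypothetical values of T_ij, C_i, and h, that the cubic kernel a(t) = 3T(t − h)²/(h³C) on [0, h] carries the same total weight T/C as the instantaneous term it replaces:

```python
import numpy as np

# Check (with arbitrary hypothetical values of T_ij, C_i, h) that the
# cubic kernel a(t) = 3*T*(t - h)^2 / (h^3 * C) on [0, h] integrates to
# T/C, the total weight of the instantaneous term it replaces.
T_ij, C_i, h = 1.7, 2.0, 0.5

N = 100_000
t_mid = (np.arange(N) + 0.5) * (h / N)               # midpoint rule
a = 3.0 * T_ij * (t_mid - h) ** 2 / (h ** 3 * C_i)
integral = a.sum() * (h / N)

print(integral, T_ij / C_i)   # equal up to quadrature error
```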
Thus, our model is

u_i' = Σ_j ∫_{−∞}^t a_ij(t − s) g_j(u_j(s)) ds − u_i/(R_iC_i) + I_i/C_i,  (10)

where for some M > 0 each a_ij(t) is piecewise continuous with ∫_0^∞ |a_ij(t)| dt ≤ M. Obviously, C_i is not taken to be the capacitance in this system. It should be noted that for proper choice of a_ij(t), (9) can represent the terms ∫_{t−h}^t g_j(u_j(s)) ds and g_j(u_j(t − h)) at the same time (cf. Langenhop [9]). Moreover, if a_ij(t) = T_ij e^{−ct}/C_i, c a constant, then (10) can be reduced to a higher dimensional system of ordinary differential equations. This idea is developed in Section 4.
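The reduction for an exponential kernel can be sketched on a scalar caricature: with a(t) = c e^{−ct}, the memory term z(t) = ∫_0^t c e^{−c(t−s)} g(u(s)) ds satisfies z' = c(g(u) − z), so the pair (u, z) forms an ordinary system. The model u' = −u + z below is hypothetical; the point is only the equivalence of the two representations of z.

```python
import numpy as np

# Sketch of the reduction to ordinary differential equations: with the
# exponential kernel a(t) = c*exp(-c*t), the memory term
#     z(t) = int_0^t c * exp(-c*(t - s)) * g(u(s)) ds
# satisfies z' = c*(g(u) - z). The scalar model u' = -u + z used here is
# hypothetical; the point is the equivalence of the two forms of z.
c, dt, steps = 2.0, 0.001, 5000
g = np.tanh

u, z = 1.0, 0.0                     # zero history, so z(0) = 0
u_hist = np.empty(steps)
for k in range(steps):
    u_hist[k] = u
    du = -u + z
    dz = c * (g(u) - z)
    u, z = u + dt * du, z + dt * dz

# direct quadrature of the memory integral at the final time
t_end = steps * dt
s = np.arange(steps) * dt
z_direct = np.sum(c * np.exp(-c * (t_end - s)) * g(u_hist)) * dt

print(z, z_direct)                  # the two representations agree
```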
It is readily proved that for each set of bounded and piecewise continuous initial functions u_i(s) on −∞ < s ≤ 0, there is a solution u_i(t) on some interval 0 ≤ t < α; and if the solution remains bounded, then α = ∞ (see [1] for methods of proof).
It is to be noted that if u = (u_1, …, u_n) is an equilibrium point for (1), then it is also one for (10). Hopfield [6; p. 3089] has made a careful study of those equilibrium points. Our long-term goal is to show that solutions of (10) approach the equilibrium points of (1). To that end, we follow the lead of Hopfield [6; p. 3090], where he constructs the Lyapunov function E given in (2). We will try to extend that Lyapunov function to (10).
Before doing so we first focus on Hopfield's argument [6; p. 3090]. He states that E is bounded, that dE/dt ≤ 0, and that dE/dt = 0 implies dV_i/dt = 0 for all i, so that all solutions approach points where dE/dt = 0. His conclusion is most certainly correct, but he needs to first show that solutions are bounded; this is an easy matter, as we shall see.
Basically, Hopfield is invoking an old result of Yoshizawa ([14] or [1; p. 232]) or, as the system is autonomous, a result of Krasovskii [8; p. 67], which may be stated as follows.
Theorem (Yoshizawa): Let F: [0, ∞) × Rⁿ → Rⁿ be continuous and bounded for x bounded, and suppose that all solutions of x' = F(t, x) are bounded. If there is a continuous function E: [0, ∞) × Rⁿ → (−∞, ∞) which is locally Lipschitz in x and bounded below for x bounded, and if there is a continuous function W: Rⁿ → [0, ∞) which is positive definite with respect to a closed set Ω such that E' ≤ −W(x), then every solution approaches Ω as t → ∞.
The crucial requirement is that solutions be bounded, not that E be bounded (except for x bounded), as the following example shows. Let x' = xe^{−x²} and E = e^{−x²}, so that E' = −2x²e^{−2x²}; E is bounded, but all nontrivial solutions tend to ±∞. Of course, this does not happen in the Hopfield case.
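The example can be confirmed numerically: E stays trapped in (0, 1) while x(t) climbs without bound.

```python
import numpy as np

# The counterexample above: x' = x*exp(-x^2) with E = exp(-x^2). Along
# the solution E stays trapped in (0, 1) while x(t) grows without bound.
dt, steps = 0.01, 20_000                 # integrate to t = 200
xs = np.empty(steps + 1)
xs[0] = 1.0
for k in range(steps):
    x = xs[k]
    xs[k + 1] = x + dt * x * np.exp(-x * x)

E = np.exp(-xs ** 2)
print(xs[-1], E[-1])   # x keeps climbing; E stays bounded
```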
The following proof applies equally well to Hopfield's equation and to that of Marcus and Westervelt, but it does not apply to the Marcus and Westervelt linearized system. The type of boundedness proved here is commonly called uniform boundedness and uniform ultimate boundedness for bound B at t = 0 (cf. [1; p. 248]).
Lemma 1: There is a B > 0, and for each B₁ > 0 there is a B₂ > 0 and a T > 0, such that if the initial functions all satisfy |u_i(s)| ≤ B₁ on (−∞, 0], then the solutions of (10) will satisfy |u_i(t)| ≤ B₂ for all t > 0, while |u_i(t)| ≤ B if t > T.

Proof: Since |g_i(u_i)| < 1, the I_i are constants, and ∫_0^∞ |a_ij(t)| dt ≤ M, the solution u_i(t) satisfies

u_i' ≤ −u_i/(R_iC_i) + h(t),

where h(t) ≤ M + I* and I* = max_i |I_i/C_i|. Certainly, h(t) depends on the initial function, but M does not. Thus, by the variation of parameters formula,

|u_i(t)| ≤ |u_i(0)| e^{−t/(R_iC_i)} + ∫_0^t e^{−(t−s)/(R_iC_i)} (M + I*) ds,

from which the result follows.
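The mechanism of the proof, a decaying exponential against a uniformly bounded forcing, can be illustrated on a scalar comparison equation u' = −λu + h(t) with |h| ≤ M (the forcing below is a hypothetical choice): whatever the initial bound B₁, solutions end up under a fixed ultimate bound.

```python
import numpy as np

# Scalar comparison equation behind the lemma: u' = -lam*u + h(t) with
# |h(t)| <= M. The hypothetical forcing h(t) = M*sin(3t) is arbitrary;
# solutions from very different initial bounds B1 all end up below the
# same ultimate bound after a transient, as in Lemma 1.
lam, M, dt = 1.0, 1.0, 0.001

def run(u0, t_end=15.0):
    u = u0
    for k in range(int(t_end / dt)):
        h = M * np.sin(3.0 * k * dt)      # any forcing with |h| <= M
        u = u + dt * (-lam * u + h)
    return u

u_big, u_small = run(10.0), run(0.1)
print(u_big, u_small)   # both below M/lam; the initial size is forgotten
```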
System (10) seems to us to be precisely the one which describes the Hopfield problem and is worthy of careful study. It is, however, quite nontrivial and may be the focus of stability analysis for some time to come. We begin by showing that a study is feasible by giving a basic result patterned after a one-dimensional theorem of Levin [10] concerning an unrelated question. In this result, our initial functions are points in Rⁿ at t = 0, but are zero for t < 0. Such are also Hopfield's initial conditions. The initial functions have the effect of changing (10) to

u_i' = Σ_j ∫_0^t a_ij(t − s) g_j(u_j(s)) ds − u_i/(R_iC_i) + I_i/C_i.  (13)

While we stated earlier that (10) can include the Marcus and Westervelt system, that is not true under the conditions of the following result.
Remark: Theorem 1 is viewed as a first result. Nevertheless, the definiteness conditions on (a_ij(t)), (a_ij'(t)), and (a_ij''(t)) may not be as severe as they first seem. These require self-connections. Since −u_i/R_i appears in (1), we can think of each neuron as being self-connected. To see this, in (1) determine T_ii* such that [u_i/R_i] − T_ii* g_i(u_i) has the sign of u_i, so that (1) can be written as

C_i(du_i/dt) = Σ_j T̄_ij V_j − ([u_i/R_i] − T_ii* g_i(u_i)) + I_i,  (1)**

where T̄_ij = T_ij if i ≠ j, and T̄_ii = T_ii − T_ii*. Then design the delay system in the same way, so that (10) becomes (10)* and (13) becomes (13)*, since [u_i/R_i] − T_ii* g_i(u_i) and g_i(u_i) both have the sign of u_i. The matrices (a_ij(t)), (a_ij'(t)), and (a_ij''(t)) will then have nonzero diagonal elements.
In Section 4 we will simplify (10) and obtain results independent of the definiteness of these matrices.

3. ROBUSTNESS AND DELAYS
Equations (1)* and (5) show that (1) and (5) are very robust in the sense that comparatively large perturbations can be added and solutions will still converge to the equilibrium points of the unperturbed equation. Recently, Kosko [7] has discussed robustness of this type for a variety of neural networks when the perturbations are stochastic.
Lyapunov's direct method is well suited to proving robustness under real perturbations. Intuitively we have the following situation. Given a positive definite Lyapunov function V(u) for a differential equation u' = F(u), the derivative of V along a solution is V' = grad V · F = |grad V| |F| cos θ, where θ is the angle between the tangent vector F(u) to the solution and grad V, which is the outward normal to the surface V = constant. A gradient system has cos θ = −1, the optimal value. This means that the solution u(t) enters the region V(u) ≤ constant along the inward normal. Hence, if we perturb the differential equation to u' = F(u) + G(u), then so long as G(u) is not too large relative to F(u), the vector F(u) + G(u) will still point inside the region V(u) ≤ constant. Now (5) is actually a gradient system, so the perturbation result for it is better than the one for (1)*, which is merely almost a gradient system. Perturbation results are crucial for any real system since the mathematical equation will seldom represent the physical reality exactly.
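The cos θ argument can be illustrated numerically. Below, a hypothetical gradient system x' = −grad V with V = ‖x‖²/2 is perturbed by G(x) = (1/2)·(90° rotation of x), so |G| = (1/2)|grad V|; V still decreases along the perturbed flow and solutions still converge.

```python
import numpy as np

# Perturbed gradient system: F = -grad V with V = ||x||^2/2, plus a
# hypothetical perturbation G(x) = 0.5 * (90-degree rotation of x), so
# |G| = 0.5*|grad V|. The perturbed field still points into the level
# sets of V, and solutions still converge to the equilibrium.
Rot90 = np.array([[0.0, -1.0], [1.0, 0.0]])

dt = 0.01
x = np.array([3.0, -2.0])
V_vals = [0.5 * x @ x]
for _ in range(1500):                   # integrate to t = 15
    F = -x                              # -grad V
    G = 0.5 * (Rot90 @ x)               # perturbation, orthogonal to grad V
    x = x + dt * (F + G)
    V_vals.append(0.5 * x @ x)
V_vals = np.array(V_vals)

print(np.linalg.norm(x))                # still driven to the equilibrium
```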
Let β and h be positive constants and let A be an n × n matrix of piecewise continuous functions with ‖A‖ < 1, where ‖A‖ = max_{1≤i,j≤n} max_{0≤t≤h} |a_ij(t)|, and consider

X' = −grad V(X) + ∫_{t−h}^t A(t − s) grad V(X(s)) ds.  (14)
Several other forms could be chosen, but this one will demonstrate the strong stability. Note that (4) and (14) have the same equilibrium points (under our subsequent assumption (16)). To solve (14) it is required that there be given a piecewise continuous initial function φ: [−h, 0] → Rⁿ. There is then a continuous solution X(t, φ) on some interval 0 ≤ t < α with X(t, φ) = φ(t) for −h ≤ t ≤ 0; X(t, φ) satisfies (14) on (0, α). See the methods of [1] for existence details.

Theorem: Every bounded solution of (14) approaches the set of equilibrium points of (4).

Proof: Define a Lyapunov functional W(t) along a solution of (14) by augmenting V(X(t)) with a suitable integral of |grad V(X(s))|² over [t − h, t]. A computation then shows that there is a μ > 0 with

W'(t) ≤ −μ |grad V(X(t))|².  (17)
It is known that the only way in which a solution X(t) of (14) can fail to be defined for all t > 0 is for there to exist a T > 0 such that lim sup_{t→T⁻} |X(t)| = +∞. Thus, if ∫_0^T |X'(s)| ds remains finite, the solution is bounded on [0, T) and can be continued.
Suppose that X(t, φ) is bounded. Then grad V(X) is continuous and A(t) is bounded, so X'(t, φ) is bounded. The argument of Yoshizawa [14] is fully applicable, and X(t, φ) approaches the set on which grad V(X) = 0, the equilibrium points of (4). This completes the proof.

There are several simple conditions which will ensure that solutions of (14) are bounded. Certainly, (17) with W ≥ 0 will not do it, as may be seen from the scalar equation x' = 2xe^{−x²} with V = e^{−x²}. We have grad V = −2xe^{−x²} and V' = −4x²e^{−2x²} = −(grad V(x))²; but all solutions except x = 0 are unbounded.
We could ensure boundedness by asking one of the following:

(a) Since W'(t) ≤ 0, if for each continuous φ in C = {φ: [−h, 0] → Rⁿ} we have lim inf_{‖φ‖→∞} W(φ) = +∞, then all solutions of (14) are bounded.

(b) If there is a continuous function G: [0, ∞) → [0, ∞) with G(r) = inf_{|X|=r} |grad V(X)| and ∫_0^∞ G(r) dr = ∞, then all solutions of (14) are bounded.

The validity of (a) should be clear. To prove (b) we note that there is a k > 0 with W'(t) ≤ −k |X'(t)| |grad V(X(t))|, so that

0 ≤ W(t) ≤ W(0) − k ∫_0^t G(|X(s)|) |X'(s)| ds.

If some solution were unbounded, the weighted arc-length integral on the right would exceed W(0)/k, a contradiction.

Remark: The conclusion of this theorem can not be strengthened to state that bounded solutions approach the minima of V(X), as was desired in [4], where maxima and saddle points were to be avoided. In a scalar equation x' = −V'(x) whose minimum is at x = 1, one can still have x(t) → 0 whenever x₀ < 0; gradient systems of the same type are easily constructed.
We turn now to the model of Hopfield, which is more challenging when introducing a delay because (3) is slightly more complicated than (5). Moreover, since (1) is not quite a gradient system, the perturbation can not be quite as large as in (14).
To obtain a delay system for (1) we let A(t) be an n × 1 matrix of piecewise continuous functions and let A_i be the ith component of A, with |A_i(t)| < 1 for 0 < t ≤ h and all i.
We now prove a simple lemma parallel to that of Lemma 1.
While (18) is the preferred form for exhibiting limit sets, for boundedness we write (18) as

du_i/dt = −u_i/(R_iC_i) + (c_i/(R_iC_i)) ∫_{t−h}^t A_i(t − s) u_i(s) ds + f_i(t),

or, in vector notation, as

u' = Au + ∫_{t−h}^t D(t − s) u(s) ds + F(t),  (19)

where A is the diagonal matrix with constant entries −1/(R_iC_i), D is the diagonal matrix with entries c_iA_i(t − s)/(R_iC_i), and there is a constant P, independent of the initial function, with |F(t)| ≤ P. Let

λ = min_i 1/(R_iC_i),  δ = max_i |c_i/(R_iC_i)|.  (20)
Lemma 2: There is an α > 0 (defined by (20) and (21)) such that if |c_i| < α for each i, then all solutions of (18) are bounded in the same sense as in Lemma 1.
4. A FULLY DELAYED GRADIENT SYSTEM

In the Hopfield model, if the train of action potentials is also dependent on the average potential of the neuron itself, a simplified form of (10) would be

u_i' = −∫_{−∞}^t [a_i(t − s)(∂E(u(s))/∂u_i)/g_i'(u_i(s))] ds,

and the analog of (4) is

x_i' = −∫_{−∞}^t a_i(t − s)(∂V(x(s))/∂x_i) ds.  (23)

Thus, in (10) we are taking a_ij(t) = a_i(t) for 1 ≤ j ≤ n. When (25) holds, the memory can be eliminated at the expense of doubling the order of the system. In implementing the case in which (25) holds we, in effect, add a neuron to the Hopfield model, as is indicated by the increased dimension.
Equation (25) does yield a very reasonable memory system.Theorem 6 will reduce (25), but will restrict initial functions, as did Theorem 1.
Theorem 4: Let (25) and (26) hold and let V be bounded below. Then every solution of (24) is bounded and approaches the set of equilibrium points of (4).
Proof: To see that solutions are bounded, write (23) using

h(t) = (1/R_i) ∫_{−∞}^t e^{−α_i(t − s)} u_i(s) ds,

where h(t) and h'(t) are bounded and the bound depends on the initial function. Then

u_i' + α_i u_i = h'(t) + α_i h(t) − u_i/R_i,

all of whose solutions are bounded. This completes the proof. Now if one is interested in a linear analysis of (23), such as was given by Marcus and Westervelt [12] for the pointwise delay with a view to obtaining local information, say near u = 0, then (23) is written as

u_i' = −∫_{−∞}^t [a_i(t − s)(∂E(u(s))/∂u_i)/γ_i] ds,  (27)

where γ_i = g_i'(0) > 0. That is, we have linearized the denominator.
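The doubling of order can be made concrete on a scalar instance, assuming the kernel a(t) = c e^{−ct} (the exponential form discussed in this section) and a hypothetical V(x) = (x − 1)²/2 with minimum at x = 1; the pair (x, z), with z the memory term, is an ordinary second-order system whose solutions approach the minimum.

```python
import numpy as np

# Scalar instance of the memory-elimination step, assuming the kernel
# a(t) = c*exp(-c*t) and the hypothetical V(x) = (x - 1)^2 / 2 (minimum
# at x = 1). With z(t) = int_0^t a(t - s) V'(x(s)) ds, the delayed
# gradient equation x' = -z becomes the memoryless pair
#     x' = -z,    z' = c * (V'(x) - z),
# i.e. the damped second-order system x'' + c x' + c V'(x) = 0.
c, dt, steps = 1.0, 0.001, 20_000       # integrate to t = 20
x, z = 0.0, 0.0                         # zero history gives z(0) = 0

for _ in range(steps):
    dx = -z
    dz = c * ((x - 1.0) - z)            # V'(x) = x - 1
    x, z = x + dt * dx, z + dt * dz

print(x)   # approaches the minimum x = 1 of V
```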
A parallel theorem holds: every bounded solution of (27) converges. The proof is, of course, an exact repetition of that of Theorem 4.
We return now to (24) with the initial conditions of Theorem 1.
5. DISCUSSION

System (10) seems to be a proper formulation for the general problem described by Hopfield and a more justifiable delay system than that of Marcus and Westervelt. It seems to be very difficult to evaluate its stability properties in full generality, but it is significant that Lemma 1 yields boundedness of solutions. Analysis in full generality is expected to be a long-term project, but the results of Section 4 indicate that (10) should be very stable. Noting that (1) is almost a gradient system should significantly enhance the stability analysis.

REMARKS ON MEMORY
The object of the memory is to enable the T_ij in (1) to reflect the time lag. System (10) has a memory in every sense of the word. For a general a_ij(t), (10) can not be reduced to an ordinary differential equation without memory. When (25) holds, then systems (23) and (24) have limited memory in that a_i(t) can be removed at the expense of doubling the order.

Any ordinary differential equation can be expressed as an integral equation, and it then sometimes appears to have a memory. For example, Hopfield's system can be written as

u_i' = −γ_i u_i + h_i(t, u),  (30)

so that, using the integrating factor e^{γ_i t}, we obtain

u_i(t) = u_i(0) e^{−γ_i t} + ∫_0^t e^{−γ_i(t − s)} h_i(s, u(s)) ds.  (31)

But since solutions of (30) are uniquely determined by (t₀, u₀) alone, equation (31) does not have a memory.
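The integrating-factor identity (30) → (31) is easily verified for a scalar case with the hypothetical choices γ = 1, h(t) = sin t, u(0) = 0, whose closed-form solution is u(t) = (sin t − cos t + e^{−t})/2:

```python
import numpy as np

# Verifying the integrating-factor identity (30) -> (31) for a scalar
# case u' = -gam*u + h(t), with the hypothetical choices gam = 1,
# h(t) = sin(t), u(0) = 0; the closed-form solution is
#     u(t) = (sin t - cos t + e^{-t}) / 2.
gam, t = 1.0, 3.0
N = 20_000
s = (np.arange(N) + 0.5) * (t / N)                    # midpoint quadrature
u_integral = np.sum(np.exp(-gam * (t - s)) * np.sin(s)) * (t / N)
u_exact = (np.sin(t) - np.cos(t) + np.exp(-t)) / 2.0

print(u_integral, u_exact)   # agree: the memory in (31) is only apparent
```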
Section 4 has focused on a(t) = e^{−t}, and this can be generalized to a(t) = Σ f_i(t), where each f_i(t) is the solution of a linear homogeneous ordinary differential equation of order n with constant coefficients (see [1; p. 84]).

ACKNOWLEDGMENT
The author wishes to express his appreciation to Professor M.R. Sayeh for reading and criticizing the manuscript.