Exponential Decay for a System of Equations with Distributed Delays



Introduction
Of concern is the following system:

\[
x_i'(t) = -a_i(t)\,x_i(t) + \sum_{j=1}^{m} b_{ij}(t)\, f_{ij}\!\left( \int_{-\infty}^{t} k_{ij}(t-s)\, g_j(x_j(s))\, ds \right) + c_i(t), \qquad i = 1, \dots, m,\ t > 0, \tag{1}
\]

with x_i(t) = x_i^0(t) continuous on (-\infty, 0]. Here a_i(t) \ge 0; b_{ij}(t), k_{ij}(t) \ge 0; and c_i(t), i, j = 1, \dots, m, are continuous functions and are subject to other conditions that will be specified below. Similar forms of this system arise, for instance, in Neural Network Theory [1–28] (see also the "Applications" section below). There, the functions f_{ij} and g_j are much simpler: usually the f_{ij} are equal to the identity and the g_j (called the activation functions) are assumed to be Lipschitz continuous. The integral terms represent the distributed delays; when the kernels are replaced by the delta distribution we recover the well-known discrete delays. The functions c_i(t) account for the input functions. The first terms on the right-hand side of (1) may be regarded as dissipative terms.
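For example, formally taking a kernel to be a shifted delta distribution (with an illustrative delay \tau_{ij} > 0) collapses the corresponding distributed delay to a discrete one:

```latex
\int_{-\infty}^{t} \delta\big((t-s) - \tau_{ij}\big)\, g_j\big(x_j(s)\big)\, ds
\;=\; g_j\big(x_j(t - \tau_{ij})\big),
```

since the delta picks out the single instant s = t - \tau_{ij}.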
Different methods have been used by many authors to study the well-posedness and the asymptotic behavior of solutions of these systems [1–3, 6, 9–17, 19–21, 25, 27]. In particular, a lot of effort has been devoted to improving the conditions on the different coefficients involved in the system as well as on the class of activation functions. Regarding the latter issue, the early assumptions of boundedness, monotonicity, and differentiability have all been relaxed to merely a global Lipschitz condition. Since then, this assumption does not seem to have been weakened considerably further. It has been pointed out that many activation functions arising in applications are continuous but not necessarily Lipschitz continuous [29]. A slightly weaker condition, x_i g_i(x_i) > 0 for x_i \neq 0 (together with a growth condition involving the equilibrium x^*), has been used in [4, 19, 26, 28] (see also [22–24]). Finally, we cite [5], where the authors consider non-Lipschitz continuous but bounded activation functions. There are also many works on discontinuous activation functions.
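As a simple illustration of such a function (our own example, not drawn from the cited works): the cube-root activation is continuous and satisfies the sign condition at the equilibrium x^* = 0, yet it is not Lipschitz continuous there:

```latex
% g(x) = x^{1/3}: \quad x\,g(x) = x^{4/3} > 0 \ \text{for } x \neq 0, \ \text{but}
\frac{|g(x) - g(0)|}{|x - 0|} = \frac{|x|^{1/3}}{|x|} = |x|^{-2/3}
\;\longrightarrow\; \infty \quad \text{as } x \to 0,
% so no Lipschitz constant exists on any neighborhood of the equilibrium.
```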
Here we assume that the functions f_{ij} and g_j are continuous monotone nondecreasing functions that are not necessarily Lipschitz continuous and that may be unbounded (like power-type functions with powers bigger than one). We prove that, for sufficiently small initial data, solutions decay to zero exponentially.
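For instance (an illustrative example consistent with these assumptions), a power-type function with power p > 1 is continuous, nondecreasing, and unbounded, yet admits no global Lipschitz constant:

```latex
g(x) = x\,|x|^{p-1}, \quad p > 1:
\qquad
\sup_{x \neq y} \frac{|g(x)-g(y)|}{|x-y|}
\;\ge\; \sup_{x>0} \frac{g(x)-g(0)}{x-0}
\;=\; \sup_{x>0} x^{p-1} \;=\; \infty .
```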
We could not find similar works on (continuous but) non-Lipschitz continuous activation functions with which to compare our results. Our treatment is in fact concerned with a doubly non-Lipschitz continuous system.
Using standard techniques and the Gronwall-type lemma below, we may prove local existence of solutions. Global existence follows from the estimate in our theorem below. Uniqueness, however, is delicate and does not hold in general.
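The possible failure of uniqueness is the classical phenomenon for non-Lipschitz right-hand sides; the standard scalar example (not part of system (1)) is:

```latex
x'(t) = |x(t)|^{1/2}, \qquad x(0) = 0,
% which admits the trivial solution x \equiv 0 together with the family
x_c(t) =
\begin{cases}
0, & 0 \le t \le c,\\[2pt]
\tfrac{1}{4}\,(t-c)^2, & t > c,
\end{cases}
\qquad c \ge 0 .
```

Each x_c solves the equation, so infinitely many solutions emanate from the same initial datum.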
In the next section we present and prove our result and illustrate it by an example.

Exponential Convergence
In this section we state and prove our exponential convergence result. Before that, we need to present a lemma due to Bainov and Simeonov [30].
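For orientation, a representative special case of such inequalities is the classical Bihari inequality (the lemma used below may be more general; this statement is only meant to convey the flavor):

```latex
% Let u, f be nonnegative continuous functions on [0,\infty), let a \ge 0, and let
% w be continuous, nondecreasing, and positive on (0,\infty). If
u(t) \le a + \int_0^t f(s)\, w(u(s))\, ds, \qquad t \ge 0,
% then, with W(u) := \int_{u_0}^{u} \frac{dv}{w(v)} for some u_0 > 0,
u(t) \le W^{-1}\!\left( W(a) + \int_0^t f(s)\, ds \right)
% for all t for which W(a) + \int_0^t f(s)\,ds lies in the domain of W^{-1}.
```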
In order to shorten the statement of our result we define, for i, j = 1, . . ., m, the weighted coefficients \tilde b_{ij}(t), for some auxiliary function to be determined.
(b) b_{ij} of Arbitrary Signs. Define, for t > 0, the functional V_2(t) in (24), built from the states x_i(t) and the delay terms. From (16) and (24) we obtain a differential inequality for V_2(t), and by integration we find, for t > 0, an estimate with the constant V_2(0) given in (27). Next, we proceed as in Case (a) with the new functional V_2(t) in (24), the constant V_2(0) in (27), and \lambda, \omega in place of their counterparts in (a); then solutions of (1) are global in time. Moreover, if \int_0^t a(s)\, ds \to \infty as t \to \infty and \lambda(t) (resp., \omega(t)) grows at most polynomially, then the decay is exponential.

Remark 4. We have judged it useful to treat Case (a) separately, even though it is covered by Case (b), for the simple reason that Case (a) arises in real applications: it corresponds to the "fading memory" situation. Likewise, the case where the kernels are dominated by the dissipative coefficients (say k_{ij}(t) \le \gamma\, a_i(t), t > 0, for some \gamma > 0) may look unnecessary to study separately, as it is covered by the second case in the proof, but it is in fact also quite interesting. Indeed, in this case, from (16) we obtain a bound of the form

\[
\sum_{i,j} \left[ \tilde b_{ij}(t)\, f_{ij}\big(V_1(t)\big) + \tilde k_{ij}(0)\, g_j\big(V_1(t)\big) \right], \qquad t > 0, \tag{29}
\]

and the argument proceeds as before. At this point we must point out that, unlike in the proof of the theorem, we cannot pass to V_1(t) - \varepsilon inside the arguments of f_{ij} and g_j (in (30)). However, if the functions f_{ij} and g_j belong to the class \mathcal{H}, that is, there exist functions \tilde f_{ij} and \tilde g_j such that f_{ij}(\lambda x) \le \tilde f_{ij}(\lambda)\, f_{ij}(x) and g_j(\lambda x) \le \tilde g_j(\lambda)\, g_j(x) for \lambda > 0 and x \ge 0, then the corresponding estimate follows for t > 0.
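Power functions are the model members of this class (our own worked example): for g_j(x) = x^p with p \ge 1 and x \ge 0,

```latex
g_j(\lambda x) = (\lambda x)^p = \lambda^p\, x^p = \tilde g_j(\lambda)\, g_j(x),
\qquad \tilde g_j(\lambda) = \lambda^p,
```

so the factor \tilde g_j grows only polynomially, which is the growth condition appearing in the exponential-decay statement above.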

Applications
This system appears in Neural Network Theory. For a basic account the reader is referred to [7, 8].
A Neural Network is designed to mimic the human brain. It is formed by a number of "neurons" with interconnections between them. In general there are an input layer, some (one or more) hidden layers, and an output layer. The input neurons feed the neurons in the hidden layers, which transform the signal and fire it to the output neurons (or to other neurons). Neural networks are widely used for solving optimization problems and for analyzing, classifying, and evaluating data. They have the advantage (over traditional computers) of forecasting, predicting, and making decisions.
There are numerous applications, of which we cite the following: economic indicators, data compression, complex mappings, biological systems analysis, optimization, process control, time series analysis, stock markets, diagnosis of hepatitis, engineering design, soil permeability, speech processing, pattern recognition, and so on.
Most of the existing papers in this theory deal with the constant-coefficient case. The few papers on variable coefficients treat mainly the existence of periodic solutions. In the constant-coefficient case the system takes the form

\[
x_i'(t) = -a_i\, x_i(t) + \sum_{j=1}^{m} b_{ij}\, f_{ij}\!\left( \int_{-\infty}^{t} k_{ij}(t-s)\, g_j(x_j(s))\, ds \right) + c_i
\]

for all i = 1, \dots, m. For the asymptotic behavior we need the maximal existence time t^* to be infinite. In particular we need a smallness condition on x^0(t).
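As a purely numerical illustration (our own sketch, not taken from the paper): for a single neuron (m = 1) with f equal to the identity, the exponentially fading kernel k(s) = \gamma e^{-\gamma s}, and the power-type activation g(x) = x|x|^2, the distributed delay y(t) = \int_{-\infty}^t k(t-s) g(x(s)) ds satisfies y' = \gamma(g(x) - y) (the "linear chain trick"), so the scalar equation reduces to two ODEs. All parameter values and names below are illustrative assumptions.

```python
# Sketch: decay of a scalar distributed-delay equation under small initial data.
# Assumed values (illustrative only): a = 1, b = 0.5, c = 0, gamma = 1,
# kernel k(s) = gamma * exp(-gamma * s), constant history x(t) = x0 for t <= 0,
# which gives y(0) = g(x0) for the auxiliary delay variable.

def g(x):
    # power-type activation: continuous, nondecreasing, not globally Lipschitz
    return x * abs(x) ** 2

def simulate(x0, a=1.0, b=0.5, gamma=1.0, dt=1e-3, T=20.0):
    """Forward-Euler integration of
       x' = -a*x + b*y,   y' = gamma*(g(x) - y),
       where y(t) stands in for the distributed-delay integral."""
    x, y = x0, g(x0)
    for _ in range(int(T / dt)):
        dx = -a * x + b * y
        dy = gamma * (g(x) - y)
        x, y = x + dt * dx, y + dt * dy
    return x

x_final = simulate(0.5)  # small initial datum: solution decays toward 0
```

With these (assumed) values the solution decays to zero; for large initial data a cubic activation can instead drive blow-up, which is exactly why a smallness condition on x^0 is required for the asymptotic result.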