Being inspired by the Hopfield neural networks (Hopfield (1982) and Hopfield and Tank (1985)) and the nonlinear sigmoid power control algorithm for cellular radio systems in Uykan and Koivo (2004), in this paper, we present a novel discrete recurrent nonlinear
system and extend the results in Uykan (2009), which are for autonomous linear systems, to nonlinear case. The proposed system can be viewed as a discrete-time realization of a recently proposed continuous-time network in Uykan (2013). In this paper, we focus on discrete-time analysis and provide various novel key results concerning the discrete-time dynamics of the proposed system, some of which are as follows: (i) the proposed system is shown to be stable in synchronous and asynchronous work mode in discrete time; (ii) a novel concept called Pseudo-SINR (pseudo-signal-to-interference-noise ratio) is introduced for discrete-time nonlinear systems; (iii) it is shown that when the system states approach an equilibrium point, the instantaneous Pseudo-SINRs are balanced; that is, they are equal to a target value. The simulation results confirm the novel results presented and show the effectiveness of the proposed discrete-time network as applied to various associative memory systems and clustering problems.
1. Introduction
Artificial neural networks have been an important research area since 1970s. Since then, various biologically inspired neural network models have been developed. Hopfield Neural Networks [1, 2] have been one of the most widely used neural networks since the early 1980s whose applications vary from combinatorial optimization (e.g., [3, 4]) to image restoration (e.g., [5]) and from various control engineering optimization problems in robotics (e.g., [6]) to associative memory systems (e.g., [7, 8]). For a tutorial and further references about Hopfield neural networks, see, for example, [9, 10].
In [11], we introduce a novel pseudo-signal-to-interference-noise ratio concept for discrete-time autonomous linear systems. Our main motivation in this paper is to investigate a nonlinear extension of [11]. Furthermore, the proposed system can be viewed as a discrete-time realization of a very recently proposed continuous-time network called double-sigmoid continuous-time Hopfield neural network in a brief letter [12]. And our investigations in this paper yield various interesting key novel results in discrete time, some of which are as follows: (i) a novel concept called Pseudo-SINR (pseudo-signal-to-interference-noise ratio) is introduced for discrete-time nonlinear systems; (ii) it is shown that when the network approaches to one of its equilibrium points, the instantaneous Pseudo-SINRs are balanced; that is, they are equal to a target value; (iii) the proposed network outperforms its Hopfield neural network counterpart as applied to various associative memory systems and clustering applications. The disadvantage of the proposed network is that it increases the computational burden.
The paper is organized as follows. The proposed recurrent network and its stability features are analyzed in Section 2. Simulation results are presented in Section 3, followed by conclusions in Section 4.
Being inspired by the nonlinear sigmoid power control algorithm for cellular radio systems in [13] and the Hopfield neural networks [2], we propose the following discrete nonlinear recurrent network:
(1)x(k+1)=x(k)+α(k)f1(-Ax(k)+Wf2(x(k))+b),
where k represents the iteration step, A, W, and b are defined as in (2), and fm(·), m=1,2, represents a vectoral mapping from ℜN to ℜN. For an N-dimensional vector e=[e1e2⋯eN]T, fm(e)=[fm(e1)fm(e2)⋯fm(eN)]T where fm(·) is chosen as the sigmoid function; that is, for a real number ei, the output is fm(ei)=κm(1-(2/(1+exp(-σmei)))), where κm>0, σm>0. We will call the network in (1) as discrete sigmoid-pseudo-SINR-balancing recurrent neural network (D-SP-SNN). The name comes from the fact that the proposed network balances an entity called Pseudo-SINR, as will be seen in the following. In this paper, we choose sigmoid function because it’s used both in Hopfield neural network and the power control algorithm in [13]. Furthermore, the proposed D-SP-SNN can be viewed as a discrete-time implementation of a very recently proposed continuous-time network called double-sigmoid continuous-time Hopfield neural network in the brief [12]. In this paper, we focus on discrete-time analysis and provide various novel key results concerning the discrete-time dynamics of the proposed system:
(2)A=[a110⋯00a22⋯0⋮⋱000⋯aNN],W=[0w12⋯w1Nw210⋯w2N⋮⋱⋮wN1wN2⋯0],b=[b1b2⋮bN].
In (2), A shows the self-state-feedback matrix, W with zero diagonal shows the interneurons connection weight matrix, and b is a threshold vector.
The proposed network includes both the sigmoid power control in [13] and the traditional Hopfield neural network (HNN) as its special cases by choosing the f1(·) and f2(·) appropriately. The Euler approximation of the continuous-time HNN is given as
(3)x(k+1)=x(k)+α(k)(-Ax(k)+Wf2(x(k))+b).
Let us call the network in (3) HNN-Euler, which is a special case of the proposed D-SP-SNN in (1). From (1),
(4)xj(k+1)=xj(k)+α(k)×f1(-ajjxj(k)+bj+∑i=1,i≠jNwijf2(xi(k))),j=1,…,N,
where α(k) is the step size at time k. Let’s define the error signal ei(k) as
(5)ei(k)=-aiixi+Ii(k),whereIi(k)=bi+∑j=1,j≠iNwijf2(xj(k)),i=1,…,N.
Then, the performance index is defined as l1-norm of the error vector in (5) as follows:
(6)V(k)=∥e(k)∥1=∑iN|ei(k)|(7)=∑iN|-aiixi+Ii|,whereIi=bi+∑j=1,j≠iNwijf2(xj).
In what follows, we examine the evolution of the energy function in (6) in synchronous and asynchronous work modes. Asynchronous mode means that at every iteration step, at most only one state is updated, whereas synchronous mode refers to the case that all the states are updated at every iteration step according to (4).
Proposition 1.
In asynchronous mode of the proposed network D-SP-SNN in (4) with a symmetric matrix W, for a nonzero error vector, the l1-norm of the error vector in (6) decreases at every step; that is, the error vector goes to zero for any α(k) such that
(8)|ej(k)|>|ajjα(k)f1(ej(k))|
if
(9)|ajj|≥k2∑i=1,(i≠j)N|wij|,
where k2=0.5σ2 is the global Lipschitz constant of f2(·) as shown the in Appendix A.
Proof.
In asynchronous mode, only one state is updated at an iteration time. Let j show the index of the state which is updated at time k whose error signal is different than zero; that is, ej=-ajjxj+Ij≠0,whereIj=bj+∑i=1,i≠jNwjif2(xi), as defined in (5). Writing (5) in vector form for steps k and k+1 results in
(10)e(k+1)-e(k)=[00⋮-ajj(xj(k+1)-xj(k))⋮0]+[w1jw2j⋮0⋮wNj](f2(xj(k+1))-f2(xj(k))).
Using the error signal definition of (5) in (4) gives
(11)xj(k+1)-xj(k)=α(k)f1(ej(k)).
So, the error signal for state j is obtained using (10) and (11) as follows:
(12)ej(k+1)-ej(k)=-ajj(xj(k+1)-xj(k))(13)=-ajjα(k)f1(ej(k)).
From (12) and (13), if α(k) is chosen to satisfy |ej(k)|>|ajjα(k)f1(ej(k))|, then
(14)|ej(k+1)|<|ej(k)|,for|ei(k)|≠0,
where f1(·) is a sigmoid function, which is lower and upper bounded. Since the sigmoid function f1(·) has the same sign as its argument and f1(ej)=0 if and only if ej=0, then it is seen that α(k) can easily be chosen small enough to satisfy |ej(k)|>α(k)ajj|f1(ej(k))| according to the parameter ajj and the slope of sigmoid function f1(·).
Above, we examined only the state j and its error signal ej(k). In what follows, we examine the evolution of the norm of the complete error vector e(k+1) in (10). From the point of view of the l1 norm of the e(k+1), the worst case is that when |ej(k)| decreases, all other elements |ei(k)|,i≠j, increase. So, using (10), (12), and (14), we obtain that if
(15)|-ajj(xj(k+1)-xj(k))|≥|f2(xj(k+1))-f2(xj(k))|×∑i=1,(i≠j)N|wij|,
then
(16)∥e(k+1)∥1{<∥e(k)∥1if∥e(k)∥1≠0,=0if∥e(k)∥1=0.
The sigmoid function f2(·) is a Lipschitz continuous function as shown in Appendix A. So,
(17)k2|xj(k+1)-xj(k)|≥|f2(xj(k+1))-f2(xj(k))|,
where k2=0.5σ2 is f2(·)’s global Lipschitz constant as shown in Appendix A. From (15) and (17), choosing |ajj|>k2∑i=1,(i≠j)N|wij| yields (15), which implies (16). This completes the proof.
Definition 2 (pseudo-SINR).
We define the following system variable, which will be called pseudo-SINR, for the D-SP-SNN in (4):
(18)θ-iθitgt=aiif(xi)bi+∑j=1,j≠iNwijf(xj),i=1,…,N,
where f(·) represents the sigmoid function, that is, f(e)=κ(1-(2/(1+exp(-σe)))), and θitgt is a constant, which we call target θi.
Examining the θ-i in (18), we observe that it resembles the traditional signal-to-interference-noise ratio (SINR) definition in cellular radio systems (see, e.g., [14, 15]); therefore we call it Pseudo-SINR.
Definition 3 (prototype vectors).
Prototype vectors are defined as those x’s which make θi=θitgt,i=1,…,N, in (18). So, from (18) and (5), the prototype vectors make the error signal zero; that is, ei=0,i=1,…,N given that xi≠0 and Ii≠0.
Proposition 4.
In asynchronous mode, choosing the slope of f2(·) relatively small as compared to f1(·) and choosing ajj>0 and α(k) satisfying (8), the D-SP-SNN in (4) with a symmetric matrix W is stable and there exists a finite time constant such that the l1-norm of the error vector in (6) approaches to an ϵ-vicinity of the zero as its steady state, where ϵ is a relatively small positive number. If θ-i=θitgt at the converged point, then it corresponds to a prototype vector as defined above.
Proof.
Since it is asynchronous mode, (10)–(14) hold where ajj>0. So, if α(k) at time k is chosen to satisfy |ej(k)|>|ajjα(k)f1(ej(k))| as in (8), then
(19)|ej(k+1)|<|ej(k)|,for|ei(k)|≠0.
Note that it is straightforward to choose a sufficiently small α(k) to satisfy (8) according to ajj and the slope σ1 of sigmoid f1(·). Using (10), (12), and (19), it is seen for ej(k)≠0 that if
(20)|-ajj(xj(k+1)-xj(k))|=|-ajjα(k)f1(ej(k))|(21)>|f2(xj(k+1))-f2(xj(k))|×∑i=1,(i≠j)N|wij|,
then
(22)∥e(k+1)∥1<∥e(k)∥1.
We observe from (12), (20), (21), and (22) the following.
(1) If the xi(k),i=1,…,N, approach to either of the saturation regimes of its sigmoid function f2(·), then
(23)|f2(xj(k+1))-f2(xj(k))|∑i=1,(i≠j)N|wij|≈0,j=1,…,N,
since |f2(xj(k+1))-f2(xj(k))|≈0,i=1,…,N. That satisfies (20) and (21). Therefore, the norm of the error vector in (6) does not go to infinity and is finite for any x.
(2)x(k+1)=x(k) if and only if e(k)=0; that is,
(24)xj(k+1)=xj(k)ifandonlyiff1(ej(k))=0,j=1,…,N.
(3) Examining the (11), (12), and (13) taking the observations (1) and (2) into account, we conclude that any of the xj(k),j=1,…,N, does not go to infinity and is finite for any k. So, the D-SP-SNN in (4) with a symmetric matrix W is stable for the assumptions in Proposition 4. Because there is a finite number of insaturation states (i.e., the number of all possible insaturation state combinations is finite), which is equal to 2N, there exists a finite time constant such that the l1-norm of the error vector in (6) approaches to an ϵ-vicinity of the zero as its steady state, where ϵ is a relatively small positive number.
From (18), if θ-i=θitgt at the converged point, then it corresponds to a prototype vector as defined in the previous section, which completes the proof.
In what follows, we examine the evolution of pseudo-SINR θ-i(k) in (18). From (18), let us define the following error signal at time k:
(25)ξj(k)=-θj(k)+θjtgt,j=1,…,N.
Proposition 5.
In asynchronous mode, in the D-SP-SNN in (4) with a sufficiently small α(k) and with a symmetric matrix W, the θj(k) is getting closer to θjtgt at those iteration steps k where Ij(k)≠0; that is, |ξj(k+1)|<|ξj(k)|, where index j shows the state being updated at iteration k.
Proof.
Let j show the state which is updated at time k. The pseudo-SINR defined by (18) for nonzero Ij(k) is equal to
(26)θ-j(k)=aiixj(k)Ij(k),whereIj(k)=bj+∑i=1,i≠jNwjif(xj(k)).
Without loss of generality, and for the sake of simplicity, let is take θitgt=1. Then, from (26) and (25)
(27)ξj(k)=-θ-j(k)+1=-aiixj(k)+Ij(k)Ij(k).
In asynchronous mode, from (26), Im(k)=Im(k+1). Using this observation and (27),
(28)ξj(k+1)-ξj(k)=-aii(xj(k+1)-xj(k))Ij(k).
From (4) and (28),
(29)ξj(k+1)-ξj(k)=-aiiα(k)f1(ej(k))Ij(k).
Provided that Ij(k)≠0, we write, from (5) and (27),
(30)ej(k)=Ij(k)ξj(k).
Writing (30) in (29) gives
(31)ξj(k+1)-ξj(k)=-aiiα(k)f1(Ij(k)ξj(k))Ij(k).
From (31), since sigmoid function f1(·) is an odd function, and aii>0 and α(k)>0,
(32)ξj(k+1)=ξj(k)-βsign(ξj(k)),whereβ=|-aii(α(k)f1(Ij(k)ξj(k)))Ij(k)|.
As seen from (32), for a nonzero ξj(k), choosing a sufficiently small α(k) satisfying |ξj(k)|>β assures that
(33)|ξj(k+1)|<|ξj(k)|ifIm(k)≠0
which completes the proof.
Proposition 6.
The results in Propositions 1 and 4 for asynchronous mode hold also for synchronous mode.
In synchronous mode, all the states are updated at every step k according to (5). So, from (5)
(34)e(k+1)-e(k)=∑i=1N([00⋮-a11(xi(k+1)-xi(k))⋮0]+[w1iw2i⋮0⋮wNi](f2(xi(k+1))-f2(xi(k)))).
Using (5) in (34) and writing it elementwise give
(35)ei(k+1)=ei(k)-aiiα(k)f1(ei(k))+∑j=1,(j≠i)Nwij(f2(xj(k+1)-f2(xj(k))),i=1,…,N.
From (34) and (35), we obtain
(36)|-aii(xi(k+1)-xi(k))|=|-aiiα(k)f1(ei(k))|>|f2(xi(k+1))-f2(xi(k))|×∑j=1,(j≠i)N|wji|,i=1,…,N,
which is equal to (15) in Proposition 1 and (20) in Proposition 4.
It is well known that the performance of Hopfield network may highly depend on the parameter setting of the weight matrix (e.g., [8]). There are various ways for determining the weight matrix of the Hopfield networks: gradient-descent supervised learning (e.g., [16]), solving linear inequalities (e.g., [17, 18] among others), Hebb learning rule [19, 20], and so forth. How to design D-SP-SNN is out of the scope of this paper. The methods used for traditional Hopfield NN can also be used for the proposed networks D-SP-SNN. As far as the simulation results in Section 3 are concerned, we determine the matrices A, W, and b by using a Hebb learning-based algorithm [19] presented in Appendix B.
3. Simulation Results
In the simulation part, we examine the performance of the proposed D-SP-SNN in the area of associative memory systems and clustering problem. In Examples 7 and 8, we present some toy examples one with 8 neurons and one with 16 neurons, respectively, where the desired vectors are orthogonal. Lyapunov function of the HNN at time k is given as
(37)L(k)=-x(k)TWx(k)+x(k)Tb.
In Examples 7 and 8, we use discrete-time HNN just for comparison reasons, which is given by
(38)x(k+1)=sign(Wx(k)),
where W is the weight matrix and x(k) is the state at time k, and at most one state is updated at a time.
Example 7.
In this example of discrete-time networks, there are 8 neurons. The desired prototype vectors are as follows:
(39)D=[1111-1-1-1-111-1-111-1-11-11-11-11-1].
The weight matrices A and W and the threshold vector b are obtained as follows by using the outer-product-based design (Hebb-learning [19]) presented in Appendix B and the slopes of sigmoid functions f1(·) and f2(·) are set to σ1=10, κ1=10, and σ2=2, κ2=1, respectively, and ρ=0, α=0.1:
(40)A=3I,W=[011-11-1-1-310-11-11-3-11-101-1-31-1-1110-3-1-111-1-1-3011-1-11-3-110-11-1-31-11-101-3-1-11-1110],b=0.
Figure 1 shows the percentages of correctly recovered desired patterns for all possible initial conditions x(k)∈(-1,+1)8, for the proposed networks D-SP-SNN as compared to traditional discrete Hopfield network. In the proposed network D-SP-SNN, f1(·) is a sigmoid function. Establishing an analogy to the traditional fixed step 1-bit increase/decrease power control algorithm (e.g. [21, 22]), we replace the sigmoid function by the sign function and call corresponding network as fixed-step pseudo-SINR neural network (FSPSNN). For comparison reason its performance is also shown in Figure 1.
As seen from Figure 1 the performance of the proposed network D-SP-SNN is remarkably better than that of the traditional discrete Hopfield network for all Hamming distance cases. The FSP-SNN also considerably outperforms the Hopfield network for 1 and 2 Hamming distance cases while the all the networks perform poorly (less than 20%) at 3-Hamming distance case.
The figure shows the percentage of correctly recovered desired patterns for all possible initial conditions in Example 7 for the proposed D-SP-SNN and FSP-SNN as compared to traditional Hopfield network (8-neuron case).
Example 8.
The desired prototype vectors are(41)D=[11111111-1-1-1-1-1-1-1-11111-1-1-1-11111-1-1-1-111-1-111-1-111-1-111-1-11-11-11-11-11-11-11-11-1].
The weight matrices A and W and threshold vector b are obtained as follows by using the outer-product-based design (Hebb-learning [19]) in Appendix B:(42)A=4I,W=[0220200-2200-20-2-2-4200202-2002-20-20-4-220020-2200-220-2-40-20220-2002-2002-4-2-20200-202200-2-2-4200-202-202002-20-4-202-200-2202002-2-40-20-220-20020220-4-2-20-2002200-20-2-2-40220200-202-20-20-4-2200202-200-220-2-40-220020-220-2002-4-2-200220-20020-2-2-4200-2200-20220-20-4-202-2002-202002-2-40-20-2200-2202002-4-2-20-2002-20020220],b=0.
Figure 2 shows the percentage of correctly recovered desired patterns for all possible initial conditions x(k)∈(-1,+1)16, in the proposed D-SP-SNN and FSP-SNN as compared to discrete Hopfield network.
The total number of different possible combinations for the initial conditions for this example is 64, 480, 2240 and 7280 for 1-, 2-, 3-, and 4-Hamming distance cases, respectively, which could be calculated by md×C(16,K), where md=4 and K= 1, 2, 3, and 4.
As seen from Figure 2 the performance of the proposed networks D-SP-SNN and FSP-SNN is the same as that of discrete Hopfield Network for 1-Hamming and 2-Hamming distance cases (%100 for all networks). However, the D-SP-SNN and FSP-SNN give better performance than the discrete Hopfield network does for 3- and 4- Hamming distance cases.
Typical plots for evolution of states in Example 8 by the D-SP-SNN are shown in Figure 3. The evolution of corresponding pseudo-SINRs is given by Figure 4. The figure shows that the pseudo-SINRs approach to constant value 1 as states converge to the equilibrium point.
Evolutions of the Lyapunov function in (37) for the states of Figure 3 in Example 8 are given in Figure 5. The figure shows that the proposed D-SP-SNN minimizes the Lyapunov function of Hopfield neural network with the same weight matrix.
The figure shows the percentage of correctly recovered desired patterns for all possible initial conditions in Example 8 for the proposed D-SP-SNN and its 1-bit version FSP-SNN as compared to traditional Hopfield network (16-neuron case).
Typical plot for evolutions of states (a) 1 to 8 and (b) 9 to 16 in Example 8 by the D-SP-SNN.
Evolutions of pseudo-SINRs for the states in Figure 3 in Example 8 by the D-SP-SNN. (a) Pseudo-SINRs 1 to 8 and (b) pseudo-SINRs 9 to 16.
Evolution of Lyapunov function in (37) in Example 8 (N=16).
Example 9.
In Examples 7 and 8, the desired vectors are orthogonal. In this example, the desired vectors represent numbers 1, 2, 3, and 4, which are not orthogonal to each other. The numbers are represented by 25 neurons. The weight matrix is determined by the Hebb learning as in the previous examples. In the rest of the examples in this paper, we set σ1=1, κ1=10, σ2=10, κ2=1, and α(k)=0.01, for all k.
Figure 6 shows desired pattern 1, a distorted pattern 1 where the Hamming Distance (HD) is 5, the result of HNN-Euler, and the result of the D-SP-SNN using the distorted pattern as initial condition. As seen from the figure, the proposed D-SP-SNN succeeds to recover the number while the HNN-Euler fails for the same parameters and weight matrix.
The evolutions of the Lyapunov function in (37) and the norm of the difference between the state vector and equilibrium point for pattern 1 in Figure 6 are shown in Figure 7. As seen from the figure, (i) the proposed D-SP-SNN minimizes the Lyapunov function of Hopfield neural network, and (ii) the proposed D-SP-SNN converges faster than its HNN-Euler counterpart with the same weight matrix for this example.
Figure 8 shows desired pattern 2, a distorted pattern 2 where the HD is 5, the result of HNN-Euler, and the result of D-SP-SNN using the distorted pattern as initial condition. As seen from the figure, the proposed D-SP-SNN succeeds to recover the number while the HNN-Euler fails for the same parameters and weight matrix.
Evolutions of states in Example 9 for pattern 2 by HNN-Euler and by D-SP-SNN are shown in Figures 9 and 10, respectively. The figures show that the states of proposed D-SP-SNN converge faster than those of its HNN-Euler counterpart for the same parameter settings.
Figure 11 shows the evolutions of pseudo-SINRs of states in Example 9 for pattern 2 by D-SP-SNN. The figure shows that the pseudo-SINRs approach to constant value 1 as states converge to the equilibrium point.
The evolutions of the norm of the difference between the state vector and equilibrium point for pattern 2 in Figure 8 are shown in Figure 12. As seen from the figure, the proposed D-SP-SNN converges much faster than its HNN-Euler counterpart.
Figure 13 shows desired pattern 3, a distorted pattern 3 where the HD is 5, the result of HNN-Euler, and the result of the D-SP-SNN using the distorted pattern as initial condition. As seen from the figure, the proposed D-SP-SNN succeeds to recover the number while its HNN-Euler counterpart fails for the same parameters and weight matrix.
The evolutions of the Lyapunov function and the norm of the difference between the state vector and equilibrium point for pattern 3 in Figure 13 are shown in Figure 14. As seen from the figure, (i) the proposed D-SP-SNN minimizes the Lyapunov function of Hopfield neural network, and (ii) the proposed D-SP-SNN converges faster than its HNN-Euler counterpart with the same weight matrix for this example.
Figure 15 shows desired pattern 4, a distorted pattern 4 where the HD is 5, the result of HNN-Euler, and the result of the D-SP-SNN using the distorted pattern as initial condition. As seen from the figure, the proposed D-SP-SNN succeeds to recover the number while its HNN-Euler counterpart fails for the same parameters settings.
The evolutions of the Lyapunov function and the norm of the difference between the state vector and equilibrium point for pattern 4 in Figure 15 are shown in Figure 16. As seen from the figure, (i) the proposed D-SP-SNN minimizes the Lyapunov function of Hopfield Neural Network, and (ii) the proposed D-SP-SNN converges faster than its HNN-Euler counterpart with the same weight matrix for this example.
(a) Desired pattern 1, distorted pattern 1 (HD = 5), result of HNN-Euler, and result of D-SP-SNN in Example 9.
Evolution of (a) Lyapunov function in (37) and (b) norm of the difference between the state vector and equilibrium point in Example 9 for pattern 1 (N=25).
(a) Desired pattern 2, (b) distorted pattern 2 (HD = 5), (c) result of HNN-Euler, and (d) Result of D-SP-SNN in Example 9.
Evolutions of states (a) 1 to 8, (b) 9 to 16, and (c) 17 to 24 in Example 9 for pattern 2 by HNN-Euler.
Evolutions of states (a) 1 to 8, (b) 9 to 16, and (c) 17 to 24 in Example 9 for pattern 2 by D-SP-SNN.
Evolutions of pseudo-SINRs of states (a) 1 to 8, (b) 9 to 16, and (c) 17 to 24 in Example 9 for pattern 2 by D-SP-SNN.
Evolutions of the norm of the difference between the state vector and equilibrium point in Example 9 for pattern 2 (N=25).
(a) Desired pattern 3, (b) distorted pattern 3 (HD = 5), (c) result of HNN-Euler, and (d) result of D-SP-SNN in Example 9.
Evolution of (a) Lyapunov function and (b) norm of the difference between the state vector and equilibrium point in Example 9 for pattern 3 (N=25).
(a) Desired pattern 4, (b) distorted pattern 4 (HD = 5), (c) result of HNN-Euler, and (d) result of D-SP-SNN in Example 9.
Evolutions of (a) Lyapunov function and (b) norm of the difference between the state vector and equilibrium point in Example 9 for pattern 4 (N=25).
Example 10.
In this and in the following example, we examine the performance of the proposed D-SP-SNN in clustering problem. Clustering is used in a wide range of applications, such as engineering, biology, marketing, information retrieval, social network analysis, image processing, text mining, finding communities, influencers, and leaders in online or offline social networks. Data clustering is a technique that enables dividing large amounts of data into groups/clusters in an unsupervised manner such that the data points in the same group/cluster are similar and those in different clusters are dissimilar according to some defined similarity criteria. The clustering problem is an NP-complete, and its general solution even for 2-clustering case is not known. It is well known that the clustering problem can be formulated in the form of the Lyapunov function of the HNN. The weight matrix is chosen as the distance matrix of the dataset and is the same for both HNN-Euler and D-SP-SNN.
In what follows, we compare the performance of the proposed D-SP-SNN as compared to its HNN-Euler counterpart as applied to clustering problems for the very same parameter settings. Two-dimensional 16 data points to be bisected are shown in Figure 17. The clustering results are also shown in Figure 17. As seen from the figure, the D-SP-SNN finds the optimum solution for this toy example. HNN-Euler also gives the same solution.
The evolutions of states in the clustering by HNN-Euler and by D-SP-SNN are shown in Figures 18 and 19, respectively. As seen from the figures, the states of the proposed D-SP-SNN converge faster that those of its HNN-Euler counterpart.
The evolutions of psuedo-SINRs of states in the clustering by D-SP-SNN in Example 10 (N=16) are given by Figure 20. The figure shows that the pseudo-SINRs approach to constant value 1 as states converge to the equilibrium point.
The evolutions of Lyapunov function and the norm of the difference between the state vector and equilibrium point in Example 10 are given in Figure 21. The figure confirms that (i) the proposed D-SP-SNN minimizes the Lyapunov function of Hopfield neural network and (ii) the proposed D-SP-SNN converges faster than its HNN-Euler counterpart with the same weight matrix.
Result of clustering by D-SP-SNN, N=16 in Example 10.
Evolutions of states (a) 1 to 8 and (b) 9 to 16 in the clustering by HNN-Euler in Example 10 (N=16).
Evolutions of states (a) 1 to 8 and (b) 9 to 16 in the clustering by D-SP-SNN in Example 10 (N=16).
Evolutions of psuedo-SINRs of states (a) 1 to 8 and (b) 9 to 16 in the clustering by D-SP-SNN in Example 10 (N=16).
Evolution of (a) Lyapunov function and (b) norm of the difference between the state vector and equilibrium point in Example 10 (N=16).
Example 11.
In this example, there are 40 data points as shown in Figure 22. The figure also shows the bisecting clustering results by k-means algorithm and the proposed D-SP-SNN. As seen from the figure, while k-means fail to find the optimum clustering solution for this example (for a randomly given initial values), the proposed D-SP-SNN succeeds in finding the optimum solution (for the same initial values).
Figure 23 shows the evolution of pseudo-SINRs of states by the D-SP-SNN. The figure shows that the pseudo-SINRs approach to constant value 1 as states converge to the equilibrium point, as before.
Evolutions of the Lyapunov function and the norm of the difference between the state vector and equilibrium point are shown in Figure 24. The figure confirms the superior convergence speed of the D-SP-SNN as compared to its HNN-Euler counterpart.
Bisecting clustering results by (a) k-means and (b) D-SP-SNN in Example 11 (N=40).
Evolution of pseudo-SINRs of states 1 to 8 in Example 11 by the D-SP-SNN, (N=40).
Evolution of (a) Lyapunov function and (b) norm of the difference between the state vector and equilibrium point in Example 11 (N=40).
4. Conclusions
In this paper, we present and analyze a discrete recurrent nonlinear system which includes the Hopfield neural networks [1, 2] and the nonlinear sigmoid power control algorithm for cellular radio systems in [13], as special cases by properly choosing the functions. This paper extends the results in [11], which are for autonomous linear systems, to nonlinear case. The proposed system can be viewed as a discrete-time realization of a recently proposed continuous-time network in [12]. In this paper, we focus on discrete-time analysis and present various novel key results concerning the discrete-time dynamics of the proposed system, some of which are as follows: (i) the proposed network is shown to be stable in synchronous and asynchronous work mode in discrete time; (ii) a novel concept called Pseudo-SINR (pseudo-signal-to-interference-noise ratio) is introduced for discrete-time nonlinear systems; (iii) it is shown that when the network approaches one of its equilibrium points, the instantaneous Pseudo-SINRs become equal to a constant target value.
The simulation results confirm the novel results (e.g., Pseudo-SINR convergence, etc.) presented and show a superior performance of the proposed network as compared to its Hopfield network counterpart in various associative memory systems and clustering examples. Moreover, the results show that the proposed network minimizes the Lyapunov function of the Hopfield neural networks. The disadvantage of the D-SP-SNN is that it increases the computational burden.
AppendicesA. Lipschitz Constant of the Sigmoid Function
In what follows, we will show the sigmoid function (f(a)=1-(2/(1+exp(-σa))),σ>0) that has the global Lipschitz constant k=0.5σ. Since f(·) is a differentiable function, we can apply the mean value theorem:
(A.1)f(a)-f(b)=(a-b)f′(μa+(1-μ)(b-a))withμ∈[0,1].
The derivative of f(·) is f′(a)=-2σ/eσa(1+e-σa)2 whose maximum is at the point a=0; that is, |f′(a)|≤0.5σ. So we obtain the following inequality:
(A.2)|f(a)-f(b)|≤k|a-b|,
where k=0.5σ is the global Lipschitz constant of the sigmoid function.
B. Outer Product-Based Network Design
Let us assume that L desired prototype vectors are orthogonal and each element of a prototype vector is either −1 or +1.
Step 1.
Calculate the sum of outer products of the prototype vectors (Hebb Rule, [19])
(B.1)Q=∑s=1LdsdsT.
Step 2.
Determine the diagonal matrix A and W as follows:
(B.2)aij={qii+ρifi=j,0ifi≠j,i,j=1,…,N,
where ρ is a real number and
(B.3)wij={0ifi=j,qijifi≠j,i,j=1,…,N,
where qij shows the entries of matrix Q, N is the dimension of the vector x, and L is the number of the prototype vectors (N>L>0). In (B.2), qii=L from (B.1) since {ds} is from (-1,+1)N. It is observed that ρ=0 gives relatively good performance; however, by examining the nonlinear state equations in Section 2, it can be seen that the proposed networks D-SP-SNN and FSP-SNN contain the prototype vectors at their equilibrium points for a relatively large interval of ρ.
Another choice of ρ in (B.2) is ρ=N-2L which yields aii=N-L. In what follows we show that this choice also assures that {dj}j=1L are the equilibrium points of the networks.
From (B.1)–(B.3)
(B.4)[-A+W]=-(N-L)I+∑s=1LdsdsT-LI,
where I represents the identity matrix. Since ds∈(-1,+1)N, then ||ds||22=N. Using (B.4) and the orthogonality properties of the set {ds}s=1L gives the following:
(B.5)[-A+W]ds=-(N-L)ds+(N-L)ds=0.
So, the prototype vectors {dj}j=1L correspond to equilibrium points.
HopfieldJ. J.Neural networks and physical systems with emergent collective computational abilities198279825542558MR6520332-s2.0-0020118274HopfieldJ. J.TankD. W.‘Neural’ computation of decisons in optimization problems1985523141152MR824597MatsudaS.Optimal Hopfield network for combinatorial optimization with linear cost function199896131913302-s2.0-0032207549SmithK.PalaniswamiM.KrishnamoorthyM.Neural techniques for combinatorial optimization with applications199896130113182-s2.0-0032208429PaikJ. K.KatsaggelosA. K.Image restoration using a modified Hopfield network19921149632-s2.0-0026679070LendarisG. G.MathiaK.SaeksR.Linear Hopfield networks and constrained optimization19992911141182-s2.0-0033079282FarrellJ. A.MichelA. N.A synthesis procedure for Hopfield's continuous-time associative memory1990377877884MR1061873ZBL0715.940212-s2.0-002546528610.1109/31.55063MüezzinogluM. K.GüzelişC.ZuradaJ. M.An energy function-based design method for discrete Hopfield associative memory with attractive fixed points20051623703782-s2.0-1534434440910.1109/TNN.2004.841775ZuradaJ. M.1992West Publishing CompanyVidyasagarM.Location and stability of the high-gain equilibria of nonlinear neural networks1993446606722-s2.0-002763262810.1109/72.238320UykanZ.On the SIRs (“Signal” -to- “Interference” -Ratio) in discrete-time autonomous linear networksProceedings of the 1st International Conference on Advanced Cognitive Technologies and Applications (COGNITIVE '09)November, 2009Athens, GreeceUykanZ.Fast convergent double-sigmoid hopfield neural network as applied to optimization problems2013246990996UykanZ.KoivoH. N.Sigmoid-basis nonlinear power-control algorithm for mobile radio systems20045312652712-s2.0-124233090010.1109/TVT.2003.822327ZanderJ.Performance of optimum transmitter power control in cellular radio systems1992415762ZanderJ.Distributed cochannel interference control in cellular radio systems199241305311HaykinS.1999Macmillanvan den BergJ.The most general framework of continuous Hopfield neural networksProceedings of the 1st International Workshop on Neural Networks for Identification, Control, Robotics, and Signal/Image Processing (NICROSP '96)August 1996921002-s2.0-0029771068HarrerH.NossekJ. A.ZouF.A learning algorithm for time-discrete cellular neural networksProceedings of the IEEE International Joint Conference on Neural Networks (IJCNN '91)November 19917177222-s2.0-0027100046HebbD. O.1949New York, NY, USAJohn Wiley & SonsMüezzinoǧluM. K.GüzelişC.A Boolean Hebb rule for binary associative memory design20041511952022-s2.0-604422798910.1109/TNN.2003.820669HerdtnerJ. D.ChongE. K. P.Analysis of a class of distributed asynchronous power control algorithms for cellular wireless systems20001834364462-s2.0-003374324110.1109/49.840202KimD.On the convergence of fixed-step power control algorithms with binary feedback for mobile communication systems20014922492522-s2.0-003524836510.1109/26.905878