A gradient-based neural network (GNN) is improved and presented for solving linear algebraic equations. Such a GNN model is then exploited for the online solution of convex quadratic programming (QP) problems with equality constraints, by means of the Lagrangian function and the Karush-Kuhn-Tucker (KKT) conditions. In view of the electronic architecture of such a GNN, its performance can be enhanced by adopting different activation-function arrays and/or design parameters. Computer-simulation results substantiate that such a GNN can obtain the accurate solution of the QP problem in an effective manner.
1. Introduction
A variety of problems in scientific research and practical applications can be reduced to the solution of a matrix equation [1–6]. For example, the stability and perturbation analysis of a control system can be viewed as the solution of a Sylvester matrix equation [1, 2], and the stability and/or robustness properties of a control system can be obtained by solving Lyapunov matrix equations [4–6]. Therefore, the real-time solution of matrix equations plays a fundamental role in numerous fields of science, engineering, and business.
Many numerical algorithms have been proposed for solving matrix equations. In general, the minimal number of arithmetic operations of such numerical algorithms is proportional to the cube of the dimension of the coefficient matrix, that is, O(N³) [7]. To meet low-complexity and real-time requirements, numerous novel neural networks amenable to hardware implementation have recently been exploited [2, 4, 5, 8–13]. For example, Tank and Hopfield solved linear programming problems by using their proposed Hopfield neural network (HNN) [9], which promoted the development of neural networks for optimization and other application problems. Wang [10] proposed a class of recurrent neural network (RNN) models for the online solution of linear simultaneous equations via parallel-processing circuit implementation. In previous work [2, 4, 5, 11], based on Zhang's design method, a new type of RNN model was proposed for the real-time solution of linear matrix-vector equations with time-varying coefficient matrices.
In this paper, based on the Wang neural network [10], we present an improved gradient-based neural model for linear simultaneous equations, and such a neural model is then applied to solve quadratic programming with equality constraints. Much investigation and analysis of the Wang neural network have been presented in previous work [10, 12, 13]. To make full use of the Wang neural network, we transform the convex quadratic program into a general linear matrix equation. Moreover, inspired by the design method of Zhang neural networks [2, 4, 5, 11, 12], the gradient-based neural network (GNN), that is, the Wang neural network, is improved, developed, and investigated for the online solution of convex quadratic programming by means of the Lagrangian function and the Karush-Kuhn-Tucker (KKT) conditions. The computer-simulation results in Section 5 show that, by improving their structures, we can also obtain better performance from existing neural network models.
2. Neural Model for Linear Simultaneous Equations
In this section, a gradient-based neural network (GNN) model is presented for the linear simultaneous equations:
Ax = b,    (1)
where the nonsingular coefficient matrix A := [a_{ij}] ∈ R^{n×n} and the coefficient vector b := [b_1, b_2, …, b_n]^T ∈ R^n are given constants, and x ∈ R^n is the unknown vector to be solved for so that (1) holds true.
According to the traditional gradient-based algorithm [8, 10, 12], a scalar-valued norm-based energy function ε(x) := ‖Ax − b‖₂²/2 is first constructed; then, evolving along the descent direction −∂ε/∂x = −A^T(Ax − b) of this energy function, we obtain the linear GNN model for the solution of linear algebraic equation (1); that is,
\dot{x} = -\gamma A^{T}(Ax - b),    (2)
where γ > 0 denotes a constant design parameter (or learning rate) used to scale the convergence rate. To improve the convergence performance of the neural network, inspired by Zhang's neural networks [2, 4, 5, 11, 12], the linear model (2) can be improved and reformulated into the following general nonlinear form:
\dot{x} = -\Gamma A^{T}\mathcal{F}(Ax - b),    (3)
where the design parameter Γ ∈ R^{n×n} is a positive-definite matrix used to scale the convergence rate of the solution. For simplicity, we can use γI in place of Γ with scalar γ > 0 [4, 11]. In addition, the activation-function array F(·): R^n → R^n is a vector-valued mapping in which each scalar-valued processing unit f(·) is a monotonically increasing odd function. In general, four basic types of activation functions (linear, power, sigmoid, and power-sigmoid) can be used for the construction of neural solvers [4, 11]. The behavior of these four functions is exhibited in Figure 1, which shows that different convergence performance can be achieved by using different activation functions. Furthermore, new activation functions can readily be generated from these four basic ones. As for the neural model (3), we have the following theorem.
Behavior of the four basic activation functions: linear, sigmoid, power, and power-sigmoid functions.
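As a concrete illustration, the four basic activation functions and one forward-Euler step of model (3) can be sketched in Python. This is a minimal sketch only: the parameter values p = 3 and ξ = 4, and the scaling that makes the power-sigmoid branches meet at |e| = 1, are our own illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

# Four basic activation functions, applied elementwise; each scalar
# processing unit f(.) is a monotonically increasing odd function,
# as model (3) requires.
def f_linear(e):
    return e

def f_power(e, p=3):
    # odd power function; p is assumed to be an odd integer >= 3
    return e ** p

def f_sigmoid(e, xi=4.0):
    # bipolar sigmoid, scaled so that f(1) = 1 (an illustrative choice)
    s = (1.0 - np.exp(-xi * e)) / (1.0 + np.exp(-xi * e))
    return s * (1.0 + np.exp(-xi)) / (1.0 - np.exp(-xi))

def f_power_sigmoid(e, p=3, xi=4.0):
    # power branch for |e| > 1, sigmoid branch for |e| <= 1;
    # the scaling in f_sigmoid makes the two branches meet at |e| = 1
    return np.where(np.abs(e) > 1.0, e ** p, f_sigmoid(e, xi))

def gnn_step(x, A, b, f, gamma=1.0, dt=0.01):
    """One forward-Euler step of model (3) with Gamma = gamma * I:
    x_dot = -gamma * A^T F(Ax - b)."""
    return x - dt * gamma * (A.T @ f(A @ x - b))
```

Iterating `gnn_step` with any of these activations drives the residual Ax − b toward zero for a nonsingular A, as Theorem 1 below establishes.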
Theorem 1.
Consider a constant nonsingular coefficient matrix A ∈ R^{n×n} and a coefficient vector b ∈ R^n. If a monotonically increasing odd activation-function array F(·) is used, then the neural state x(t) of neural model (3), starting from any initial state x(0) ∈ R^n, converges to the unique solution x* = A⁻¹b of linear equation (1).
Proof.
Let the solution error be x̂(t) = x(t) − x*. For brevity, the argument t is hereafter omitted. Then, since Ax* = b, model (3) with Γ = γI (taken for simplicity) yields the error dynamics

\dot{\hat{x}} = -\gamma A^{T}\mathcal{F}(A\hat{x}),    (4)

whose ith entry can be written as

\dot{\hat{x}}_i = -\gamma \sum_{j=1}^{n} a_{ji}\, f\Big(\sum_{k=1}^{n} a_{jk}\hat{x}_k\Big), \quad i \in \{1,2,\ldots,n\}.    (5)

To analyze system (4)-(5), we define the Lyapunov candidate function v(t) = \hat{x}^{T}\hat{x}/2. Obviously, v(t) > 0 for x̂ ≠ 0, and v(t) = 0 only for x̂ = 0; that is, v(t) is positive definite. Furthermore, combining (4), the time derivative of v(t) is

\dot{v}(t) = \hat{x}^{T}\dot{\hat{x}} = -\gamma \hat{x}^{T}A^{T}\mathcal{F}(A\hat{x}) = -\gamma y^{T}\mathcal{F}(y),    (6)

where y := Ax̂. Since f(·) is an odd monotonically increasing function, we have f(−u) = −f(u) and

f(u) > 0 if u > 0,   f(u) = 0 if u = 0,   f(u) < 0 if u < 0.    (7)

Therefore, y^{T}\mathcal{F}(y) = \sum_{j=1}^{n} y_j f(y_j) > 0 if y ≠ 0, and y^{T}\mathcal{F}(y) = 0 if and only if y = 0. Since A is nonsingular, y = Ax̂ = 0 holds if and only if x̂ = 0. In other words, the time derivative \dot{v}(t) = −γ y^{T}\mathcal{F}(y) is negative definite with respect to x̂. By Lyapunov theory [14, 15], the solution error converges to zero; that is, x̂(t) = x(t) − x* → 0 as t → ∞. Therefore, the neural state x(t) of neural model (3) converges to the unique solution x* = A⁻¹b of linear equation (1). The proof on the convergence of neural model (3) is thus completed.
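The Lyapunov argument above can also be checked numerically. The following sketch integrates model (3) with a power-sigmoid activation (an assumed parameterization with p = 3 and ξ = 4, made continuous at |e| = 1) by the forward-Euler method on a hypothetical well-conditioned coefficient matrix, and records the Lyapunov candidate v(t) = ‖x(t) − x*‖²/2 along the trajectory.

```python
import numpy as np

def f_ps(e, p=3, xi=4.0):
    # power-sigmoid activation: power branch for |e| > 1, sigmoid branch
    # (scaled to be continuous at |e| = 1) otherwise -- an assumed form
    sig = (1.0 - np.exp(-xi * e)) / (1.0 + np.exp(-xi * e))
    scale = (1.0 + np.exp(-xi)) / (1.0 - np.exp(-xi))
    return np.where(np.abs(e) > 1.0, e ** p, scale * sig)

rng = np.random.default_rng(1)
A = np.diag([3.0, 4.0, 5.0]) + 0.3 * rng.standard_normal((3, 3))
b = 0.5 * rng.standard_normal(3)
x_star = np.linalg.solve(A, b)          # unique solution of Ax = b

gamma, dt = 1.0, 2e-4
x = np.zeros(3)
v_hist = []
for _ in range(100000):                 # integrate model (3) up to t = 20
    v_hist.append(0.5 * np.sum((x - x_star) ** 2))
    x = x - dt * gamma * (A.T @ f_ps(A @ x - b))
v = np.array(v_hist)
```

In accordance with the proof, v(t) never increases along the trajectory and the neural state converges to x*.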
3. Problem Formulation on Quadratic Programming
An optimization problem characterized by a quadratic objective function and linear constraints is called a quadratic programming (QP) problem [16–18]. In this paper, we consider the following quadratic programming problem with equality constraints:

\text{minimize } \frac{x^{T}Px}{2} + q^{T}x, \quad \text{subject to } Ax = b,    (8)
where P ∈ R^{n×n} is a positive-definite Hessian matrix, q ∈ R^n and b ∈ R^m are coefficient vectors, and A ∈ R^{m×n} is a full-row-rank matrix. These are the known constant coefficients of the QP problem (8) to be solved.
Accordingly, x ∈ R^n is the unknown vector to be found so that QP problem (8) is solved; in particular, if there is no constraint, (8) is called a quadratic minimization (QM) problem. Mathematically, the objective of (8) can be written as f(x) = (1/2)x^{T}Px + q^{T}x. For convenience of analysis, let x* denote the theoretical solution of QP problem (8).
To solve QP problem (8), let us first consider the following general form of quadratic programming:
\text{minimize } f(x), \quad \text{subject to } h_j(x) = 0, \; j \in \{1,2,\ldots,m\}.    (9)
As for (9), a Lagrangian function could be defined as
L(x,\lambda) = f(x) + \sum_{j=1}^{m} \lambda_j h_j(x) = f(x) + \lambda^{T}h(x),    (10)
where λ = [λ_1, λ_2, …, λ_m]^T denotes the Lagrangian-multiplier vector and h(x) = [h_1(x), h_2(x), …, h_m(x)]^T denotes the equality constraints. Furthermore, applying the Karush-Kuhn-Tucker (KKT) conditions to the above Lagrangian function, with f and h as in (8), we have
\frac{\partial L(x,\lambda)}{\partial x} = Px + q + A^{T}\lambda = 0, \qquad \frac{\partial L(x,\lambda)}{\partial \lambda} = Ax - b = 0.    (11)
Then, (11) could be further formulated as the following matrix-vector form:
\tilde{P}\tilde{x} = -\tilde{q},    (12)
where

\tilde{P} := \begin{bmatrix} P & A^{T} \\ A & 0_{m\times m} \end{bmatrix} \in R^{(n+m)\times(n+m)}, \quad \tilde{x} := \begin{bmatrix} x \\ \lambda \end{bmatrix} \in R^{n+m}, \quad \tilde{q} := \begin{bmatrix} q \\ -b \end{bmatrix} \in R^{n+m}.

Therefore, we can obtain the solution x ∈ R^n of (8) by transforming QP problem (8) into matrix-vector equation (12), which is a linear equation of the same form as the linear simultaneous equations (1); we can thus make full use of the neural solvers presented in Section 2 to solve QP problem (8). Moreover, the first n elements of the solution x̃(t) of (12) constitute the neural solution x(t) of (8), and the last m elements constitute the Lagrangian-multiplier vector λ.
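The transformation (8) → (12) is straightforward to sketch in code. The Python fragment below builds P̃ and q̃ and solves (12) directly; the small QP used here (minimize x₁² + x₂² − 2x₁ − 4x₂ subject to x₁ + x₂ = 1) is our own hypothetical example, not taken from the paper.

```python
import numpy as np

def kkt_system(P, q, A, b):
    """Stack the KKT conditions (11) into the augmented system (12)."""
    m = A.shape[0]
    P_tilde = np.block([[P, A.T],
                        [A, np.zeros((m, m))]])
    q_tilde = np.concatenate([q, -b])
    return P_tilde, q_tilde

# hypothetical QP: minimize x1^2 + x2^2 - 2*x1 - 4*x2  s.t.  x1 + x2 = 1
P = 2.0 * np.eye(2)
q = np.array([-2.0, -4.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

P_t, q_t = kkt_system(P, q, A, b)
x_tilde = np.linalg.solve(P_t, -q_t)    # solve (12) directly
x_opt, lam = x_tilde[:2], x_tilde[2:]   # first n entries: x*, last m: lambda*
```

Solving (12) recovers both the minimizer and the Lagrangian multiplier in one linear solve, which is exactly the structure the neural solvers of Section 2 exploit.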
4. Application to QP Problem Solving
For convenience of analysis and comparison, let x̃* = [x*^T, λ*^T]^T denote the theoretical solution of (12). Since QP problem (8) can be formulated in the matrix-vector form (12), we can directly utilize the neural solvers (2) and (3) to solve (12). The neural solver (2) applied to (12) can thus be written in the following linear form:
\dot{\tilde{x}} = -\gamma \tilde{P}^{T}\tilde{P}\tilde{x} - \gamma \tilde{P}^{T}\tilde{q} = -\gamma \tilde{P}^{T}(\tilde{P}\tilde{x} + \tilde{q}).    (13)
If such a linear model is activated by a nonlinear activation-function array, we have
\dot{\tilde{x}} = -\Gamma \tilde{P}^{T}\mathcal{F}(\tilde{P}\tilde{x} + \tilde{q}).    (14)
In addition, according to model (14), we can draw its architecture for electronic realization, as illustrated in Figure 2. From model (14) and Figure 2, we readily see that different performance of (14) can be achieved by using different activation-function arrays F(·) and design parameters Γ. In the next section, the four basic functions mentioned previously are used to simulate model (14) and achieve different convergence performance. In addition, from Theorem 1 and [4, 12], we have the following theorem on the convergence of GNN model (14).
Block diagram of the GNN model (14).
Theorem 2.
Consider the time-invariant strictly convex quadratic programming problem (8). If a monotonically increasing odd activation-function array F(·) is used, then the neural state x̃(t) := [x^T, λ^T]^T of GNN model (14) globally converges to the theoretical solution x̃* := [x*^T, λ*^T]^T of the linear matrix-vector equation (12). Note that the first n elements of x̃(t) correspond to the theoretical solution x* of QP problem (8), and the last m elements to the Lagrangian-multiplier vector λ*.
5. Simulation and Verification
In this section, neural model (14) is applied to solve QP problem (8) in real time for verification. As an illustrative example, consider the following QP problem:
\begin{aligned} \text{minimize } & x_1^2 + 2x_2^2 + x_3^2 - 2x_1x_2 + x_3, \\ \text{subject to } & x_1 + x_2 + x_3 = 4, \\ & 2x_1 - x_2 + x_3 = 2, \\ & x_1 > 0, \; x_2 > 0, \; x_3 > 0. \end{aligned}    (15)
Obviously, we can write QP problem (15) in the equivalent matrix-vector form of (8) with the following coefficients:
P = \begin{bmatrix} 2 & -2 & 0 \\ -2 & 4 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad q = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & -1 & 1 \end{bmatrix}, \quad b = \begin{bmatrix} 4 \\ 2 \end{bmatrix}.    (16)
For analysis and comparison, we can utilize the MATLAB routine quadprog to obtain the theoretical solution of QP (15), that is, x* = [1.9091, 1.9545, 0.1364]^T.
According to Figure 2, GNN model (14) is applied to the real-time solution of QP problem (15), with a power-sigmoid activation-function array and design parameter γ = 1. As shown in Figure 3, starting from a randomly generated initial state x̃(0) ∈ [−2, 2]⁵, the neural state x̃(t) of GNN model (14) fits well with the theoretical solution after about 10 seconds. That is, GNN model (14) can achieve the exact solution. Note that the first n = 3 elements of the neural solution correspond to the theoretical solution x* = [1.9091, 1.9545, 0.1364]^T, while the last m = 2 elements constitute the Lagrangian-multiplier vector.
Neural state x̃(t) of QP problem (15) by GNN model (14) with a power-sigmoid activation function and design parameter γ = 1.
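The simulation above can be sketched as a short, reproducible Python script. This is a sketch only, with simplifications of our own choosing rather than the exact setting of Figure 3: it uses the linear activation array and a larger design parameter γ = 10 so that a plain forward-Euler run settles quickly, with the coefficients of (16).

```python
import numpy as np

# Coefficients (16) of QP problem (15)
P = np.array([[2.0, -2.0, 0.0], [-2.0, 4.0, 0.0], [0.0, 0.0, 2.0]])
q = np.array([0.0, 0.0, 1.0])
A = np.array([[1.0, 1.0, 1.0], [2.0, -1.0, 1.0]])
b = np.array([4.0, 2.0])

# Augmented system (12): P~ x~ = -q~
P_t = np.block([[P, A.T], [A, np.zeros((2, 2))]])
q_t = np.concatenate([q, -b])

# Integrate GNN model (14) with the linear activation, Gamma = gamma * I
rng = np.random.default_rng(42)
x_t = rng.uniform(-2.0, 2.0, 5)          # random initial state in [-2, 2]^5
gamma, dt = 10.0, 0.002
for _ in range(100000):                  # forward-Euler up to t = 200
    x_t -= dt * gamma * (P_t.T @ (P_t @ x_t + q_t))

x_sol, lam_sol = x_t[:3], x_t[3:]        # neural solution and multipliers
```

The first three entries of the settled state reproduce the theoretical solution x* = [1.9091, 1.9545, 0.1364]^T reported above, and the last two entries give the Lagrangian multipliers.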
In addition, the residual error ‖P̃x̃ + q̃‖₂ can be used to track the solution process. Figure 4 shows the trajectories of the residual error generated by GNN model (14) solving QP problem (15) when activated by different activation-function arrays, that is, the linear, power, sigmoid, and power-sigmoid functions, respectively. Obviously, under the same simulation environment (i.e., the same design parameter and the same GNN model (14)), different convergence performance is achieved when different activation-function arrays are used. As shown in Table 1, we use GNN_lin, GNN_power, GNN_sig, and GNN_ps to denote the residual-error performance of GNN model (14) activated by the linear, power, sigmoid, and power-sigmoid function arrays, respectively, and we have the following simulation results.
When the same design parameter γ is used, GNN_ps performs best, while the residual error of GNN_power is the largest. For example, when γ = 1, (GNN_ps = 3.68×10⁻³) < (GNN_lin = 1.74×10⁻²) < (GNN_sig = 2.07×10⁻²) < (GNN_power = 0.267).
When the same activation functions are used, the residual-error performance improves as the value of the design parameter γ increases. For example, when linear functions are used, the residual-error values are 1.74×10⁻², 1.89×10⁻³, and 7.24×10⁻⁴ for γ = 1, γ = 10, and γ = 100, respectively.
Performance (residual error) of GNN model (14) using different design parameters γ and activation functions.

γ    | GNN_lin   | GNN_power | GNN_sig   | GNN_ps
1    | 1.74×10⁻² | 0.267     | 2.07×10⁻² | 3.68×10⁻³
10   | 1.89×10⁻³ | 8.93×10⁻² | 2.56×10⁻³ | 1.03×10⁻³
100  | 7.24×10⁻⁴ | 2.65×10⁻² | 9.12×10⁻⁴ | 3.32×10⁻⁴
Online solution of QP problem (15) by GNN (14) with design parameter γ=1 and the four basic activations.
Among the four basic activation functions, the best convergence performance is achieved by the power-sigmoid functions under the same conditions. Therefore, GNN model (14) converges best when using the power-sigmoid function, whereas with the power function there remain apparent residual errors between the neural state x̃(t) and the theoretical solution x*. We thus generally use the power-sigmoid activation function to achieve superior convergence performance, as shown in Figure 3.
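The qualitative comparison above can be reproduced with a short experiment. The sketch below uses our own setup (forward-Euler integration up to t = 10, assumed activation parameters p = 3 and ξ = 4, initial state zero): it integrates GNN model (14) for QP problem (15) with γ = 1 under each of the four activation arrays and records the final residual ‖P̃x̃ + q̃‖₂.

```python
import numpy as np

def f_lin(e):
    return e

def f_pow(e):
    return e ** 3

def f_sig(e, xi=4.0):
    s = (1.0 - np.exp(-xi * e)) / (1.0 + np.exp(-xi * e))
    return s * (1.0 + np.exp(-xi)) / (1.0 - np.exp(-xi))

def f_ps(e):
    return np.where(np.abs(e) > 1.0, e ** 3, f_sig(e))

# QP problem (15) in the augmented form (12), coefficients as in (16)
P = np.array([[2.0, -2.0, 0.0], [-2.0, 4.0, 0.0], [0.0, 0.0, 2.0]])
A = np.array([[1.0, 1.0, 1.0], [2.0, -1.0, 1.0]])
P_t = np.block([[P, A.T], [A, np.zeros((2, 2))]])
q_t = np.concatenate([np.array([0.0, 0.0, 1.0]), -np.array([4.0, 2.0])])

gamma, dt, steps = 1.0, 1e-4, 100000     # integrate up to t = 10
residual = {}
for name, f in [("lin", f_lin), ("power", f_pow), ("sig", f_sig), ("ps", f_ps)]:
    x = np.zeros(5)                       # common initial state
    for _ in range(steps):
        x -= dt * gamma * (P_t.T @ f(P_t @ x + q_t))
    residual[name] = np.linalg.norm(P_t @ x + q_t)
```

Consistent with Table 1, the power-sigmoid array should end with a clearly smaller residual than the power array; the exact values depend on the discretization and the simulation horizon.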
6. Conclusions
On the basis of the Wang neural network, an improved gradient-based neural network has been presented for the real-time solution of the convex quadratic programming problem. Compared with the other three activation functions, the power-sigmoid function is the best choice for superior convergence performance. Computer-simulation results further substantiate that the presented GNN model can solve the convex QP problem accurately and efficiently.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant 61363076 and the Programs for Foundations of Jiangxi Province of China (GJJ13649, GJJ12367, GJJ13435, and 20122BAB211019) and partially supported by the Shenzhen Programs (JC201005270257A and JC201104220255A).
References

[1] Liu P., Zhang S., Li Q., "On the positive definite solutions of a nonlinear matrix equation."
[2] Zhang Y., Jiang D., Wang J., "A recurrent neural network for solving Sylvester equation with time-varying coefficients."
[3] Li Y., Reisslein M., Chakrabarti C., "Energy-efficient video transmission over a wireless link."
[4] Yi C., Chen Y., Lan X., "Comparison on neural solvers for the Lyapunov matrix equation with stationary & nonstationary coefficients."
[5] Ding F., Chen T., "Gradient based iterative algorithms for solving a class of matrix equations."
[6] Ghorbani M. A., Kisi O., Aalinezhad M., "A probe into the chaotic nature of daily streamflow time series by correlation dimension and largest Lyapunov methods."
[7] Zhang Y., Leithead W. E., "Exploiting Hessian matrix and trust-region algorithm in hyperparameters estimation of Gaussian process."
[8] Zou X., Tang Y., Bu S., Luo Z., Zhong S., "Neural-network-based approach for extracting eigenvectors and eigenvalues of real normal matrices and some extension to real matrices."
[9] Tank D. W., Hopfield J. J., "Simple neural optimization networks: an A/D converter, signal decision circuit, and a linear programming circuit."
[10] Wang J., "Electronic realisation of recurrent neural network for solving simultaneous linear equations."
[11] Zhang Y., Yi C., Ma W., "Simulation and verification of Zhang neural network for online time-varying matrix inversion."
[12] Yi C., Zhang Y., "Analogue recurrent neural network for linear algebraic equation solving."
[13] Chen K., "Robustness analysis of Wang neural network for online linear equation solving."
[14] Zhang Y., "Dual neural networks: design, analysis, and application to redundant robotics."
[15] Zhang Y., Wang J., "Global exponential stability of recurrent neural networks for synthesizing linear feedback control systems via pole assignment."
[16] Petrot N., "Some existence theorems for nonconvex variational inequalities problems."
[17] Burer S., Vandenbussche D., "A finite branch-and-bound algorithm for nonconvex quadratic programming via semidefinite relaxations."
[18] Dostál Z., Kučera R., "An optimal algorithm for minimization of quadratic functions with bounded spectrum subject to separable convex inequality and linear equality constraints."