Neural Solution of Linear Simultaneous Equations and Its Application to Convex Quadratic Programming with Equality Constraints

A gradient-based neural network (GNN) is improved and presented for solving linear algebraic equations. Such a GNN model is then applied to the online solution of convex quadratic programming (QP) with equality constraints, by means of the Lagrangian function and the Karush-Kuhn-Tucker (KKT) conditions. From the electronic architecture of such a GNN, it is known that the performance of the presented GNN can be enhanced by adopting different activation-function arrays and/or design parameters. Computer-simulation results substantiate that such a GNN obtains the accurate solution of the QP problem in an effective manner.


Introduction
A variety of scientific-research and practical-application problems can be finalized as the solving of a matrix equation [1][2][3][4][5][6]. For example, the stability and perturbation analysis of a control system can be viewed as the solution of a Sylvester matrix equation [1,2]; the stability and/or robustness properties of a control system can be obtained by solving Lyapunov matrix equations [4][5][6]. Therefore, the real-time solution of matrix equations plays a fundamental role in numerous fields of science, engineering, and business.
As for the solution of matrix equations, many numerical algorithms have been proposed. In general, the minimal number of arithmetic operations of such numerical algorithms is proportional to the cube of the coefficient-matrix dimension, that is, O(n^3) [7]. To satisfy low-complexity and real-time requirements, numerous novel neural networks have recently been exploited with hardware implementation in mind [2,4,5,8,9,10,11,12,13]. For example, Tank and Hopfield solved linear programming problems by using their proposed Hopfield neural network (HNN) [9], which promoted the development of neural networks in optimization and other application problems. Wang [10] proposed a class of recurrent neural network (RNN) models for the online solution of linear simultaneous equations with parallel-processing circuit implementation. In previous work [2,4,5,11], by Zhang's design method, a new type of RNN model was proposed for the real-time solution of linear matrix-vector equations with time-varying coefficient matrices.
In this paper, based on the Wang neural network [10], we present an improved gradient-based neural model for linear simultaneous equations; such a neural model is then applied to solve quadratic programming with equality constraints. Much investigation and analysis of the Wang neural network has been presented in previous work [10,12,13]. To make full use of the Wang neural network, we transform the convex quadratic programming into a general linear matrix equation. Moreover, inspired by the design method of Zhang neural networks [2,4,5,11,12], the gradient-based neural network (GNN), that is, the Wang neural network, is improved, developed, and investigated for the online solution of convex quadratic programming with the usage of the Lagrangian function and the Karush-Kuhn-Tucker (KKT) conditions. In Section 5, computer-simulation results show that, by improving their structures, better performance can also be obtained for existing neural-network models.

Neural Model for Linear Simultaneous Equations
In this section, a gradient-based neural network (GNN) model is presented for the linear simultaneous equations

A x(t) = b, (1)

where the nonsingular coefficient matrix A := [a_ij] ∈ R^{n×n} and the coefficient vector b := [b_1, b_2, b_3, ..., b_n]^T ∈ R^n are given as constants, and x(t) ∈ R^n is the unknown vector to be solved so as to make (1) hold true.
According to the traditional gradient-based algorithm [8,10,12], a scalar-valued norm-based energy function E(x) := ||A x − b||_2^2 / 2 is first constructed; then, evolving along the descent direction −∂E/∂x = −A^T(A x − b) of this energy function, we obtain the linear GNN model for the solution of linear algebraic equation (1):

ẋ(t) = −γ A^T (A x(t) − b), (2)

where γ > 0 denotes the constant design parameter (or learning rate) used to scale the convergence rate. To improve the convergence performance of the neural network, inspired by Zhang's neural networks [2,4,5,11,12], the linear model (2) can be improved and reformulated into the following general nonlinear form:

ẋ(t) = −Γ A^T F(A x(t) − b), (3)

where the design parameter Γ ∈ R^{n×n} is a positive-definite matrix used to scale the convergence rate of the solution.
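As a minimal numerical sketch (an illustration of this rewrite, not circuitry from the paper), the linear GNN model (2) can be simulated by forward-Euler integration; the coefficients below are hypothetical:

```python
import numpy as np

def linear_gnn(A, b, gamma=10.0, dt=1e-3, T=10.0):
    """Forward-Euler integration of the linear GNN model (2):
    x'(t) = -gamma * A^T (A x(t) - b)."""
    x = np.zeros(A.shape[1])                 # arbitrary initial state
    for _ in range(int(T / dt)):
        x -= dt * gamma * A.T @ (A @ x - b)
    return x

# Hypothetical well-conditioned coefficients (not from the paper)
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)
b = rng.standard_normal(4)
x = linear_gnn(A, b)
print(np.linalg.norm(A @ x - b))             # residual decays toward zero
```

Note that the product dt·γ must be kept small enough for the discretized dynamics to remain stable, which mirrors the role of the design parameter in the continuous-time model.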
For simplicity, we can use Γ = γI in place of Γ, with γ > 0 ∈ R [4,11]. In addition, the activation-function array F(·): R^n → R^n is a vector-valued mapping in which each scalar-valued processing unit f(·) is a monotonically increasing odd function.
In general, four basic types of activation functions (linear, power, sigmoid, and power-sigmoid) can be used for the construction of neural solvers [4,11]. The behavior of these four functions is exhibited in Figure 1, which shows that different convergence performance can be achieved by using different activation functions. Furthermore, new activation functions can readily be generated from the above four. As for neural model (3), we have the following theorem.

Theorem 1. Consider linear equation (1) with nonsingular coefficient matrix A. If a monotonically increasing odd activation-function array F(·) is used, the neural state x(t) of GNN model (3) globally converges to the unique theoretical solution x* = A^{−1} b of (1).

Proof. Let the solution error be x̃(t) = x(t) − x*; for brevity, argument t is omitted hereafter. Since A x* = b, it follows from (3) that the error dynamics are

dx̃/dt = −γ A^T F(A x̃), (4)

where Γ = γI for simplicity. The jth processing channel involves the error signal e_j = Σ_{k=1}^{n} a_jk x̃_k, that is, the jth entry of e := A x̃. Define the Lyapunov candidate function v = x̃^T x̃ / 2 = Σ_i x̃_i^2 / 2. Obviously, v > 0 for x̃ ≠ 0, and v = 0 only for x̃ = 0; thus v is a nonnegative function. Combining the error dynamics (4), the time derivative of v is

dv/dt = x̃^T dx̃/dt = −γ (A x̃)^T F(A x̃) = −γ Σ_{j=1}^{n} e_j f(e_j).

Since f(·) is an odd, monotonically increasing function, we have f(−e) = −f(e) and e f(e) > 0 for e ≠ 0. Therefore dv/dt ≤ 0, with dv/dt = 0 if and only if e = 0; and since A is nonsingular, e = A x̃ = 0 holds if and only if x̃ = 0. This guarantees that dv/dt is negative definite. By Lyapunov theory [14,15], the solution error converges to zero; that is, x̃(t) = x(t) − x* → 0 as time t → ∞. Therefore, the neural state x(t) of neural model (3) converges to the unique solution x* = A^{−1} b of linear equation (1). The proof on the convergence of neural model (3) is thus completed.
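The four basic activation functions, and the nonlinear GNN model (3) they drive, can be sketched numerically as follows. This is a simulation sketch with commonly used parameter choices ξ = 4 and p = 3, which are assumptions here rather than values fixed by the paper:

```python
import numpy as np

# Elementwise, odd, monotonically increasing activation functions
def f_linear(e):
    return e

def f_power(e, p=3):
    return e ** p                            # odd integer power p >= 3

def f_sigmoid(e, xi=4.0):
    # bipolar sigmoid, scaled so that f(1) = 1
    return np.tanh(xi * e / 2.0) / np.tanh(xi / 2.0)

def f_power_sigmoid(e, xi=4.0, p=3):
    # power branch for |e| >= 1, sigmoid branch for |e| < 1
    return np.where(np.abs(e) >= 1.0, f_power(e, p), f_sigmoid(e, xi))

def gnn(A, b, f, gamma=1.0, dt=1e-3, T=2.0):
    """Forward-Euler integration of nonlinear GNN model (3):
    x'(t) = -gamma * A^T F(A x(t) - b)."""
    x = np.zeros(A.shape[1])
    for _ in range(int(T / dt)):
        x -= dt * gamma * A.T @ f(A @ x - b)
    return x

# Hypothetical 2x2 example (not from the paper)
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
res = lambda x: np.linalg.norm(A @ x - b)
x_lin = gnn(A, b, f_linear)
x_pow = gnn(A, b, f_power)
x_ps  = gnn(A, b, f_power_sigmoid)
print(res(x_ps), res(x_lin), res(x_pow))
```

On this example the power-sigmoid array yields the smallest residual after the same run time, since its sigmoid branch amplifies small errors while the pure power function stalls once the error falls below one, consistent with the comparison reported later in the simulations.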

Problem Formulation on Quadratic Programming
An optimization problem characterized by a quadratic objective function and linear constraints is termed a quadratic programming (QP) problem [16][17][18]. In this paper, we consider the following QP problem with equality constraints:

minimize (1/2) x^T W x + c^T x, subject to A x = b, (8)

where W ∈ R^{n×n} is a positive-definite Hessian matrix, c ∈ R^n and b ∈ R^m are coefficient vectors, and A ∈ R^{m×n} is a full row-rank constraint matrix. These are the known constant coefficients of the QP problem (8) to be solved.
Therefore, x ∈ R^n is the unknown vector to be solved so as to minimize QP problem (8); in particular, if there is no constraint, (8) reduces to a quadratic-minimization (QM) problem, written mathematically as minimize f(x) = (1/2) x^T W x + c^T x. For analysis convenience, let x* denote the theoretical solution of QP problem (8).
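To make the subsequent derivation concrete, the conversion of the equality-constrained QP into a KKT linear system can be sketched as follows; the coefficients W, c, A, b below are hypothetical illustrations, not the paper's example:

```python
import numpy as np

def kkt_system(W, c, A, b):
    """Stack the KKT conditions of the equality-constrained QP,
        W x + c + A^T lam = 0,   A x = b,
    into one linear system P y = -q with y = [x; lam]."""
    m = A.shape[0]
    P = np.block([[W, A.T],
                  [A, np.zeros((m, m))]])
    q = np.concatenate([c, -b])
    return P, q

# Hypothetical coefficients (assumptions for illustration)
W = np.array([[4.0, 1.0], [1.0, 3.0]])       # positive-definite Hessian
c = np.array([-1.0, -2.0])
A = np.array([[1.0, 1.0]])                   # full row-rank constraint
b = np.array([1.0])

P, q = kkt_system(W, c, A, b)
y = np.linalg.solve(P, -q)                   # y[:2] = x*, y[2:] = lambda*
print(y)
```

Because W is positive definite and A has full row rank, the augmented matrix P is nonsingular, so the KKT system has a unique solution pairing the minimizer x* with its Lagrangian multiplier.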

Application to QP Problem Solving
For analysis and comparison convenience, let y* := [x*^T, λ*^T]^T denote the theoretical solution of (12). By the Lagrangian function L(x, λ) = (1/2) x^T W x + c^T x + λ^T (A x − b) and the KKT conditions, QP problem (8) can be formulated into the matrix-vector form

P y(t) + q = 0, with P := [W, A^T; A, 0] ∈ R^{(n+m)×(n+m)}, y := [x^T, λ^T]^T ∈ R^{n+m}, q := [c^T, −b^T]^T ∈ R^{n+m}, (12)

where λ ∈ R^m denotes the Lagrangian-multiplier vector. We can therefore directly utilize neural solvers (2) and (3) to solve (12). Neural solver (2) applied to (12) can be written in the linear form

ẏ(t) = −γ P^T (P y(t) + q).

If such a linear model is activated by a nonlinear function array, we have

ẏ(t) = −Γ P^T F(P y(t) + q). (14)

In addition, according to model (14), we can draw its architecture for electronic realization, as illustrated in Figure 2. From model (14) and Figure 2, we readily know that different performance of (14) can be achieved by using different activation-function arrays F(·) and design parameters Γ. In the next section, the four basic functions mentioned previously are used to simulate model (14) for achieving different convergence performance. In addition, from Theorem 1 and [4,12], we have the following theorem on the convergence performance of GNN model (14).
Theorem 2. Consider the time-invariant strictly-convex quadratic programming problem (8). If a monotonically increasing odd activation-function array F(·) is used, the neural state y(t) := [x^T(t), λ^T(t)]^T of GNN model (14) globally converges to the theoretical solution y* := [x*^T, λ*^T]^T of the linear matrix-vector form (12). Note that the first n elements of y(t) correspond to the theoretical solution x* of QP problem (8), and the last m elements are those of the Lagrangian-multiplier vector λ.
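A minimal simulation sketch of Theorem 2 follows, using a hypothetical QP and a linear activation array for brevity (both assumptions of this rewrite): the augmented neural state converges to the KKT solution.

```python
import numpy as np

# Hypothetical QP coefficients (not the paper's example)
W = np.array([[4.0, 1.0], [1.0, 3.0]])
c = np.array([-1.0, -2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# Augmented KKT system  P y + q = 0,  y = [x; lambda]
P = np.block([[W, A.T], [A, np.zeros((1, 1))]])
q = np.concatenate([c, -b])

# Forward-Euler integration of GNN model (14) with linear activation
gamma, dt, T = 10.0, 1e-4, 10.0
y = np.zeros(3)                              # state [x1, x2, lambda]
for _ in range(int(T / dt)):
    y -= dt * gamma * P.T @ (P @ y + q)

y_star = np.linalg.solve(P, -q)              # theoretical KKT solution
print(y, y_star)                             # should agree closely
```

The gradient dynamics remain convergent even though P itself is indefinite, because the error evolves under P^T P, which is positive definite whenever P is nonsingular.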

Simulation and Verification
In this section, neural model (14) is applied to solve QP problem (8) in real time for verification. As an illustrative example, consider QP problem (15), whose equivalent matrix-vector form (12) has dimensions n = 3 and m = 2. For analysis and comparison, we can utilize the MATLAB routine "quadprog" to obtain the theoretical solution of QP (15), that is, x* = [1.9091, 1.9545, 0.1364]^T. According to Figure 2, GNN model (14) is applied to the real-time solution of QP problem (15), together with the power-sigmoid function array and design parameter γ = 1. As shown in Figure 3, when starting from a randomly generated initial state y(0) ∈ [−2, 2]^5, the neural state y(t) of GNN model (14) fits well with the theoretical solution after 10 seconds or so; that is, GNN model (14) achieves the exact solution. Note that the first n = 3 elements of the neural solution correspond to the theoretical solution x* = [1.9091, 1.9545, 0.1364]^T, while the last m = 2 elements constitute the Lagrangian-multiplier vector.
In addition, the residual error ||P y(t) + q||_2 can be used to track the solution process. The trajectories of the residual error are shown in Figure 4, as generated by GNN model (14) solving QP problem (15) when activated by different activation-function arrays, that is, linear, power, sigmoid, and power-sigmoid functions, respectively. Obviously, we have the following observations.

(i) When the same design parameter γ is used, the performance of GNN_ps is the best, while the residual error of GNN_power is the biggest. For example, when design parameter γ = 1, (GNN_ps = 3.68×10⁻³) < (GNN_lin = 1.74×10⁻²) < (GNN_sig = 2.07×10⁻²) < (GNN_power = 0.267).
(ii) When the same activation functions are used, the residual error decreases as the value of design parameter γ increases. For example, when linear functions are used, the residual-error values are 1.74×10⁻², 1.89×10⁻³, and 7.24×10⁻⁴ for γ = 1, γ = 10, and γ = 100, respectively.
Among the four basic activation functions, the best convergence performance is achieved by the power-sigmoid functions under the same conditions. Therefore, GNN model (14) has the best convergence performance when using the power-sigmoid function, whereas when using the power function there remain apparent residual errors between the neural state y(t) and the theoretical solution. We thus generally use the power-sigmoid activation function to achieve superior convergence performance, as shown in Figure 3.
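The trend in observation (ii) can be reproduced numerically on a hypothetical QP (an illustration with assumed coefficients, not the paper's example (15)): with the same linear activation array and run time, the residual error shrinks monotonically as γ grows.

```python
import numpy as np

# Hypothetical augmented KKT coefficients (assumptions for illustration)
W = np.array([[4.0, 1.0], [1.0, 3.0]])
c = np.array([-1.0, -2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
P = np.block([[W, A.T], [A, np.zeros((1, 1))]])
q = np.concatenate([c, -b])

def residual_after(gamma, dt=1e-5, T=1.0):
    """Residual ||P y + q||_2 of the linearly activated GNN after time T."""
    y = np.zeros(3)
    for _ in range(int(T / dt)):
        y -= dt * gamma * P.T @ (P @ y + q)
    return np.linalg.norm(P @ y + q)

residuals = [residual_after(g) for g in (1.0, 10.0, 100.0)]
print(residuals)                             # strictly decreasing with gamma
```

The monotone decrease follows from the error dynamics: each eigenmode of P^T P decays as exp(−γλ_i t), so enlarging γ uniformly accelerates every mode.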

Conclusions
On the basis of the Wang neural network, an improved gradient-based neural network has been presented for the real-time solution of the convex quadratic programming problem with equality constraints. Compared with the other three activation functions, the power-sigmoid function is the best choice for superior convergence performance. Computer-simulation results further substantiate that the presented GNN model can solve the convex QP problem accurately and efficiently.

Figure 1: Behavior of the four basic activation functions.

Table 1: Performance of GNN model (14) with different design parameters γ and activation functions. As shown in Table 1, GNN_lin, GNN_power, GNN_sig, and GNN_ps denote the residual-error performance of GNN model (14) activated by the linear, power, sigmoid, and power-sigmoid function arrays, respectively, yielding the simulative results summarized above.