General Recurrent Neural Network for Solving Generalized Linear Matrix Equation

This brief proposes a general framework of nonlinear recurrent neural networks for the online solution of the generalized linear matrix equation (GLME) with a global convergence property. If the linear activation function is utilized, the neural state matrix of the recurrent neural network converges globally and exponentially to the unique theoretical solution of the GLME. In addition, two specific types of nonlinear activation functions are proposed for the general nonlinear recurrent neural network model to achieve convergence superior to that obtained with the linear activation function. Illustrative examples demonstrate the efficacy of the general nonlinear recurrent neural network model and its superior convergence when activated by these nonlinear activation functions.


Introduction
Solving the generalized linear matrix equation (GLME) and its variants is an important problem that is widely encountered in scientific and engineering areas (e.g., feedback control system design [1] and smart antenna array processing [2]). The well-known Lyapunov equation and Sylvester equation can be regarded as the main special cases of GLME with reduced numbers of coefficient and variable matrices, and they have drawn widespread interest from researchers and engineers in the past decades [3][4][5][6][7]. Without loss of generality, in this brief, the GLME problem is formulated in the following form:
$$\sum_{i} A_i X B_i = C, \qquad (1)$$
where $A_i$, $B_i$, and $C$ denote coefficient matrices and $X$ denotes the unknown matrix to be obtained. Usually, it is complicated to analyze, in the traditional numerical way, under what circumstances (1) is solvable. To guarantee that GLME (1) is solvable with a unique theoretical solution, the coefficient matrices $A_i$ and $B_i$ can be practically configured with their eigenvalues all being positive or all being negative simultaneously. In many cases, the number of solutions of (1) can be multiple or even none, depending on how the matrices $A_i$ and $B_i$ combine with the unknown matrix $X$. Many conventional serial approaches may not be efficient enough to solve GLME online because of their inherent drawbacks, and parallel computational approaches seem preferable [8][9][10][11][12][13].
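For small coefficient matrices, a candidate solution of GLME (1) can be cross-checked offline with a direct vectorized solve. The following is a minimal sketch rather than part of the proposed method. It assumes that (1) has the form $\sum_i A_i X B_i = C$ and relies on the standard identity $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$; the helper names `glme_operator`, `solve_glme_direct`, and `glme_residual` are hypothetical.

```python
import numpy as np

def glme_operator(A_list, B_list):
    """Matrix M such that M @ vec(X) == vec(sum_i A_i X B_i) (column-stacking vec)."""
    return sum(np.kron(B.T, A) for A, B in zip(A_list, B_list))

def solve_glme_direct(A_list, B_list, C):
    """Least-squares solve of the assumed form sum_i A_i X B_i = C via vectorization."""
    m, n = C.shape
    M = glme_operator(A_list, B_list)
    c = C.flatten(order="F")                      # column-stacking vec(C)
    x, *_ = np.linalg.lstsq(M, c, rcond=None)     # minimum-norm solution if M is singular
    return x.reshape((m, n), order="F")

def glme_residual(A_list, B_list, C, X):
    """Residual matrix sum_i A_i X B_i - C."""
    return sum(A @ X @ B for A, B in zip(A_list, B_list)) - C
```

Because the Kronecker matrix has $(mn)^2$ entries for an $m \times n$ unknown, this serial check scales poorly with problem size, which is in line with the motivation for parallel solvers given above.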
Regarded as another promising approach for parallel computation, dynamic neural networks based on analog solvers have been exploited comprehensively in computational intelligence fields [12, 14, 15, 16]. Different from a number of conventional numerical methods, approaches based on dynamic neural networks can be realized more readily on specific parallel and distributed software and/or hardware architectures [17,18]. This could greatly broaden the utility of current neural networks across potential application domains in high-performance computing. One basic type of dynamic neural network, the recurrent neural network, which is analogous to the natural transient and steady process, has been applied in online parallel computing tasks with large-scale analog/digital circuit prototypes [19].
Our main contribution in this brief is to develop a general framework of the recurrent neural network model to solve GLME (1). Since nonlinear phenomena occur frequently in neural network hardware implementation [19], the proposed general nonlinear framework may be more suitable for analog-based computation. The neural state of the general recurrent neural network can globally converge to the theoretical solutions. If the general recurrent neural network is activated by the linear function, exponential convergence can be achieved. On the other hand, certain nonlinear forms of such a general neural network may obtain more accurate solutions and faster convergence than its linear model, so we propose two specific nonlinear activation functions for the general recurrent neural network model to achieve superior performance in solving GLME (1).

General Recurrent Neural Network Solver
In this section, we present and analyze the general model of the recurrent neural network for solving GLME (1). If such a model is activated by the linear function, the state matrix $X(t)$ of the general recurrent neural network can globally and exponentially converge to the unique theoretical solution $X^*$. By exploiting specific nonlinear odd monotonically increasing activation functions, superior convergence is expected to be achieved. In the ensuing subsections, we discuss the convergence properties of the general nonlinear recurrent neural network model together with its linear form.

General Nonlinear Neural Network Model.
In this brief, the general nonlinear recurrent neural network model is proposed to solve GLME (1) as follows:
$$\dot{X}(t) = -\gamma \sum_{i} A_i^{T}\, \mathcal{F}\Big(\sum_{j} A_j X(t) B_j - C\Big) B_i^{T}, \qquad (2)$$
where $\gamma > 0$ is a design parameter, operator $\mathcal{F}(\cdot)$ denotes a nonlinear activation function array applied to a matrix entry by entry, with each scalar-valued mapping unit $f(\cdot): \mathbb{R} \to \mathbb{R}$ being a monotonically increasing odd activation function, and superscript $(\cdot)^{T}$ denotes the transpose of a matrix/vector. Such a recurrent neural network model (2) can be regarded as an extended nonlinear version of the recurrent neural network in [16] and of the ensuing linear model. For the general nonlinear recurrent neural network model (2), we have the following theorem.
Proof. Firstly, we define the distance between the neural state and the theoretical solution as $\tilde{X}(t) = X(t) - X^*$. Accordingly, by substituting $X(t) = \tilde{X}(t) + X^*$ into neural network model (2), the model can be equivalently transformed into the error dynamics (3). Next, the corresponding Lyapunov-function candidate $V(\tilde{X}(t), t)$ is defined, in which operators $\|\cdot\|_F$, $\|\cdot\|_2$, and $\otimes$, respectively, denote the Frobenius norm of a matrix, the two-norm of a vector, and the Kronecker product between matrices, and $\mathrm{vec}(\tilde{X}(t))$ generates a column vector obtained by stacking all column vectors of $\tilde{X}(t)$ together. The time derivative of $V(\tilde{X}(t), t)$ along (3) is given by (5). Considering the vectorization equality based on (2), (5) can be further derived in vectorized form. For the nonlinear activation function array $\mathcal{F}(\cdot)$, each individual scalar-valued entry $f(\cdot)$ is odd and monotonically increasing, which guarantees $u f(u) \geqslant 0$ for every scalar $u$. Thus, $\dot{V}(\tilde{X}(t), t)|_{(3)} \leqslant 0$, which implies that $\tilde{X}(t)$ globally converges to the zero matrix according to Lyapunov theory [20]; that is, the state matrix $X(t)$ of (2) globally converges to the theoretical solution(s) $X^*$ of GLME (1). This completes the proof.
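The chain of steps in the proof can be written out as follows. This is a sketch under stated assumptions, not a quotation of the original equations: it assumes GLME (1) has the form $\sum_i A_i X B_i = C$, that model (2) is the first line below, and that the Lyapunov candidate is half the squared Frobenius norm of the error; the symbols $M$, $\tilde{x}$, and $y$ are introduced here only for illustration.

```latex
% Sketch only: the model form, the choice of V, and the symbols M, \tilde{x}, y are assumptions.
\begin{align*}
\dot{X}(t) &= -\gamma \sum_i A_i^{T}\, \mathcal{F}\Big(\sum_j A_j X(t) B_j - C\Big) B_i^{T}, \\
\dot{\tilde{X}}(t) &= -\gamma \sum_i A_i^{T}\, \mathcal{F}\Big(\sum_j A_j \tilde{X}(t) B_j\Big) B_i^{T}
  \qquad \text{since } \sum_j A_j X^{*} B_j = C, \\
M &:= \sum_i B_i^{T} \otimes A_i, \qquad \tilde{x} := \mathrm{vec}(\tilde{X}), \qquad
  \mathrm{vec}(A X B) = (B^{T} \otimes A)\,\mathrm{vec}(X), \\
\dot{\tilde{x}} &= -\gamma\, M^{T} \mathcal{F}(M \tilde{x}), \\
V &= \tfrac{1}{2}\,\|\tilde{X}\|_F^{2} = \tfrac{1}{2}\,\|\tilde{x}\|_2^{2}, \\
\dot{V} &= \tilde{x}^{T} \dot{\tilde{x}}
         = -\gamma\, (M\tilde{x})^{T} \mathcal{F}(M\tilde{x})
         = -\gamma \sum_k y_k f(y_k) \leqslant 0, \qquad y := M\tilde{x}.
\end{align*}
```

The last inequality uses only $u f(u) \geqslant 0$, which holds for any odd monotonically increasing $f$. If $f$ is strictly increasing, $\dot{V} = 0$ only when $M\tilde{x} = 0$, that is, when $X(t)$ already satisfies (1).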
According to Theorem 1, the general nonlinear neural network model (2) can be activated by a number of odd monotonically increasing functions to solve GLME (1) whenever theoretical solutions exist (unique or multiple), which broadly enlarges the utility domain of (2) by allowing manifold models to be generated. As we may know, nonlinear elements are frequently encountered in analog/digital circuit prototypes of neural networks [19,21]; involving nonlinear activation functions can therefore be beneficial to potential design and implementation. On the other hand, faster convergence is indeed required for solving GLME (1) when the linear model might not satisfy increasing computational requirements. It is expected that the nonlinear neural network model (2) can attain convergence superior to that of (10) if proper activation functions are exploited. Before introducing the superior nonlinear-function-activated models, we herein address the linear model of the general nonlinear recurrent neural network and discuss its convergence property.
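To make the role of the activation function concrete, the dynamics can be simulated numerically. The following is a minimal sketch, assuming that model (2) has the form $\dot{X} = -\gamma \sum_i A_i^T \mathcal{F}(\sum_j A_j X B_j - C) B_i^T$ and replacing the analog dynamics with a forward-Euler discretization. The function names `rnn_rhs` and `simulate`, the step size, and the number of steps are illustrative choices, not values from the brief.

```python
import numpy as np

def rnn_rhs(X, A_list, B_list, C, gamma=1.0, f=lambda u: u):
    """Right-hand side of the assumed model (2); f is applied entrywise."""
    E = sum(A @ X @ B for A, B in zip(A_list, B_list)) - C   # residual of (1)
    FE = f(E)
    return -gamma * sum(A.T @ FE @ B.T for A, B in zip(A_list, B_list))

def simulate(X0, A_list, B_list, C, gamma=1.0, f=lambda u: u, dt=1e-3, steps=5000):
    """Forward-Euler integration of the assumed model (2) from initial state X0."""
    X = np.array(X0, dtype=float)
    for _ in range(steps):
        X = X + dt * rnn_rhs(X, A_list, B_list, C, gamma, f)
    return X
```

Passing `f=lambda u: u` gives the linear special case, while any other odd monotonically increasing `f` gives a nonlinear variant. Explicit Euler steps can become unstable for fast-growing activations and large residuals, so the step size `dt` has to be chosen with the activation in mind.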

Linear Neural Network Model.
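To solve GLME (1), we firstly define a scalar-valued error function $\varepsilon(t) = \|\sum_{i} A_i X(t) B_i - C\|_F^2/2 \in [0, +\infty)$ associated with (1), where operator $\|\cdot\|_F$ denotes the Frobenius norm. In order to drive the error function $\varepsilon(t)$ to zero as time $t$ increases, the gradient-descent manner is adopted:
$$\dot{X}(t) = -\gamma \frac{\partial \varepsilon(t)}{\partial X(t)}, \qquad (9)$$
where design parameter $\gamma > 0$ scales the convergence rate.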
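The gradient used in this descent can be checked numerically. The sketch below assumes the error function $\varepsilon = \|\sum_i A_i X B_i - C\|_F^2/2$ and compares the closed-form gradient $\sum_i A_i^T (\sum_j A_j X B_j - C) B_i^T$ with a central finite difference. All function names are hypothetical.

```python
import numpy as np

def error_energy(X, A_list, B_list, C):
    """Assumed error function: half the squared Frobenius norm of the residual."""
    E = sum(A @ X @ B for A, B in zip(A_list, B_list)) - C
    return 0.5 * np.linalg.norm(E, "fro") ** 2

def analytic_gradient(X, A_list, B_list, C):
    """Closed-form gradient of error_energy with respect to X."""
    E = sum(A @ X @ B for A, B in zip(A_list, B_list)) - C
    return sum(A.T @ E @ B.T for A, B in zip(A_list, B_list))

def numeric_gradient(X, A_list, B_list, C, h=1e-6):
    """Central finite-difference gradient, one entry of X at a time."""
    G = np.zeros_like(X, dtype=float)
    for idx in np.ndindex(*X.shape):
        dX = np.zeros_like(X, dtype=float)
        dX[idx] = h
        G[idx] = (error_energy(X + dX, A_list, B_list, C)
                  - error_energy(X - dX, A_list, B_list, C)) / (2 * h)
    return G
```

Agreement between the two gradients supports the step from the abstract descent law to an explicit matrix-valued dynamic equation.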
According to preliminaries on matrix-differential theory [22], (9) is further expanded to the following dynamic form:
$$\dot{X}(t) = -\gamma \sum_{i} A_i^{T} \Big(\sum_{j} A_j X(t) B_j - C\Big) B_i^{T}. \qquad (10)$$
For linear model (10), we have the following theorem.
Theorem 2. If linear model (10) is employed to solve GLME (1), starting from initial condition $X(0)$, the state matrix $X(t)$ of (10) can globally and exponentially converge to the unique theoretical solution $X^*$.
It is worth noting that if GLME (1) has multiple theoretical solutions $X^*$, the scalar governing the exponential rate equals zero. In this situation, the linear model (10) can still guarantee global convergence but no longer has an explicit exponential convergence rate.
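One natural candidate for such a rate scalar, stated here as an assumption, is the smallest eigenvalue $\alpha$ of $M^T M$ with $M = \sum_i B_i^T \otimes A_i$. For the vectorized linear flow $\dot{\tilde{x}} = -\gamma M^T M \tilde{x}$ this gives the standard bound $\|\tilde{x}(t)\|_2 \leqslant e^{-\gamma \alpha t}\|\tilde{x}(0)\|_2$, which is informative only when $\alpha > 0$. The sketch below computes this quantity; the function name `rate_scalar` is hypothetical.

```python
import numpy as np

def rate_scalar(A_list, B_list):
    """Smallest eigenvalue of M^T M, where M @ vec(X) == vec(sum_i A_i X B_i)."""
    M = sum(np.kron(B.T, A) for A, B in zip(A_list, B_list))
    return float(np.linalg.eigvalsh(M.T @ M).min())

# Nonsingular case: a unique solution exists and the scalar is positive.
alpha_unique = rate_scalar([np.diag([2.0, 1.0]), np.eye(2)],
                           [np.eye(2), np.diag([1.0, 2.0])])

# Singular case: solutions are not unique and the scalar vanishes.
alpha_multiple = rate_scalar([np.diag([1.0, 0.0]), np.diag([1.0, 0.0])],
                             [np.eye(2), np.eye(2)])
```

A vanishing value signals that part of the error (the component in the null space of $M$) is never reduced by the linear flow, which matches the loss of an explicit exponential rate in the multiple-solution case.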

Superior Convergence with Specific Nonlinear Activation Functions
According to Theorem 1, an odd monotonically increasing activation function guarantees global convergence of the general recurrent neural network (2). If the linear activation function is adopted, the general recurrent neural network model reduces to the linear model (10), which possesses the global exponential convergence property. In order to achieve convergence superior to the global exponential convergence of the linear model (10), specific types of nonlinear activation functions should be chosen properly. Owing to these considerations, two types of nonlinear activation functions, the power sum and hyperbolic sine functions, are proposed to activate the general recurrent neural network model (2). Figure 1 plots the three aforementioned activation functions used in (2). Correspondingly, we have the following theorems on the convergence properties of the two neural network models.
Theorem 3. If the general recurrent neural network (2) is activated by the power sum function $f(u) = \sum_{k=1}^{N} u^{2k-1}$, the state matrix $X(t)$ of (2) can globally and superiorly converge to the unique theoretical solution $X^*$, as compared with linear model (10).
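The three activation functions can be written down and sanity-checked in a few lines. This is a sketch under assumptions: the power sum follows the form $f(u) = \sum_{k=1}^{N} u^{2k-1}$, while the hyperbolic sine is taken as $\sinh(\xi u)$ with a scaling parameter named `xi` here, which is an assumed parameterization. The defaults `N=4` and `xi=3.0` are illustrative.

```python
import numpy as np

def linear(u):
    """Linear activation f(u) = u."""
    return u

def power_sum(u, N=4):
    """Power sum activation f(u) = u + u^3 + ... + u^(2N-1)."""
    return sum(u ** (2 * k - 1) for k in range(1, N + 1))

def hyperbolic_sine(u, xi=3.0):
    """Hyperbolic sine activation, assumed here to be sinh(xi * u)."""
    return np.sinh(xi * u)

# Sample the three curves on a symmetric grid.
u = np.linspace(-2.0, 2.0, 401)
curves = {f.__name__: f(u) for f in (linear, power_sum, hyperbolic_sine)}
```

All three are odd and monotonically increasing, so they fall under Theorem 1. Both nonlinear choices also satisfy $|f(u)| \geqslant |u|$ (for the assumed $\sinh(\xi u)$ this needs $\xi \geqslant 1$), which is the property that makes a Lyapunov function decay at least as fast as under the linear activation at the same state.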

Theorem 4. If the general recurrent neural network (
Proof. Similarly, a Lyapunov function is defined to investigate convergence, and its time derivative indicates that, when the hyperbolic sine activation function is employed, the nonlinear recurrent neural network model (2) possesses global convergence as its state matrix approaches zero, with a larger Lyapunov-function vanishing rate (i.e., faster convergence) than in the situation of linear model (10). This completes the proof.
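The faster vanishing rate can be observed on a toy problem. The following sketch assumes the model form $\dot{X} = -\gamma \sum_i A_i^T \mathcal{F}(\sum_j A_j X B_j - C) B_i^T$, a hyperbolic sine activation of the form $\sinh(\xi u)$, and a forward-Euler discretization. The $2 \times 2$ matrices, step size, and horizon are invented for illustration and are not the data of the examples in this brief.

```python
import numpy as np

def power_sum(u, N=4):
    """Power sum activation u + u^3 + ... + u^(2N-1)."""
    return sum(u ** (2 * k - 1) for k in range(1, N + 1))

def hyperbolic_sine(u, xi=3.0):
    """Assumed hyperbolic sine activation sinh(xi * u)."""
    return np.sinh(xi * u)

def final_error(f, gamma=1.0, dt=1e-4, steps=2000):
    """Frobenius error after Euler-integrating the assumed model (2) up to t = dt * steps."""
    A_list = [np.diag([2.0, 1.0]), np.eye(2)]      # illustrative coefficients
    B_list = [np.eye(2), np.diag([1.0, 2.0])]
    X_star = np.full((2, 2), 0.2)                  # known solution used to build C
    C = sum(A @ X_star @ B for A, B in zip(A_list, B_list))
    X = np.zeros((2, 2))
    for _ in range(steps):
        E = sum(A @ X @ B for A, B in zip(A_list, B_list)) - C
        X = X - dt * gamma * sum(A.T @ f(E) @ B.T for A, B in zip(A_list, B_list))
    return float(np.linalg.norm(X - X_star, "fro"))

errors = {
    "linear": final_error(lambda u: u),
    "power_sum": final_error(power_sum),
    "hyperbolic_sine": final_error(hyperbolic_sine),
}
```

With the same design parameter and the same horizon, both nonlinear activations leave a smaller error than the linear one in this toy setting. The small initial error and small step size are deliberate, because explicit Euler steps become unstable when a fast-growing activation is applied to large residuals.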

Illustrative Examples
In this section, three examples are presented to illustrate the efficiency of the general nonlinear recurrent neural network (2) and its specific models under different types of activation functions (linear, power sum, and hyperbolic sine activation functions) for solving GLME (1) online.
From Figure 2, we can observe that the solution errors $\|X(t) - X^*\|_F$ decline to almost zero at around 0.02 s and that faster convergence to the solution is achieved when the power sum and hyperbolic sine activation functions are used in (2). These results demonstrate the effectiveness of the general recurrent neural network model (2) for solving GLME (24).
Example 2. Let us consider GLME (27) with multiple theoretical solutions $X^* \in \mathbb{R}^{2\times 2}$. We use linear model (10) with design parameter $\gamma = 1$ to solve GLME (27). The trajectories of the entries of state matrix $X(t) \in \mathbb{R}^{2\times 2}$ are shown in Figure 3. From Figure 3, we can see that, starting from two different initial matrices $X(0) \in \mathbb{R}^{2\times 2}$, the state matrices $X(t) \in \mathbb{R}^{2\times 2}$ of linear model (10) converge, respectively, along two different trajectories to two different theoretical solutions $X^* \in \mathbb{R}^{2\times 2}$. This indicates that, if multiple theoretical solutions exist for GLME (27), the choice of the initial value greatly affects the steady-state result of the recurrent neural network (2) and determines the starting point of convergence toward a solution of GLME (27). Correspondingly, the residual errors $\|A_1 X(t) B_1 + A_2 X(t) B_2 - C\|_F$ synthesized by (10) can always diminish to zero within finite time from twenty different initial values, as illustrated by Figure 4.
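The dependence on the initial state can be reproduced with a small singular problem. The sketch below uses invented $2 \times 2$ coefficient matrices (not the data of Example 2), assumes the linear model has the form $\dot{X} = -\gamma \sum_i A_i^T (\sum_j A_j X B_j - C) B_i^T$, and integrates it with forward Euler; all names are hypothetical.

```python
import numpy as np

# Invented data: singular A_i make the solution of sum_i A_i X B_i = C non-unique.
A_list = [np.ones((2, 2)), np.ones((2, 2))]
B_list = [np.eye(2), np.diag([1.0, 2.0])]
X_ref = np.array([[1.0, 2.0], [3.0, 4.0]])         # one particular solution
C = sum(A @ X_ref @ B for A, B in zip(A_list, B_list))

def residual(X):
    """Residual matrix sum_i A_i X B_i - C."""
    return sum(A @ X @ B for A, B in zip(A_list, B_list)) - C

def run_linear_model(X0, gamma=1.0, dt=1e-3, steps=3000):
    """Forward-Euler integration of the assumed linear model from X0."""
    X = np.array(X0, dtype=float)
    for _ in range(steps):
        E = residual(X)
        X = X - dt * gamma * sum(A.T @ E @ B.T for A, B in zip(A_list, B_list))
    return X

X_a = run_linear_model(np.zeros((2, 2)))
X_b = run_linear_model(np.array([[5.0, 0.0], [0.0, -5.0]]))
```

Both runs drive the residual to numerical zero yet end at different matrices. The linear flow only moves the state within the row space of the vectorized operator, so the null-space component of $X(0)$ is carried unchanged into the steady state.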
In GLME (29), coefficient matrices $A_i \in \mathbb{R}^{10\times 10}$, $B_i \in \mathbb{R}^{10\times 10}$, and $C \in \mathbb{R}^{10\times 10}$ are all positive-definite and randomly generated, with entries falling within the interval $[-2, 2]$. We exploit the nonlinear neural network models (2) activated by the power sum and hyperbolic sine functions, together with the linear model (10), to solve GLME (29) with design parameter $\gamma = 1$. From Table 1, we can observe that the general recurrent neural network models (2) activated by the power sum and hyperbolic sine activation functions exhibit faster error-diminishing speed than the linear model (10), with all of their residual errors reaching the level of

Conclusion
In this brief, we present a general recurrent neural network model for solving GLME. The general nonlinear model of the recurrent neural network possesses the global convergence property in finding solutions of GLME. By using the specifically proposed nonlinear activation functions, superior convergence can be achieved, as compared with the linear model, which has an exponential convergence rate. Illustrative results demonstrate the effectiveness and superiority of the nonlinear recurrent neural network models for the solution of GLME.
of coefficient matrices $A_1$, $A_2$, $B_1$, and $B_2$ are all positive values. We employ the general recurrent neural network model (2) with $\gamma = 1$ activated by the linear function, the power sum function with $N = 4$, and the hyperbolic sine function with its parameter equal to 3.

Table 1: Performance of the general recurrent neural network model (2) with three different activation functions (linear, power sum, and hyperbolic sine) for solving GLME (29).