Using Feed Forward Neural Network to Solve Eigenvalue Problems

The aim of this paper is to present a parallel processor technique for solving eigenvalue problems for ordinary differential equations using artificial neural networks. The proposed network is trained by back propagation with different training algorithms: quasi-Newton, Levenberg-Marquardt, and Bayesian regularization. A further objective of this paper is to compare the performance of the aforementioned algorithms with regard to predictive ability.


Introduction
Nowadays many processes are automated, including a lot of mathematical procedures. There is a strong need for software that solves differential equations (DEs), as many problems in science and engineering are reduced to differential equations through the process of mathematical modeling. Although model equations based on physical laws can be constructed, analytical tools are frequently inadequate for obtaining their closed-form solutions, and numerical methods must usually be resorted to.
The application of neural networks for solving differential equations can be regarded as a mesh-free numerical method. It has been proved that feed forward neural networks with one hidden layer are capable of universal approximation for problems of interpolation and approximation of scattered data.

Related Work
Neural networks have found application in many disciplines: neurosciences, mathematics, statistics, physics, computer science, and engineering. In the context of the numerical solution of differential equations, high-order derivatives are generally undesirable because they can introduce large approximation errors. The use of higher order conventional Lagrange polynomials is not guaranteed to yield a better quality (smoothness) of approximation. Many methods have been developed so far for solving differential equations; some of them produce a solution in the form of an array that contains the value of the solution at a selected group of points [1]. Others use basis functions to represent the solution in analytic form and usually transform the original problem into a system of algebraic equations [2]. Most of the previous work on solving differential equations using artificial neural networks (Ann) is restricted to solving the linear systems of algebraic equations which result from the discretisation of the domain [3]. The minimization of the network's energy function provides the solution to the system of equations [4].
Lagaris et al. [5] [6] compared weight reuse for two existing methods of defining the network error function; weight reuse is shown to accelerate training for ODEs, and the second method outperforms the first but fails unpredictably when weight reuse is applied to accelerate the solution of the diffusion equation. Tawfiq [7] proposed a radial basis function neural network (RBFNN) and a Hopfield neural network (unsupervised training network) as designer networks to solve ODEs and PDEs and compared them. Malek and Shekari Beidokhti [8] reported a novel hybrid method based on optimization techniques and neural network methods for the solution of high order ODEs, which used a three-layered perceptron network. Akca et al. [9] discussed different approaches to using wavelets in the solution of boundary value problems (BVP) for ODEs, introduced convenient wavelet representations for the derivatives of certain functions, and discussed a wavelet network algorithm. McFall [10] presented multilayer perceptron networks to solve BVPs of PDEs for arbitrary irregular domains, using the logsig transfer function in the hidden layer, purelin in the output layer, and the gradient descent training algorithm; he also used an RBFNN for solving this problem and compared the two. Junaid et al. [11] used an Ann with a genetic training algorithm and the log sigmoid function for solving first order ODEs. Zahoor et al. [12] used an evolutionary technique for the solution of nonlinear Riccati differential equations of fractional order, where the learning of the unknown parameters in the neural network was achieved with hybrid intelligent algorithms mainly based on the genetic algorithm (GA).
Abdul Samath et al. [13] suggested the solution of the matrix Riccati differential equation (MRDE) for nonlinear singular systems using Ann. Ibraheem and Khalaf [14] proposed a shooting neural network algorithm for solving two-point second order BVPs in ODEs, which reduces the equation to a system of two first order equations. Hoda and Nagla [4] described a numerical solution with neural networks for solving PDEs with mixed boundary conditions. Majidzadeh [15] suggested a new approach for reducing the inverse problem for a domain to an equivalent problem in a variational setting using a radial basis function neural network; he also used a cascade feed forward network to solve the two-dimensional Poisson equation, trained with back propagation and the Levenberg-Marquardt algorithm, with a three-layer architecture of 12 input nodes, 18 hidden nodes with the tansig transfer function, and 3 linear output nodes. Oraibi [16] designed feed forward neural networks (FFNN) for solving IVPs of ODEs. Ali [17] designed a fast FFNN to solve two-point BVPs. This paper proposed an FFNN to solve the two-point singular boundary value problem (TPSBVP) with the back propagation (BP) training algorithm. Tawfiq and Hussein [18] suggested a multilayer FFNN to solve singular boundary value problems.

What Is Artificial Neural Network?
Ann is a simplified mathematical model of the human brain; it can be implemented by both electric elements and computer software. It is a parallel distributed processor with a large number of connections; it is an information processing system that has certain performance characteristics in common with biological neural networks [19]. The arriving signals, called inputs, are multiplied by the (adjustable) connection weights, summed (combined), and then passed through a transfer function to produce the output of that neuron. The activation (transfer) function acts on the weighted sum of the neuron's inputs; the most commonly used transfer function is the sigmoid function (tansig) [17].
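The single-neuron computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the paper: the function name `neuron_output` and the optional bias term are assumptions, and `math.tanh` stands in for tansig, to which it is mathematically equivalent.

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs followed by a tanh (tansig-like) transfer function."""
    # Combine the arriving signals: multiply each input by its connection weight.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Pass the combined signal through the transfer function.
    return math.tanh(weighted_sum)

# Hypothetical inputs and weights, chosen only for illustration.
print(neuron_output([0.5, -1.0], [0.8, 0.3], bias=0.1))
```

Because the transfer function saturates, the output always lies strictly between -1 and 1, whatever the size of the weighted sum.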
There are two main connection types: feedback (recurrent) and feed forward connections. Feedback is a type of connection where the output of one layer routes back to the input of a previous layer or to the same layer. A feed forward network (FFNN) does not have a connection back from the output to the input neurons [20].
There are many different training algorithms, but the most often used is the Delta rule, or back propagation (BP) rule. A neural network is trained to map a set of input data by iterative adjustment of the weights. Information from the inputs is fed forward through the network to optimize the weights between neurons. Optimization of the weights is made by backward propagation of the error during the training phase.
The Ann reads the input and output values in the training data set and changes the values of the weighted links to reduce the difference between the predicted and target (observed) values. The error in prediction is minimized across many training cycles (iterations or epochs) until the network reaches a specified level of accuracy. A complete round of forward-backward passes and weight adjustments using all input-output pairs in the data set is called an epoch or iteration. If a network is left to train for too long, however, it will be overtrained and will lose the ability to generalize.
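To make the notion of an epoch concrete, the following sketch trains a single linear neuron by batch gradient descent. It is only an illustration under stated assumptions: the function `train_linear_neuron`, the learning rate, and the three training patterns are hypothetical, and plain gradient descent is used in place of the training algorithms compared in this paper.

```python
def train_linear_neuron(samples, learning_rate=0.1, epochs=50):
    """Batch gradient descent on y = w*x + b with a mean-squared-error loss."""
    w, b = 0.0, 0.0
    history = []                      # error recorded at the start of each epoch
    n = len(samples)
    for _ in range(epochs):
        # Forward pass: prediction error for every input-output pair.
        errors = [(w * x + b) - t for x, t in samples]
        history.append(sum(e * e for e in errors) / n)
        # Backward pass: gradient of the error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, (x, _) in zip(errors, samples)) / n
        grad_b = 2.0 * sum(errors) / n
        # Weight adjustment; one full pass over the data set is one epoch.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history

data = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]   # targets follow t = 2x + 1
w, b, mse_per_epoch = train_linear_neuron(data, epochs=2000)
print(w, b, mse_per_epoch[0], mse_per_epoch[-1])
```

Recording the error once per epoch, as `history` does here, is also what makes it possible to stop training before the network is overtrained.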
In this paper, we focus on the training situation known as supervised training, in which a set of input/output data patterns is available. Thus, the Ann has to be trained to produce the desired output according to the examples.
In order to perform supervised training we need a way of evaluating the Ann output error between the actual and the expected output. A popular measure is the mean squared error (MSE) or root mean squared error (RMSE) [21].
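The two error measures can be written down directly. The helper names `mse` and `rmse` below are illustrative and not taken from the paper.

```python
import math

def mse(predicted, target):
    """Mean squared error between network outputs and expected outputs."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def rmse(predicted, target):
    """Root mean squared error, expressed in the same units as the outputs."""
    return math.sqrt(mse(predicted, target))

# Hypothetical network outputs and targets.
outputs = [0.0, 0.0]
targets = [3.0, 4.0]
print(mse(outputs, targets), rmse(outputs, targets))
```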

Proposed Design
System design is the process of breaking a complex topic or substance into smaller parts to gain a better understanding of it. We design the EVP Solver using block diagrams.
The following are the actors of this application.
(1) End user: one who interacts with the system.
(2) System: receives commands and actions from the end user and performs the required operations.
FFNNs allow a conversion of a function from a low-dimensional space to a high-dimensional space (e.g., 1D to 3D) in which the function is expressed as a linear combination of ridge basis functions.
Provide the EVP of the differential equation along with its boundary conditions as input through the GUI. Based on the EVP, generate the data points. Determine the centers with respect to the generated data points. The data points and eigenvalue should be within the solution space. If the data points and eigenvalue are outside the solution space, change the boundary conditions and compute the data points and eigenvalue again.

Description of the Method
In the proposed approach the model function is expressed as the sum of two terms: the first term satisfies the boundary conditions (BC) and contains no adjustable parameters. The second term is found by using an FFNN which is trained so as to satisfy the differential equation; this technique is called a collocation neural network.
In this section we illustrate how our approach can be used to find the approximate solution of the general form of a 2nd order EVP:

$$F\bigl(x, y(x), y'(x), y''(x), \lambda\bigr) = 0, \quad x \in D, \tag{1}$$

subject to certain BC's, where $x \in \mathbb{R}$, $D \subset \mathbb{R}$ denotes the domain, $\lambda$ is the eigenvalue, and $y(x)$ is the solution to be computed.
If $y_t(x, p)$ denotes a trial solution with adjustable parameters $p$, the problem is transformed to a discretized form over the points $x_i$ of a discretization $\hat{D}$ of the domain:

$$\min_{p} \sum_{x_i \in \hat{D}} F\bigl(x_i, y_t(x_i, p), y_t'(x_i, p), y_t''(x_i, p), \lambda\bigr)^2, \tag{2}$$

subject to the constraints imposed by the BC's.
In our proposed approach, the trial solution $y_t$ employs an FFNN and the parameters $p$ correspond to the weights and biases of the neural architecture. We choose a form for the trial function $y_t(x)$ such that it satisfies the BC's. This is achieved by writing it as a sum of two terms:

$$y_t(x) = A(x) + G\bigl(x, N(x, p)\bigr), \tag{3}$$

where $N(x, p)$ is a single-output FFNN with parameters $p$ and $n$ input units fed with the input vector $x$. The term $A(x)$ contains no adjustable parameters and satisfies the BC's. The second term $G$ is constructed so as not to contribute to the BC's, since $y_t(x)$ must satisfy them. This term can be formed by using an FFNN whose weights and biases are to be adjusted in order to deal with the minimization problem.
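As a small illustration of the two-term construction, the sketch below assumes Dirichlet conditions y(a) = A, y(b) = B and one possible choice of the two terms: a straight line through the boundary values, plus the factor (x - a)(x - b) multiplying an arbitrary network output. The names `make_trial_solution` and `stand_in_network` are hypothetical, and the stand-in function merely plays the role of the FFNN.

```python
import math

def make_trial_solution(a, b, A, B, network):
    """Return y_t(x) = [line through (a, A) and (b, B)] + (x - a)(x - b) * network(x)."""
    def y_t(x):
        # First term: satisfies the BC and has no adjustable parameters.
        boundary_term = A + (B - A) * (x - a) / (b - a)
        # Second term: vanishes at x = a and x = b, so it cannot disturb the BC.
        return boundary_term + (x - a) * (x - b) * network(x)
    return y_t

# Any function can stand in for the FFNN: the BC hold regardless of its output.
stand_in_network = lambda x: 3.7 * math.tanh(2.0 * x - 0.5)
y_t = make_trial_solution(0.0, 1.0, 1.0, 0.25, stand_in_network)
print(y_t(0.0), y_t(1.0))
```

Since the boundary conditions are built into the form of the trial solution, training only has to reduce the residual of the differential equation, and the minimization becomes unconstrained.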

Computation of the Gradient
An efficient minimization of (2) can be considered as a procedure of training the FFNN, where the error corresponding to each input $x_i$ is the value of the differential equation residual at $x_i$, which has to be forced near zero. Computation of this error value involves not only the FFNN output but also the derivatives of the output with respect to any of its inputs. Therefore, in computing the gradient of the error with respect to the network weights, consider a multilayer FFNN with $n$ input units (where $n$ is the dimension of the domain), one hidden layer with $H$ sigmoid units, and a linear output unit.
For a given input $x = (x_1, \ldots, x_n)$ the output of the FFNN is

$$N = \sum_{j=1}^{H} \nu_j \, \sigma(z_j), \qquad z_j = \sum_{i=1}^{n} w_{ji} x_i + b_j, \tag{4}$$

where $w_{ji}$ denotes the weight connecting the input unit $i$ to the hidden unit $j$, $\nu_j$ denotes the weight connecting the hidden unit $j$ to the output unit, $b_j$ denotes the bias of hidden unit $j$, and $\sigma(z)$ is the sigmoid transfer function (tansig).
The gradient of the FFNN output with respect to the parameters of the FFNN can be easily obtained as

$$\frac{\partial N}{\partial \nu_j} = \sigma(z_j), \qquad \frac{\partial N}{\partial b_j} = \nu_j \, \sigma'(z_j), \qquad \frac{\partial N}{\partial w_{ji}} = \nu_j \, \sigma'(z_j) \, x_i. \tag{5}$$

Once the derivative of the error with respect to the network parameters has been defined, it is straightforward to employ any minimization technique. It must also be noted that the batch mode of weight updates may be employed.
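The output of a one-hidden-layer network and its parameter gradient can be checked numerically. The sketch below assumes a single input unit, tanh hidden units (as a stand-in for tansig), and a linear output; the function names and the parameter values are hypothetical. A central finite difference on one bias serves as a sanity check of the analytic gradient.

```python
import math

def ffnn_output(x, w, b, v):
    """N(x) = sum_j v[j] * tanh(w[j] * x + b[j]) for a single input x."""
    return sum(vj * math.tanh(wj * x + bj) for wj, bj, vj in zip(w, b, v))

def ffnn_gradient(x, w, b, v):
    """Partial derivatives of N with respect to w[j], b[j] and v[j]."""
    grad_w, grad_b, grad_v = [], [], []
    for wj, bj, vj in zip(w, b, v):
        s = math.tanh(wj * x + bj)
        ds = 1.0 - s * s               # derivative of tanh at the hidden activation
        grad_v.append(s)               # dN/dv_j = sigma(z_j)
        grad_b.append(vj * ds)         # dN/db_j = v_j * sigma'(z_j)
        grad_w.append(vj * ds * x)     # dN/dw_j = v_j * sigma'(z_j) * x
    return grad_w, grad_b, grad_v

# Hypothetical parameters for a network with three hidden units.
w, b, v = [0.5, -1.2, 2.0], [0.1, 0.3, -0.7], [1.5, -0.4, 0.9]
x, h = 0.8, 1e-6
grad_w, grad_b, grad_v = ffnn_gradient(x, w, b, v)

# Central finite difference on the first hidden bias as a sanity check.
fd = (ffnn_output(x, w, [b[0] + h] + b[1:], v)
      - ffnn_output(x, w, [b[0] - h] + b[1:], v)) / (2 * h)
print(abs(fd - grad_b[0]))
```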

Illustration of the Method
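In this section we describe the solution of an EVP using an FFNN. To illustrate the method, we consider the 2nd order EVP:

$$\frac{d^2 y}{dx^2} = f\left(x, y, \frac{dy}{dx}, \lambda\right), \tag{7}$$

where $x \in [a, b]$ and the BC: $y(a) = A$, $y(b) = B$ (Dirichlet case) or $y'(a) = A$, $y'(b) = B$ (Neumann case) or $y'(a) = A$, $y(b) = B$ (Mixed case). For the Dirichlet case, a trial solution can be written as

$$y_t(x) = \frac{bA - aB}{b - a} + \frac{B - A}{b - a}\,x + (x - a)(x - b)\,N(x, p), \tag{8}$$

where $N(x, p)$ is the output of an FFNN with one input unit for $x$ and weights $p$.
Note. $y_t(x)$ satisfies the BC by construction. The error quantity to be minimized is given by

$$E[p] = \sum_{i} \left\{ \frac{d^2 y_t(x_i)}{dx^2} - f\left(x_i, y_t(x_i), \frac{d y_t(x_i)}{dx}, \lambda\right) \right\}^2, \tag{9}$$

where the $x_i$ are points in $[a, b]$. It is straightforward to compute the gradient of the error with respect to the parameters $p$ using (5). The same holds for all subsequent model problems.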
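The error quantity can be evaluated in closed form once the derivatives of the network with respect to its input are available. The sketch below is an illustration under several assumptions: the domain is [0, 1] with Dirichlet values A and B, the trial form is A + (B - A)x + x(x - 1)N(x), the hidden units use tanh, and the right-hand side `f` is a hypothetical model problem y'' = -lam * y chosen only to exercise the code. None of the function names come from the paper.

```python
import math

def network_and_derivatives(x, w, b, v):
    """Return N, dN/dx and d2N/dx2 for N(x) = sum_j v[j] * tanh(w[j] * x + b[j])."""
    n0 = n1 = n2 = 0.0
    for wj, bj, vj in zip(w, b, v):
        s = math.tanh(wj * x + bj)
        ds = 1.0 - s * s
        n0 += vj * s
        n1 += vj * wj * ds
        n2 += vj * wj * wj * (-2.0 * s * ds)
    return n0, n1, n2

def trial_and_derivatives(x, A, B, w, b, v):
    """y_t = A + (B - A) x + x (x - 1) N(x) on [0, 1], with y_t' and y_t''."""
    n0, n1, n2 = network_and_derivatives(x, w, b, v)
    y = A + (B - A) * x + x * (x - 1.0) * n0
    dy = (B - A) + (2.0 * x - 1.0) * n0 + x * (x - 1.0) * n1
    d2y = 2.0 * n0 + 2.0 * (2.0 * x - 1.0) * n1 + x * (x - 1.0) * n2
    return y, dy, d2y

def collocation_error(points, f, lam, A, B, w, b, v):
    """Sum over the grid of (y_t'' - f(x, y_t, y_t', lam)) ** 2."""
    total = 0.0
    for x in points:
        y, dy, d2y = trial_and_derivatives(x, A, B, w, b, v)
        total += (d2y - f(x, y, dy, lam)) ** 2
    return total

# Hypothetical model problem y'' = -lam * y, used only to exercise the code.
f = lambda x, y, dy, lam: -lam * y
grid = [i / 9.0 for i in range(10)]          # ten equidistant points in [0, 1]
w, b, v = [1.0, -0.5], [0.2, 0.1], [0.3, -0.6]
error = collocation_error(grid, f, 2.0, 1.0, 0.5, w, b, v)
print(error)
```

A training algorithm would then adjust `w`, `b` and `v` (and, depending on the formulation, the eigenvalue) to drive this quantity toward zero.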

Examples
In this section we report numerical results, using a multilayer FFNN having one hidden layer with 5 hidden units (neurons) and one linear output unit. The sigmoid activation of each hidden unit is tansig; the analytic solution $y_a(x)$ was known in advance. Therefore we test the accuracy of the obtained solutions by computing the deviation:

$$\Delta y(x) = \left| y_t(x) - y_a(x) \right|.$$

In order to illustrate the characteristics of the solutions provided by the neural network method, we provide figures displaying the corresponding deviation $\Delta y(x)$ both at the few points (training points) that were used for training and at many other points (test points) of the domain of the equation. The latter kind of figure is of major importance since it shows the interpolation capabilities of the neural solution, which appear to be superior to those of solutions obtained by other methods. Moreover, we can consider points outside the training interval in order to obtain an estimate of the extrapolation performance of the obtained numerical solution.
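The accuracy test described above amounts to evaluating the deviation on three sets of points. The sketch below takes the deviation in absolute value and uses hypothetical stand-ins for both the analytic solution and the trained network output, so the numbers it prints say nothing about the results reported in this paper.

```python
import math

def deviation(y_trial, y_analytic, points):
    """Absolute deviation |y_t(x) - y_a(x)| at each point."""
    return [abs(y_trial(x) - y_analytic(x)) for x in points]

# Hypothetical stand-ins: an "analytic" solution and a slightly perturbed approximation.
y_a = lambda x: math.exp(-x)
y_t = lambda x: math.exp(-x) + 1e-3 * x * (x - 1.0)

training_points = [i / 9.0 for i in range(10)]   # points used for training
test_points = [i / 99.0 for i in range(100)]     # denser grid inside the domain
outside_points = [1.1, 1.25, 1.5]                # extrapolation beyond [0, 1]

print(max(deviation(y_t, y_a, training_points)))
print(max(deviation(y_t, y_a, test_points)))
print(max(deviation(y_t, y_a, outside_points)))
```

Comparing the maximum deviation on the test grid with that on the training grid gives a quick indication of interpolation quality, while the third figure estimates extrapolation performance.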
Example 1. Consider the following 2nd order EVP: with BC (Dirichlet case) $y(0) = 1$, $y(1) = e^{-\lambda}$. The analytic solution is $y_a(x) = e^{-\lambda x^2}$; according to (8) the trial neural form of the solution is taken to be

$$y_t(x) = 1 + \left(e^{-\lambda} - 1\right) x + x (x - 1)\, N(x, p).$$

The FFNN trained using a grid of ten equidistant points in [0, 1] gave $\lambda = 150$; Figure 1 displays the analytic and neural solutions with different training algorithms. The neural results with different types of training algorithm, namely Levenberg-Marquardt (trainlm), quasi-Newton (trainbfg), and Bayesian regularization (trainbr), are introduced in Table 1 and their errors are given in Table 2; Table 3 gives the performance of the training with epochs and time, and Table 4 gives the weights and biases of the designer network.

Conclusion
From the above problems it is clear that the proposed network can handle EVPs effectively and provide an accurate approximate solution throughout the whole domain, not only at the training points. As is evident from the tables, the results of the proposed network are more precise than those of the method suggested in [22].
In general, the practical results for FFNN show that a network which contains up to a few hundred weights converges fastest with the Levenberg-Marquardt training algorithm (trainlm), followed by the network with the trainbfg training algorithm and then the network with the trainbr training algorithm.

Table 1: Analytic and neural solution of Example 1.

Table 3: The performance of the training with epochs and time.

Table 4: Weights and biases of the network for different training algorithms.

Table 6; Table 7 gives the performance of the training with epochs and time, and Table 8 gives the weights and biases of the designer network.

Table 5: Analytic and neural solution of Example 2.

Table 7: The performance of the training with epochs and time.

Table 8: Weights and biases of the network for different training algorithms.

Table 9: Results of Example 2 given in [22].

However, "trainbr" does not perform well for function approximation problems. The performance of the various algorithms can be affected by the accuracy required of the approximation.