Design Feed Forward Neural Network to Solve Singular Boundary Value Problems

The aim of this paper is to design a feed forward neural network for solving second-order singular boundary value problems in ordinary differential equations. The neural networks use the principle of back propagation with different training algorithms such as quasi-Newton, Levenberg-Marquardt, and Bayesian regularization. Two examples are considered to show the effectiveness of using the network techniques for solving this type of equation. The convergence properties of the technique and the accuracy of the interpolation technique are considered.


Introduction
The study of solving differential equations using artificial neural networks (ANNs) was initiated by Agatonovic-Kustrin and Beresford in [1]. Lagaris et al. [2] employed two networks, a multilayer perceptron and a radial basis function network, to solve partial differential equations (PDEs) with boundary conditions defined on boundaries of complex geometry. Tawfiq [3] proposed a radial basis function neural network (RBFNN) and a Hopfield neural network (an unsupervised training network). Neural networks have been employed before to solve boundary and initial value problems. Malek and Shekari Beidokhti [4] reported a novel hybrid method based on optimization techniques and neural network methods for the solution of high-order ODEs, which used a three-layered perceptron network. Akca et al. [5] discussed different approaches to using wavelets in the solution of boundary value problems (BVPs) for ODEs, introduced convenient wavelet representations for the derivatives of certain functions, and discussed a wavelet network algorithm. McFall [6] presented multilayer perceptron networks to solve BVPs of PDEs on arbitrary irregular domains, using the logsig transfer function in the hidden layer, the purelin transfer function in the output layer, and a gradient descent training algorithm; he also used an RBFNN for the same problem and compared the two. Junaid et al. [7] used an ANN with a genetic training algorithm and the log-sigmoid function for solving first-order ODEs. Abdul Samath et al. [8] suggested the solution of the matrix Riccati differential equation (MRDE) for a nonlinear singular system using an ANN. Ibraheem and Khalaf [9] proposed a shooting neural network algorithm for solving two-point second-order BVPs in ODEs, which reduces the equation to a system of two first-order equations. Hoda and Nagla [10] described a numerical solution with neural networks for solving PDEs with mixed boundary conditions. Majidzadeh [11] suggested a new approach for reducing the inverse problem for a domain to an equivalent problem in a variational setting using a radial basis function neural network; he also used a cascade feed forward network to solve the two-dimensional Poisson equation with back propagation and the Levenberg-Marquardt training algorithm, with a three-layer architecture of 12 input nodes, 18 tansig transfer functions in the hidden layer, and 3 linear nodes in the output layer. Oraibi [12] designed feed forward neural networks (FFNNs) for solving IVPs of ODEs. Ali [13] designed a fast FFNN to solve two-point BVPs.
This paper proposes an FFNN to solve the two-point singular boundary value problem (TPSBVP) with the back propagation (BP) training algorithm.

Singular Boundary Value Problem
The general form of the 2nd-order two-point boundary value problem (TPBVP) is
\[ y'' + P(x)\,y' + Q(x)\,y = 0, \quad a \le x \le b, \qquad y(a) = A, \quad y(b) = B, \tag{1} \]
where $A, B \in \mathbb{R}$. There are two types of point $x_0 \in [0, 1]$: ordinary points and singular points. A function $y(x)$ is analytic at $x_0$ if it has a power series expansion at $x_0$ that converges to $y(x)$ on an open interval containing $x_0$. A point $x_0$ is an ordinary point of the ODE (1) if the functions $P(x)$ and $Q(x)$ are analytic at $x_0$. Otherwise, that is, if $P(x)$ or $Q(x)$ is not analytic at $x_0$, then $x_0$ is said to be a singular point of the ODE [14,15].
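As a brief illustration of this classification (the example is not taken from the paper), consider Bessel's equation of order zero, $x y'' + y' + x y = 0$, divided by $x$ so that the leading coefficient is one:

```latex
\[
  y'' + \frac{1}{x}\,y' + y = 0,
  \qquad P(x) = \frac{1}{x}, \quad Q(x) = 1 .
\]
% P(x) = 1/x has no power series expansion about x_0 = 0,
% so x_0 = 0 is a singular point; Q(x) = 1 is analytic everywhere.
% For any x_0 \neq 0 both P and Q are analytic, so x_0 is an ordinary point.
\[
  x\,P(x) = 1, \qquad x^{2}\,Q(x) = x^{2}
\]
% Both products are analytic at x_0 = 0, so this singular point is regular.
```

A singular point at which $x P(x)$ or $x^2 Q(x)$ fails to be analytic is called irregular.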
There is at present no theoretical work justifying numerical methods for solving problems with irregular singular points. The main practical approach to such problems seems to be the semianalytic technique [16].

Artificial Neural Network
An ANN is a simplified mathematical model of the human brain. It can be implemented by both electric elements and computer software. It is a parallel distributed processor with large numbers of connections; it is an information processing system that has certain performance characteristics in common with biological neural networks [17].
The arriving signals, called inputs, are multiplied by the (adjustable) connection weights, summed, and then passed through a transfer function to produce the output for that neuron. The activation (transfer) function acts on the weighted sum of the neuron's inputs, and the most commonly used transfer function is the sigmoid function (tansig) [13].
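The computation performed by a single neuron can be sketched in a few lines of Python. This is only an illustration: the function names, the bias term, and the numeric values are assumptions, not part of the paper. MATLAB's tansig is mathematically equivalent to the hyperbolic tangent, so `math.tanh` is used here.

```python
import math

def tansig(z: float) -> float:
    """Hyperbolic tangent sigmoid; equivalent to tanh(z)."""
    return math.tanh(z)

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through tansig."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return tansig(z)

# Illustrative values only (not taken from the paper).
out = neuron_output([0.5, -1.0], [0.8, 0.2], 0.1)
print(out)
```

The weighted sum here is 0.8*0.5 + 0.2*(-1.0) + 0.1 = 0.3, so the neuron outputs tanh(0.3).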
There are two main connection types: feedback (recurrent) and feed forward connections. Feedback is a type of connection where the output of one layer routes back to the input of a previous layer or to the same layer. A feed forward neural network (FFNN) does not have a connection back from the output to the input neurons [18].
There are many different training algorithms, but the most often used is the Delta rule or back propagation (BP) rule. A neural network is trained to map a set of input data by iterative adjustment of the weights. Information from the inputs is fed forward through the network to optimize the weights between neurons. Optimization of the weights is made by backward propagation of the error during the training phase.
The ANN reads the input and output values in the training data set and changes the values of the weighted links to reduce the difference between the predicted and target (observed) values. The error in prediction is minimized across many training cycles (iterations or epochs) until the network reaches a specified level of accuracy. A complete round of forward-backward passes and weight adjustments using all input-output pairs in the data set is called an epoch or iteration.
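The epoch loop described above can be sketched for the simplest possible case, a single linear neuron trained by batch gradient descent on the mean squared error. This is a minimal illustration under stated assumptions: the data, learning rate, and function name are hypothetical, and plain gradient descent stands in for the more elaborate algorithms (trainlm, trainbfg, trainbr) used in the paper.

```python
def train_linear_neuron(xs, ts, lr=0.5, epochs=500):
    """Fit y = w*x + b to targets ts by batch gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    history = []
    for _ in range(epochs):
        # Forward pass over all input-output pairs (one epoch).
        errs = [(w * x + b) - t for x, t in zip(xs, ts)]
        history.append(sum(e * e for e in errs) / n)
        # Backward pass: gradient of the MSE with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2.0 * sum(errs) / n
        # Weight adjustment.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, history

# Hypothetical training set generated from t = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ts = [2.0 * x + 1.0 for x in xs]
w, b, history = train_linear_neuron(xs, ts)
print(w, b, history[0], history[-1])
```

The recorded `history` shows the error falling from epoch to epoch as the weights approach the values that generated the data.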
If a network is left to train for too long, however, it will be overtrained and will lose the ability to generalize.
In this paper, we focus on the training situation known as supervised training, in which a set of input/output data patterns is available. Thus, the ANN has to be trained to produce the desired output according to the examples.
In order to perform supervised training we need a way of evaluating the ANN output error between the actual and the expected outputs. A popular measure is the mean squared error (MSE) or root mean squared error (RMSE) [19].
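Both error measures are short to state in code. The sketch below uses made-up predicted and target values purely for illustration.

```python
import math

def mse(predicted, target):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def rmse(predicted, target):
    """Root mean squared error; same units as the data."""
    return math.sqrt(mse(predicted, target))

# Illustrative values only: a single error of size 2 over three samples.
predicted = [1.0, 2.0, 4.0]
target = [1.0, 2.0, 2.0]
print(mse(predicted, target), rmse(predicted, target))
```

Here the MSE is 4/3 and the RMSE is its square root.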

Description of the Method
In the proposed approach the model function is expressed as the sum of two terms: the first term satisfies the boundary conditions (BCs) and contains no adjustable parameters. The second term is found using an FFNN that is trained so as to satisfy the differential equation; this technique is called a collocation neural network.
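In this section, we illustrate how our approach can be used to find the approximate solution of the general form of a 2nd-order TPSBVP:
\[ x^m y''(x) = F(x, y(x), y'(x)), \tag{2} \]
subject to certain BCs, where $m \in \mathbb{Z}$, $x \in \mathbb{R}$, $D \subset \mathbb{R}$ denotes the domain, and $y(x)$ is the solution to be computed.
If $y_t(x, p)$ denotes a trial solution with adjustable parameters $p$, the problem is transformed to the discretized form
\[ \min_{p} \sum_{x_i \in \hat{D}} E(x_i)^2, \qquad E(x_i) = x_i^m\, y_t''(x_i, p) - F(x_i, y_t(x_i, p), y_t'(x_i, p)), \tag{3} \]
subject to the constraints imposed by the BCs, where $\hat{D}$ is a finite set of points in $D$.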
In our proposed approach, the trial solution $y_t$ employs an FFNN and the parameters $p$ correspond to the weights and biases of the neural architecture. We choose a form for the trial function $y_t(x)$ such that it satisfies the BCs. This is achieved by writing it as a sum of two terms:
\[ y_t(x, p) = A(x) + G(x, N(x, p)), \tag{4} \]
where $N(x, p)$ is a single-output FFNN with parameters $p$ and $n$ input units fed with the input vector $x$. The term $A(x)$ contains no adjustable parameters and satisfies the BCs. The second term $G$ is constructed so as not to contribute to the BCs, since $y_t(x)$ must also satisfy them. This term is formed by using an FFNN whose weights and biases are adjusted in order to deal with the minimization problem.
An efficient minimization of (3) can be considered as a procedure of training the FFNN, where the error corresponding to each input $x_i$ is the value $E(x_i)$, which has to be forced near zero. Computation of this error value involves not only the FFNN output but also the derivatives of the output with respect to any of its inputs.
Therefore, in computing the gradient of the error with respect to the network weights, consider a multilayer FFNN with $n$ input units (where $n$ is the dimension of the domain), one hidden layer with $H$ sigmoid nodes, and a linear output unit. For a given input $x$, the output of the FFNN is
\[ N = \sum_{j=1}^{H} v_j\, \sigma(z_j), \qquad z_j = \sum_{i=1}^{n} w_{ji}\, x_i + b_j, \tag{5} \]
where $w_{ji}$ denotes the weight connecting input unit $i$ to hidden unit $j$, $v_j$ denotes the weight connecting hidden unit $j$ to the output unit, $b_j$ denotes the bias of hidden unit $j$, and $\sigma(z)$ is the sigmoid transfer function (tansig). The gradient of the FFNN output with respect to the parameters of the FFNN can be easily obtained as
\[ \frac{\partial N}{\partial v_j} = \sigma(z_j), \qquad \frac{\partial N}{\partial b_j} = v_j\, \sigma'(z_j), \qquad \frac{\partial N}{\partial w_{ji}} = v_j\, \sigma'(z_j)\, x_i. \tag{6} \]
Once the derivative of the error with respect to the network parameters has been defined, it is straightforward to employ any minimization technique. It must also be noted that the batch mode of weight updates may be employed.
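The network output and its partial derivatives with respect to the output weights, hidden biases, and input weights can be sketched as follows for the one-input case. This is a minimal sketch, assuming tanh as the tansig transfer function; the function names and all numeric parameter values are illustrative and not taken from the paper.

```python
import math

def ffnn(x, w, b, v):
    """N(x) = sum_j v[j] * tanh(w[j]*x + b[j]) for a single input x."""
    return sum(vj * math.tanh(wj * x + bj) for wj, bj, vj in zip(w, b, v))

def ffnn_gradients(x, w, b, v):
    """Analytic partial derivatives of N with respect to v, b and w."""
    s = [math.tanh(wj * x + bj) for wj, bj in zip(w, b)]
    ds = [1.0 - sj * sj for sj in s]          # tanh'(z) = 1 - tanh(z)^2
    dN_dv = s
    dN_db = [vj * dsj for vj, dsj in zip(v, ds)]
    dN_dw = [vj * dsj * x for vj, dsj in zip(v, ds)]
    return dN_dv, dN_db, dN_dw

# Hypothetical parameters for a network with three hidden units.
x = 0.6
w = [0.5, -1.2, 0.3]
b = [0.1, 0.0, -0.4]
v = [1.0, -0.7, 0.25]
dN_dv, dN_db, dN_dw = ffnn_gradients(x, w, b, v)
print(ffnn(x, w, b, v), dN_dv, dN_db, dN_dw)
```

A quick way to gain confidence in such formulas is to compare each analytic derivative with a central finite difference of `ffnn` in the corresponding parameter.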

Illustration of the Method
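In this section we describe the solution of a TPSBVP using an FFNN.
To illustrate the method, we consider the 2nd-order TPSBVP
\[ x^m \frac{d^2 y}{dx^2} = f\left(x, y, \frac{dy}{dx}\right), \tag{7} \]
where $x \in [a, b]$, with the BCs $y(a) = A$, $y(b) = B$. A trial solution can be written as
\[ y_t(x, p) = \frac{bA - aB}{b - a} + \frac{(B - A)\,x}{b - a} + (x - a)(x - b)\,N(x, p), \tag{8} \]
where $N(x, p)$ is the output of an FFNN with one input unit for $x$ and weights $p$. Note that $y_t(x)$ satisfies the BCs by construction. The error quantity to be minimized is given by
\[ E[p] = \sum_i \left\{ x_i^m\, \frac{d^2 y_t(x_i, p)}{dx^2} - f\left(x_i, y_t(x_i, p), \frac{d y_t(x_i, p)}{dx}\right) \right\}^2, \tag{9} \]
where the $x_i$ are points in $[a, b]$. It is straightforward to compute the gradient of the error with respect to the parameters $p$ using (6). The same holds for all subsequent model problems.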
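The construction can be sketched in Python: a trial solution built from a linear term matching the two boundary values plus a term that vanishes at both ends, and the sum of squared residuals over the collocation points. This is a sketch under stated assumptions, not the authors' implementation. The test problem $x y'' + y' = 4x$, $y(0) = 0$, $y(1) = 1$ (exact solution $y = x^2$) is hypothetical, as are all function and parameter names. The residual is written in the form multiplied through by $x$, so it can be evaluated at the singular point $x = 0$ without division.

```python
import math

def net(x, w, b, v):
    """Return N(x), N'(x), N''(x) for a one-input tanh network."""
    n = dn = d2n = 0.0
    for wj, bj, vj in zip(w, b, v):
        s = math.tanh(wj * x + bj)
        ds = 1.0 - s * s               # tanh'
        d2s = -2.0 * s * ds            # tanh''
        n += vj * s
        dn += vj * wj * ds
        d2n += vj * wj * wj * d2s
    return n, dn, d2n

def trial(x, params, a, b_end, A, B):
    """Trial solution y_t and its first two derivatives at x."""
    n, dn, d2n = net(x, *params)
    g = (x - a) * (x - b_end)          # vanishes at both boundary points
    dg = 2.0 * x - a - b_end
    y = (b_end * A - a * B) / (b_end - a) + (B - A) * x / (b_end - a) + g * n
    dy = (B - A) / (b_end - a) + dg * n + g * dn
    d2y = 2.0 * n + 2.0 * dg * dn + g * d2n
    return y, dy, d2y

def collocation_error(params, points, residual, a, b_end, A, B):
    """Sum of squared ODE residuals over the collocation points."""
    return sum(residual(x, *trial(x, params, a, b_end, A, B)) ** 2
               for x in points)

# Hypothetical test problem (not from the paper): x*y'' + y' = 4x,
# y(0) = 0, y(1) = 1, with exact solution y = x**2.
def residual(x, y, dy, d2y):
    return x * d2y + dy - 4.0 * x

points = [i / 9 for i in range(10)]    # ten equidistant points in [0, 1]
# One hidden unit with zero input weight: the network output is constantly 1,
# which makes the trial solution equal to x**2 for this test problem.
params = ([0.0], [1.0], [1.0 / math.tanh(1.0)])
error = collocation_error(params, points, residual, 0.0, 1.0, 0.0, 1.0)
print(error)
```

With these hand-picked parameters the trial solution coincides with the exact solution, so the error is zero up to rounding. For any other parameters the boundary values are still met exactly, and a training algorithm would adjust the parameters to drive the error down.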

Example
In this section we report numerical results using a multilayer FFNN having one hidden layer with 5 hidden units (neurons) and one linear output unit. The sigmoid activation of each hidden unit is tansig. The analytic solution $y_a(x)$ was known in advance; therefore, we test the accuracy of the obtained solutions by computing the deviation
\[ \Delta y(x) = \left| y_t(x) - y_a(x) \right|. \]
In order to illustrate the characteristics of the solutions provided by the neural network method, we provide figures displaying the corresponding deviation $\Delta y(x)$ both at the few points (training points) that were used for training and at many other points (test points) of the domain of the equation. The latter kind of figures are of major importance since they show the interpolation capabilities of the neural solution, which seem to be superior to those of solutions obtained by other methods. Moreover, we can consider points outside the training interval in order to obtain an estimate of the extrapolation performance of the obtained numerical solution.
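The evaluation procedure (deviation at training points, at denser test points, and at points outside the training interval) can be sketched as follows. No trained network is available here, so a truncated Taylor series of cos(x) is used as a stand-in for the neural solution. That stand-in, the grids, and all names are assumptions made for illustration only.

```python
import math

def y_analytic(x):
    """Known analytic solution used as the reference."""
    return math.cos(x)

def y_trial(x):
    """Stand-in for a trained network: truncated Taylor series of cos(x)."""
    return 1.0 - x**2 / 2.0 + x**4 / 24.0

def deviation(x):
    """Absolute deviation between the approximate and analytic solutions."""
    return abs(y_trial(x) - y_analytic(x))

train_points = [i / 9 for i in range(10)]             # ten equidistant points
test_points = [i / 100 for i in range(101)]           # denser grid in [0, 1]
outside_points = [1.0 + i / 10 for i in range(1, 6)]  # beyond the interval

max_train = max(deviation(x) for x in train_points)
max_test = max(deviation(x) for x in test_points)
max_outside = max(deviation(x) for x in outside_points)
print(max_train, max_test, max_outside)
```

For this stand-in the deviation stays small inside [0, 1] and grows outside it, which is the kind of interpolation versus extrapolation comparison the deviation plots are meant to reveal.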
Example 1. Consider the following 2nd-order TPSBVP [equation missing] with BCs $y'(0) = 0$, $y(1) = \cos(1)$. The analytic solution is $y_a(x) = \cos(x)$; according to (8) the trial neural form of the solution is taken to be [equation missing]. The FFNN was trained using a grid of ten equidistant points in [0, 1]. Figure 1 displays the analytic and neural solutions with different training algorithms. The neural results with different training algorithms, namely Levenberg-Marquardt (trainlm), quasi-Newton (trainbfg), and Bayesian regularization (trainbr), are given in Table 1 and their errors in Table 2. Table 3 gives the training performance in terms of epochs and time, and Table 4 gives the weights and biases of the designed network. Ramos [20] solved this example using the $C^1$-linearization method and reported an absolute error of 4.079613E−04; Kumar [21] solved it by the three-point finite difference technique and reported an absolute error of 4.4E−05.
Example 2. Consider the following 2nd-order TPSBVP [equation missing] with BCs $y(0) = 1$, $y(1) = \exp(1)$. The analytic solution is $y_a(x) = \exp(x)$; according to (8) the trial neural form of the solution is
\[ y_t(x) = 1 + (e - 1)\,x + x(x - 1)\,N(x, p). \]
The FFNN was trained using a grid of ten equidistant points in [0, 1]. Figure 2 displays the analytic and neural solutions with different training algorithms. The neural network results with different training algorithms, namely trainlm, trainbfg, and trainbr, are given in Table 5 and their errors in Table 6. Table 7 gives the training performance in terms of epochs and time, and Table 8 gives the weights and biases of the designed network.

Conclusion
From the problems above it is clear that the proposed network can handle TPSBVPs effectively and provide an accurate approximate solution throughout the whole domain and not only at the training points. As evident from the tables, the results of the proposed network are more precise than those of the methods suggested in [20,21].
In general, the practical results on FFNN show that the Levenberg-Marquardt algorithm (trainlm) has the fastest convergence, followed by trainbfg and then Bayesian regularization (trainbr). However, trainbr does not perform well on function approximation problems. The performance of the various algorithms can be affected by the accuracy required of the approximation.

Figure 2: Analytic and neural solutions of Example 2 using different training algorithms.

Table 1: Analytic and neural solutions of Example 1.

Table 3: The performance of the training with epochs and time for Example 1.

Table 4: Weights and biases of the network for different training algorithms, Example 1.

Table 5: Analytic and neural solutions of Example 2.

Table 7: The performance of the training with epochs and time for Example 2.

Table 8: Weights and biases of the network for different training algorithms, Example 2.