Comparison of Artificial Neural Network Architecture in Solving Ordinary Differential Equations

This paper investigates the solution of Ordinary Differential Equations (ODEs) with initial conditions using a Regression Based Algorithm (RBA) and compares the results obtained with arbitrary and regression-based initial weights for different numbers of nodes in the hidden layer. Here, we have used a feed forward neural network and the error back propagation method for minimizing the error function and for modifying the parameters (weights and biases). Initial weights are taken as combinations of random weights and weights given by the proposed regression-based model. We present the method for solving a variety of problems and compare the results. The number of nodes in the hidden layer is fixed according to the degree of the polynomial in the regression fitting. For this, the input and output data are first fitted with polynomials of various degrees using regression analysis, and the coefficients involved are taken as initial weights to start the neural training. For the example problems, the analytical results have been compared with neural results with arbitrary and regression-based weights with four, five, and six nodes in the hidden layer and are found to be in good agreement.


Introduction
Differential equations play a vital role in various fields of engineering and science. An exact solution of a differential equation is not always possible [1]. So various well-known numerical methods, such as the Euler, Runge-Kutta, predictor-corrector, finite element, and finite difference methods, are used for solving these equations. Although these numerical methods provide good approximations to the solution, they may be challenging for higher dimensional problems. In recent years, many researchers have tried to find new methods for solving differential equations. As such, Artificial Neural Network (ANN) based models are used here to solve ordinary differential equations with initial conditions.
Lee and Kang [2] first introduced a method to solve first order differential equations using Hopfield neural network models. Then, Meade and Fernandez [3,4] proposed another approach for both linear and nonlinear differential equations using $B_1$-splines and feed forward neural networks. Artificial neural networks based on the Broyden-Fletcher-Goldfarb-Shanno (BFGS) optimization technique for solving ordinary and partial differential equations have been presented by Lagaris et al. [5]. Lagaris et al. [6] also investigated neural network methods for boundary value problems with irregular boundaries. Parisi et al. [7] presented an unsupervised feed forward neural network for the solution of differential equations. The potential of a hybrid optimization technique to deal with differential equations of lower as well as higher order has been presented by Malek and Shekari Beidokhti [8]. Choi and Lee [9] compared the generalization ability of back propagation and reformulated radial basis function networks in solving differential equations. Yazdi et al. [10] used an unsupervised kernel least mean square algorithm for solving ordinary differential equations. A new algorithm for solving matrix Riccati differential equations has been developed by Selvaraju and Abdul Samant [11]. He et al. [12] investigated a class of partial differential equations using a multilayer neural network. Kumar and Yadav [13] surveyed multilayer perceptron and radial basis function neural network methods for the solution of differential equations. Tsoulos et al. [14] solved differential equations with neural networks using a scheme based on grammatical evolution. The numerical solution of elliptic partial differential equations using radial basis function neural networks has been presented by Jianyu et al. [15]. Shirvany et al. [16] proposed multilayer perceptron and radial basis function (RBF) neural networks with a new unsupervised training method for the numerical solution of partial differential equations. Mai-Duy and Tran-Cong [17] discussed the numerical solution of differential equations using multiquadric radial basis function networks. A fuzzy linguistic model in a neural network to solve differential equations is presented by Leephakpreeda [18]. Franke and Schaback [19] solved partial differential equations by collocation using radial basis functions. Smaoui and Al-Enezi [20] presented the dynamics of two nonlinear partial differential equations using artificial neural networks. Differential equations with genetic programming have been analyzed by Tsoulos and Lagaris [21]. McFall and Mahan [22] used an artificial neural network for the solution of boundary value problems with exact satisfaction of arbitrary boundary conditions. Hoda and Nagla [23] solved mixed boundary value problems using a multilayer perceptron neural network method.
A review of the literature reveals that authors have generally taken the parameters (weights/biases) as arbitrary (random) and chosen the number of nodes in the hidden layer by trial and error. In this paper, we propose a method for solving ordinary differential equations using a feed forward neural network as the basic approximation element and the error back propagation algorithm [24,25], fixing the hidden nodes as per the required accuracy. The trial solution of the model is generated by training the algorithm. The approximate solution by ANN has several benefits compared with traditional numerical methods. The ANN trial solution is written as the sum of two terms: the first satisfies the initial/boundary conditions and the second involves a regression-based neural network with adjustable parameters. The computational complexity does not increase considerably with the number of sampling points. The method is general, so it can be applied to linear and nonlinear ordinary and partial differential equations. The parameters are modified without direct use of an optimization technique, which requires computing the gradient of the error with respect to the network parameters. A regression-based artificial neural network with combinations of initial weights (arbitrary and regression based) in the connections was first proposed by Chakraverty et al. [26] and then by Singh et al. [27]. Here, the number of nodes in the hidden layer may be fixed according to the degree of polynomial required for the accuracy. We have considered a first order problem and an application problem, namely a damped free vibration problem, to compare the different ANN models. Mall and Chakraverty [28] proposed a regression-based neural network model for solving ordinary differential equations.
The rest of the paper is organized as follows. In Section 2, we describe the general formulation of the proposed approach and the computation of the gradient of the error function. Section 3 gives details of the problem formulation and the construction of the appropriate form of trial solution. The proposed regression-based artificial neural network method is presented in Section 4. Numerical examples and their results are presented in Section 5, where we compare the results for arbitrary and regression-based weights and show them graphically. Section 6 contains the discussion and analysis. Lastly, the conclusion is outlined in Section 7.

General Formulation for Differential Equations
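Let us consider the following general differential equation, which represents both ordinary and partial differential equations [4]:

$$G\left(x, \Psi(x), \nabla\Psi(x), \nabla^2\Psi(x), \ldots\right) = 0, \quad x \in D,$$

subject to some initial or boundary conditions, where $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, $D \subset \mathbb{R}^n$ denotes the domain, and $\Psi(x)$ is the solution to be computed. Here, $G$ is the function which defines the structure of the differential equation and $\nabla$ is a differential operator. For the solution of the differential equation, a discretized domain $\hat{D}$ over a finite set of points in $\mathbb{R}^n$ is considered. Thus, the problem is transformed into the following system of equations:

$$G\left(x_q, \Psi(x_q), \nabla\Psi(x_q), \nabla^2\Psi(x_q), \ldots\right) = 0, \quad x_q \in \hat{D}.$$

Let $\Psi_t(x, p)$ denote the trial solution with adjustable parameters (weights, biases) $p$; then the problem may be formulated as

$$G\left(x_q, \Psi_t(x_q, p), \nabla\Psi_t(x_q, p), \nabla^2\Psi_t(x_q, p), \ldots\right) = 0, \quad x_q \in \hat{D}.$$

The trial solution is written as the sum of two terms,

$$\Psi_t(x, p) = A(x) + F(x, N(x, p)),$$

where $A(x)$ satisfies the initial or boundary conditions and contains no adjustable parameters, whereas $N(x, p)$ is the output of the feed forward neural network with parameters $p$ and input data $x$. The second term $F(x, N(x, p))$ makes no contribution to the initial or boundary conditions; it involves the neural network model whose weights and biases are adjusted to minimize the error function.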

2.1. Computation of the Gradient. The error computation involves not only the network output but also the derivatives of the network output with respect to its inputs. So, it requires finding the gradient of the derivatives of the network output with respect to its inputs. Let us now consider a multilayered perceptron with one input node, a hidden layer with $m$ nodes (a fixed number of nodes, as proposed), and one output unit. For the given inputs $x = (x_1, x_2, \ldots, x_n)$, the output is given by

$$N(x, p) = \sum_{j=1}^{m} v_j \, s(z_j),$$

where $z_j = \sum_{i=1}^{n} w_{ji} x_i + u_j$, $w_{ji}$ denotes the weight from input unit $i$ to hidden unit $j$, $v_j$ denotes the weight from hidden unit $j$ to the output unit, $u_j$ denotes the biases, and $s(z_j)$ is the sigmoid activation function.
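The $k$th derivative of $N(x, p)$ with respect to the input $x_i$ is

$$\frac{\partial^k N}{\partial x_i^k} = \sum_{j=1}^{m} v_j \, w_{ji}^k \, s_j^{(k)},$$

where $s_j = s(z_j)$ and $s^{(k)}$ denotes the $k$th order derivative of the sigmoid function.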
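As a minimal sketch of the expressions above, assuming a single input and hypothetical weight values chosen only for illustration, the following Python code evaluates the network output and its first two input derivatives in closed form and checks them against central finite differences. The function and variable names are ours and are not taken from the original implementation.

```python
import math


def sigmoid(z):
    """Sigmoid activation s(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def network(x, w, u, v):
    """Return N, dN/dx and d2N/dx2 for a single-input, one-hidden-layer network."""
    N = dN = d2N = 0.0
    for wj, uj, vj in zip(w, u, v):
        s = sigmoid(wj * x + uj)
        s1 = s * (1.0 - s)          # first derivative of the sigmoid
        s2 = s1 * (1.0 - 2.0 * s)   # second derivative of the sigmoid
        N += vj * s
        dN += vj * wj * s1
        d2N += vj * wj ** 2 * s2
    return N, dN, d2N


# Hypothetical weights for three hidden nodes (illustration only).
w = [0.5, -1.2, 0.8]
u = [0.1, 0.3, -0.4]
v = [1.0, -0.7, 0.25]

x, h = 0.6, 1e-5
N, dN, d2N = network(x, w, u, v)

# Central finite differences as an independent check of the closed forms.
fd1 = (network(x + h, w, u, v)[0] - network(x - h, w, u, v)[0]) / (2 * h)
fd2 = (network(x + h, w, u, v)[1] - network(x - h, w, u, v)[1]) / (2 * h)
```

The closed-form values `dN` and `d2N` should match `fd1` and `fd2` up to the truncation error of the finite differences.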
The derivative of the network output with respect to its inputs satisfies the relation given in [4], from which its derivatives with respect to the remaining parameters (weights and biases) are obtained.
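For the single-input case used in the ODE formulations of this paper, these parameter derivatives can be written out explicitly. The block below is a sketch under assumed notation ($w_j$, $u_j$, $v_j$ for the input weight, bias, and output weight of hidden node $j$, and $s$ for the sigmoid); it follows from direct differentiation and is not a transcription of the relations in [4].

```latex
% Single-input network (assumed notation):
N(x, p) = \sum_{j=1}^{m} v_j \, s(z_j), \qquad z_j = w_j x + u_j,
\qquad \frac{dN}{dx} = \sum_{j=1}^{m} v_j \, w_j \, s'(z_j).

% Derivatives of dN/dx with respect to the parameters of hidden node j:
\frac{\partial}{\partial v_j}\!\left(\frac{dN}{dx}\right) = w_j \, s'(z_j), \qquad
\frac{\partial}{\partial u_j}\!\left(\frac{dN}{dx}\right) = v_j \, w_j \, s''(z_j), \qquad
\frac{\partial}{\partial w_j}\!\left(\frac{dN}{dx}\right) = v_j \, s'(z_j) + v_j \, w_j \, x \, s''(z_j).

% Sigmoid derivatives used above:
s'(z) = s(z)\bigl(1 - s(z)\bigr), \qquad s''(z) = s'(z)\bigl(1 - 2 s(z)\bigr).
```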

Formulation of First Order Ordinary Differential Equation
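Let us consider a first order ordinary differential equation of the form

$$\frac{d\Psi}{dx} = f(x, \Psi), \quad x \in [a, b],$$

with initial condition $\Psi(a) = A$.
In this case, the ANN trial solution may be written as

$$\Psi_t(x, p) = A + (x - a) N(x, p),$$

where $N(x, p)$ is the neural output of the feed forward network with one input $x$ and parameters $p$. The trial solution $\Psi_t(x, p)$ satisfies the initial condition. We differentiate the trial solution $\Psi_t(x, p)$ to get

$$\frac{d\Psi_t(x, p)}{dx} = N(x, p) + (x - a)\frac{dN(x, p)}{dx}. \tag{15}$$

For evaluating the derivative term on the right-hand side of (15), we use (5)-(11).
The error function for this case may be formulated as

$$E(p) = \frac{1}{2}\sum_{q=1}^{h}\left(\frac{d\Psi_t(x_q, p)}{dx} - f\bigl(x_q, \Psi_t(x_q, p)\bigr)\right)^2,$$

where the sum runs over the $h$ training points $x_q$. The weights from input to hidden are modified according to the following rule:

$$w_j^{k+1} = w_j^k + \Delta w_j^k,$$

where

$$\Delta w_j^k = -\eta \, \frac{\partial E}{\partial w_j^k}.$$

Here, $\eta$ is the learning rate and $k$ is the iteration step. The weights from hidden to output layer are updated in the same way as those from input to hidden.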
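The formulation above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the test problem $d\Psi/dx = -\Psi$, $\Psi(0) = 1$ is our own assumption, chosen because its residual gradients are short to write out, and the learning rate, number of steps, random seed, and all function names are hypothetical. Initial weights are random here; regression-based initial weights would simply replace the random draw.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def trial(x, w, u, v):
    """Trial solution Psi_t(x) = 1 + x * N(x, p), so that Psi_t(0) = 1."""
    N = sum(vj * sigmoid(wj * x + uj) for wj, uj, vj in zip(w, u, v))
    return 1.0 + x * N


def error_and_gradients(xs, w, u, v):
    """Return E = 1/2 * sum r^2 and dE/dw, dE/du, dE/dv for dPsi/dx = -Psi."""
    m = len(w)
    gw, gu, gv = [0.0] * m, [0.0] * m, [0.0] * m
    E = 0.0
    for x in xs:
        s = [sigmoid(w[j] * x + u[j]) for j in range(m)]
        s1 = [s[j] * (1.0 - s[j]) for j in range(m)]          # s'
        s2 = [s1[j] * (1.0 - 2.0 * s[j]) for j in range(m)]   # s''
        N = sum(v[j] * s[j] for j in range(m))
        dN = sum(v[j] * w[j] * s1[j] for j in range(m))
        # Residual r = dPsi_t/dx - f(x, Psi_t), with f = -Psi_t (assumed problem).
        r = (N + x * dN) + (1.0 + x * N)
        E += 0.5 * r * r
        for j in range(m):
            # dr/dp = (1 + x) * dN/dp + x * d(dN/dx)/dp for each parameter p.
            gv[j] += r * ((1.0 + x) * s[j] + x * w[j] * s1[j])
            gu[j] += r * v[j] * ((1.0 + x) * s1[j] + x * w[j] * s2[j])
            gw[j] += r * v[j] * ((1.0 + x) * x * s1[j]
                                 + x * (s1[j] + w[j] * x * s2[j]))
    return E, gw, gu, gv


def train(m=5, points=10, eta=0.01, steps=3000, seed=0):
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    u = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    v = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    xs = [i / (points - 1) for i in range(points)]  # equidistant points in [0, 1]
    initial_error = error_and_gradients(xs, w, u, v)[0]
    for _ in range(steps):
        _, gw, gu, gv = error_and_gradients(xs, w, u, v)
        # Gradient-descent update: p <- p - eta * dE/dp.
        w = [p - eta * g for p, g in zip(w, gw)]
        u = [p - eta * g for p, g in zip(u, gu)]
        v = [p - eta * g for p, g in zip(v, gv)]
    final_error = error_and_gradients(xs, w, u, v)[0]
    return (w, u, v), initial_error, final_error


params, initial_error, final_error = train()
```

Once trained, `trial(x, *params)` can be evaluated at any point of the domain, not only at the training points.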

Formulation of Second Order Ordinary Differential Equation.
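In this case, the second order ordinary differential equation may be written in general as

$$\frac{d^2\Psi}{dx^2} = f\left(x, \Psi, \frac{d\Psi}{dx}\right), \quad x \in [a, b],$$

with initial conditions $\Psi(a) = A$, $\Psi'(a) = A'$. The ANN trial solution may be written as

$$\Psi_t(x, p) = A + A'(x - a) + (x - a)^2 N(x, p),$$

where $N(x, p)$ is the neural output of the feed forward network with one input $x$ and parameters $p$, and the trial solution $\Psi_t(x, p)$ satisfies the initial conditions. The error function to be minimized for the second order ordinary differential equation will be

$$E(p) = \frac{1}{2}\sum_{q=1}^{h}\left(\frac{d^2\Psi_t(x_q, p)}{dx^2} - f\left(x_q, \Psi_t(x_q, p), \frac{d\Psi_t(x_q, p)}{dx}\right)\right)^2,$$

where the sum runs over the $h$ training points $x_q$. Next, the following weight updating rule is applied for the weights from input to hidden connections:

$$w_j^{k+1} = w_j^k + \Delta w_j^k,$$

where

$$\Delta w_j^k = -\eta \, \frac{\partial E}{\partial w_j^k}.$$

Again, we update the weights from hidden to output layer as discussed for input to hidden.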
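As a worked check, and assuming the standard trial form $\Psi_t(x, p) = A + A'(x - a) + (x - a)^2 N(x, p)$ for initial conditions $\Psi(a) = A$, $\Psi'(a) = A'$ (our notation), differentiating twice gives the quantities that enter the error function and confirms that both initial conditions hold for any network parameters:

```latex
\Psi_t(x, p) = A + A'(x - a) + (x - a)^2 N(x, p)

\frac{d\Psi_t}{dx} = A' + 2(x - a)\,N + (x - a)^2 \frac{dN}{dx}

\frac{d^2\Psi_t}{dx^2} = 2N + 4(x - a)\frac{dN}{dx} + (x - a)^2 \frac{d^2N}{dx^2}

% At x = a every term containing (x - a) vanishes, so
\Psi_t(a, p) = A, \qquad \left.\frac{d\Psi_t}{dx}\right|_{x = a} = A'.
```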

Proposed Regression-Based Algorithm
A three-layer ANN architecture is considered for the present problem. Usually the number of nodes in the hidden layer is chosen by trial and error. Here, we fix the number of nodes in the hidden layer by using regression-based weight generation [24,25]. Figure 1 shows the proposed model, in which the input layer consists of a single input unit and the output layer consists of one output unit. The number of nodes in the hidden layer is fixed according to the degree of the polynomial considered. If an $n$th degree polynomial is considered, then the number of nodes in the hidden layer will be $n + 1$, and the coefficients (constants) of the polynomial may be taken as initial weights from input to hidden as well as hidden to output layers, or in any combination of random and regression-based weights. The network architecture for a fifth degree polynomial is shown in Figure 1: the six coefficients (constants) are taken as initial weights in two stages, from input to hidden and from hidden to output layer, and the hidden layer has six nodes, one for each constant.
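The weight-generation step can be sketched as follows. This is a minimal illustration under our own assumptions: the polynomial is fitted by ordinary least squares via the normal equations, the sample data are hypothetical, and the coefficients are copied unchanged into both weight vectors; the exact way the original work maps coefficients to individual connections may differ.

```python
def polyfit_least_squares(xs, ys, degree):
    """Fit y ~ a_0 + a_1 x + ... + a_n x^n by solving the normal equations."""
    n = degree + 1
    # Normal equations (X^T X) a = X^T y for the Vandermonde matrix X.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


# Hypothetical input/output data generated from a known cubic.
xs = [i / 10 for i in range(11)]
ys = [1.0 + 2.0 * x + 3.0 * x ** 2 - x ** 3 for x in xs]

coeffs = polyfit_least_squares(xs, ys, degree=3)

# A degree-n fit yields n + 1 coefficients, hence n + 1 hidden nodes.
hidden_nodes = len(coeffs)
initial_w = list(coeffs)  # assumed: input-to-hidden initial weights
initial_v = list(coeffs)  # assumed: hidden-to-output initial weights
```

With a third degree polynomial this gives four hidden nodes; fourth and fifth degree fits give the five- and six-node architectures compared in the examples.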

Numerical Examples
In this section, we present solutions of the example problems mentioned earlier. In all cases, we have used the error back propagation algorithm and one hidden layer. The weights are taken as arbitrary and regression based for comparison of the training methods. The sigmoid function $s(x) = 1/(1 + e^{-x})$ is used as the activation function for the hidden units.
Example 1. Let us consider the first order ordinary differential equation as follows: with initial condition $\Psi(0) = 1$.
The trial solution is written as

$$\Psi_t(x, p) = 1 + x N(x, p).$$

We have trained the network for 20 equidistant points in $[0, 1]$ and compared the analytical and neural results with arbitrary and regression-based weights for four, five, and six nodes fixed in the hidden layer. This comparison is given in Table 1. Analytical results are given in the second column. Neural results for arbitrary weights $w(A)$ (from input to hidden layer) and $v(A)$ (from hidden to output layer) with four, five, and six nodes are cited in the third, fifth, and seventh columns, respectively. Similarly, neural results with regression-based weights $w(R)$ (from input to hidden layer) and $v(R)$ (from hidden to output layer) with four, five, and six nodes are given in the fourth, sixth, and ninth columns, respectively. Analytical and neural results with arbitrary and regression-based weights for six nodes in the hidden layer are compared in Figures 2 and 3. The error plot is shown in Figure 4. Absolute deviations in percent are given in Table 1: the maximum deviation for the neural results with arbitrary weights (six hidden nodes) is 3.67% (eighth column), and for regression-based weights it is 1.47% (tenth column). From Figures 2 and 3, one may see that the results with regression-based weights agree closely with the analytical results at all points, whereas the results with arbitrary weights do not. Thus, the neural results with regression-based weights are more accurate.
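The deviation columns can be computed along the following lines. This is a small sketch that assumes the absolute deviation in percent is taken relative to the analytical value; the sample numbers are hypothetical and are not taken from Table 1.

```python
def percent_deviation(analytical, neural):
    """Absolute deviation of each neural value, in percent of the analytical value."""
    return [100.0 * abs(a - n) / abs(a) for a, n in zip(analytical, neural)]


# Hypothetical values for illustration only (not taken from Table 1).
analytical = [1.00, 0.80, 0.50]
neural = [1.00, 0.82, 0.49]

deviations = percent_deviation(analytical, neural)
max_deviation = max(deviations)
```

The maximum over the training points is the figure quoted when two weight initializations are compared.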
It may be seen that the results improve as the number of nodes in the hidden layer is increased from four to six. The authors also increased the number of nodes in the hidden layer beyond six, but the results did not improve further.
The first problem has also been solved by well-known numerical methods, namely the Euler and Runge-Kutta methods. Table 2 shows a comparison between the neural results (with six hidden nodes) and these numerical results.
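For reference, the two classical schemes used in this comparison can be sketched as below. The test equation $dy/dx = -y$, $y(0) = 1$ and the step size are our own assumptions for illustration, not the equation of Example 1, and the fourth-order variant of the Runge-Kutta method is assumed.

```python
import math


def euler(f, x0, y0, h, steps):
    """Explicit Euler method: y_{n+1} = y_n + h * f(x_n, y_n)."""
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)
        x += h
    return y


def rk4(f, x0, y0, h, steps):
    """Classical fourth-order Runge-Kutta method."""
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


def f(x, y):
    # Assumed test equation dy/dx = -y with exact solution y = exp(-x).
    return -y


exact = math.exp(-1.0)
euler_error = abs(euler(f, 0.0, 1.0, 0.1, 10) - exact)
rk4_error = abs(rk4(f, 0.0, 1.0, 0.1, 10) - exact)
```

Both schemes return values only at the grid points; obtaining the solution between steps requires rerunning with a different step size, which is the repetition discussed later in the paper.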
As discussed above, we can write the trial solution as Then, the network is trained for 40 equidistant points in $[0, 4]$ with four, five, and six hidden nodes according to the arbitrary and regression-based algorithms. In Table 3, we compare the analytical solutions with the neural solutions obtained with arbitrary and regression-based weights for four, five, and six nodes in the hidden layer. Here, analytical results are cited in the second column of Table 3. Neural results for arbitrary weights $w(A)$ (from input to hidden layer) and $v(A)$ (from hidden to output layer) with four, five, and six nodes are shown in the third, fifth, and seventh columns, respectively. Neural results with regression-based weights $w(R)$ (from input to hidden layer) and $v(R)$ (from hidden to output layer) with four, five, and six nodes are cited in the fourth, sixth, and eighth columns, respectively. Analytical and neural results obtained for random initial weights are depicted in Figure 5. Figure 6 shows the comparison between analytical and neural results for regression-based initial weights with six hidden nodes. Finally, the error plot between analytical and RBNN results is shown in Figure 7.
Example 3. Now we consider an initial value problem as follows: subject to $\Psi(0) = 0$.
The ANN trial solution is written as

$$\Psi_t(x, p) = x N(x, p).$$

The network is trained for ten equidistant points in the given domain with four, five, and six hidden nodes according to the arbitrary and regression-based algorithms. Analytical and traditional neural results obtained using random initial weights with six nodes are depicted in Figure 8. Similarly, Figure 9 shows the comparison between analytical and neural results with regression-based initial weights for six hidden nodes. Finally, the error plot between analytical and RBNN results is shown in Figure 10.

The comparison of analytical and neural results with arbitrary and regression-based weights is shown in Table 4.
with initial condition $\Psi(0) = 1$.
Here 1/ represents time constant or characteristic time. Analytic result may be found as Considering  = 1, we have the analytical solution as  =   . The ANN trial solution in this case is Now, the network is trained for ten equidistant points in the domain $[0, 1]$ with four, five, and six hidden nodes according to the arbitrary and regression-based algorithms. The comparison of analytical and neural results with arbitrary ($w(A)$, $v(A)$) and regression-based ($w(R)$, $v(R)$) weights is given in Table 5. Analytical and traditional neural results obtained using random initial weights with six nodes are shown in Figure 11. Figure 12 depicts the comparison between analytical and neural results with regression-based initial weights for six hidden nodes. The error plot between analytical and RBNN results is shown in Figure 13.


Discussion and Analysis
In a traditional artificial neural network, the parameters (weights/biases) are usually taken as arbitrary (random) and the number of nodes in the hidden layer is chosen by trial and error. Also, a few authors have used optimization techniques to minimize the error. In this investigation, a regression-based artificial neural network with combinations of initial weights (arbitrary and regression based) in the connections is considered. We have fixed the number of nodes in the hidden layer according to the degree of the polynomial in the regression fitting. The initial weights from input to hidden and hidden to output layer are obtained by regression-based weight generation. The back propagation algorithm has been employed for modification of the parameters without use of any optimization technique. Also, the time of computation is less than that of the traditional artificial neural architecture. Table 6 shows the training time in hours with four, five, and six hidden nodes. It is well known that the other numerical methods are usually iterative in nature, where we fix the step size before the start of the computation. After the solution is obtained, if we want to know the solution in between steps, then the procedure has to be repeated from the initial stage. ANN may offer relief from this repetition of iterations. The authors are not claiming that the method presented is the most accurate. As may be seen from the comparison in Tables 2 and 4, the Runge-Kutta method gives better results, but the above repetition is required for each step size. Here, after getting the converged ANN, we may use it as a black box to get numerical results at any arbitrary point in the domain.
Here, we have considered third, fourth, and fifth degree polynomials for the regression fitting. One may consider higher degree polynomials in the simulation, but it has been seen that increasing the degree of the polynomial does not usually increase the accuracy. A methodology is still needed for deciding what degree of polynomial one should use to get a result with acceptable accuracy. This is, however, beyond the scope of this paper; the authors are working in this direction and hope to communicate the findings in the future.


Conclusion
This paper presents a new approach to solving ordinary differential equations using a regression-based artificial neural network model. The accuracy of the proposed method has been examined by solving a first order problem and a second order damped free vibration problem. The main value of the paper is that the number of nodes in the hidden layer is fixed according to the degree of the polynomial in the regression. Accordingly, comparisons of different neural architectures corresponding to different regression models are investigated. Moreover, the algorithm is unsupervised, and the error back propagation algorithm is used to minimize the error function. The corresponding initial weights from input to hidden and hidden to output are all obtained by the proposed procedure. The trial solution is in closed form and differentiable. One may see from the tables and graphs that the initial weights generated by the regression model make the results more accurate. Lastly, it may be mentioned that the implemented Regression Based Neural Network (RBNN) algorithm is simple, computationally efficient, and straightforward.

Figure 1: Three-layered neural network architecture with single input and single output node.

Figure 8: Plot of comparison between (analytical results) and (neural results) with arbitrary weights (for six nodes) (Example 3).

Figure 11: Plot of comparison between (analytical results) and (neural results) with arbitrary weights (for six nodes) (Example 4).

Table 1: Analytical and neural solutions with arbitrary and regression-based weights (Example 1).

Table 4 .
Also, other numerical results, namely, Euler and Runge-Kutta results are compared with neural results in this table.

Table 2: Comparison of the results (Example 1).

Table 3: Analytical and neural solutions with arbitrary and regression-based weights (Example 2).

Table 4: Analytical and neural solutions with arbitrary and regression-based weights (Example 3).

Table 5: Analytical and neural solutions with arbitrary and regression-based weights (Example 4).

Table 6: Time of computation.