Development of Artificial Neural-Network-Based Models for the Simulation of Spring Discharge

The present study demonstrates the application of artificial neural networks (ANNs) to predicting weekly spring discharge. The study was based on the weekly discharge of a spring located near Ranichauri in the Tehri Garhwal district of Uttarakhand, India. Five models were developed for predicting the spring discharge at a weekly interval using rainfall, evaporation, and temperature with a specified lag time. All models were developed with both one and two hidden layers. Each model was developed through many trials with different network architectures and different numbers of hidden neurons; the best predicting network is presented for each model. The models were trained with three different algorithms, namely, the quick-propagation algorithm, the batch backpropagation algorithm, and the Levenberg-Marquardt algorithm, using weekly data from 1999 to 2005. The best model for the simulation was selected from the three algorithms using statistical criteria, namely, the correlation coefficient (R) and the determination coefficient, or Nash-Sutcliffe efficiency (DC). Finally, an optimized number of neurons was adopted for the best model. Training and testing results revealed that the models predicted the weekly spring discharge satisfactorily. Based on these criteria, the ANN-based models gave better agreement for the computation of spring discharge. Linear multiple regression (LMR) models were also developed in the study and gave good results, but the ANN methodology performed better in comparison.


Introduction
The process of discharge from a spring is complex and highly nonlinear, with temporal and spatial variability. Weekly spring discharge modeling has a vital role in better management of water resources. Many models, such as black-box, conceptual, and physically based models, have been developed, especially for the rainfall, runoff, and sediment processes. On the other hand, very few models are available for accurate estimation of spring discharge, and, in many situations, simple tools such as linear theoretical models or black-box models have been used with advantage. However, these models fail to represent nonlinear processes such as rainfall, runoff, and sediment yield [3]. In recent years, the successful application of artificial intelligence techniques such as ANNs has added a new dimension to the modeling of such complex systems and has been used to solve various problems in hydrologic/hydraulic engineering and water resources engineering [1,2]. A further development in recent years for better prediction and model development is hybrid modeling, such as ANFIS, ANN-fuzzy logic, and ANN-GA [3]. Unlike mathematical models that require precise knowledge of all contributing variables, a trained artificial neural network can estimate process behavior. Neural networks are reported to have a strong generalization ability, which means that, once they have been properly trained, they are able to provide accurate results even for cases they have never seen before [4,5].
The artificial neural network, a soft computing tool, is basically a black-box model and has its own limitations [6]. The main advantage of the ANN approach over traditional methods is that it does not require the complex nature of the underlying process to be explicitly described in mathematical form [7]. Other advantages of ANNs over conventional models are discussed in [8]. Mathematically, an ANN can be treated as a universal approximator with the ability to learn from examples without explicit physics [1,2,9]. Such a model is easy to develop and yields satisfactory results when applied to complex systems that are poorly defined or implicitly understood. ANNs are also more tolerant of variable, incomplete, or ambiguous input data. The applications of ANNs in water engineering include modeling of the rainfall-runoff-sediment process, forecasting of rainfall-runoff-sediment yield, river flow forecasting, reservoir inflow modeling and operation, setting up river stage-discharge relationships, water-supply system optimization, the evapotranspiration process, drought forecasting, groundwater quality prediction, and groundwater remediation [7,10-23].

Artificial Neural Networks
Artificial neural networks are highly simplified mathematical models of biological neural networks that can learn and provide meaningful solutions to problems with high complexity and nonlinearity. The ANN approach is faster than conventional techniques, robust in noisy environments, and able to solve a wide range of problems. Owing to these advantages, ANNs have been used in numerous real-time applications. The neural networks most commonly used in hydrology are three-layered and four-layered, with an input layer, where the input is fed to the network, hidden layer(s), where the data are processed, and an output layer, where the output is presented (Figure 1).
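As a rough illustration of the layered structure described above, the following sketch passes one input vector through a network with a single hidden layer. It assumes a sigmoid activation at every node and an additive threshold term; the layer sizes, the random initial weights, and all function names are hypothetical choices made for the example, not the configuration used in the study.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, thresholds):
    """Compute the outputs of one layer.

    weights[j][i] is the weight on the link from input i to node j;
    thresholds[j] is the threshold term of node j (assumed additive here).
    """
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, thresholds)
    ]


def forward(inputs, layers):
    """Pass an input vector through a list of (weights, thresholds) layers."""
    signal = inputs
    for weights, thresholds in layers:
        signal = layer_forward(signal, weights, thresholds)
    return signal


def random_layer(n_in, n_out, rng):
    """Hypothetical initialisation: small random weights, zero thresholds."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
# Illustrative 3-4-1 network: 3 inputs, 4 hidden nodes, 1 output node.
network = [random_layer(3, 4, rng), random_layer(4, 1, rng)]
output = forward([0.2, 0.5, 0.1], network)
```

A four-layered network is obtained in the same way by adding a second hidden entry to the `network` list.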
The processing elements in each layer are called neurons or nodes. The information flow and processing in the network is from the input layer to the hidden layer and from the hidden layer to the output layer. The number of neurons and hidden layers in the network is problem dependent and is decided by trial and error. A synaptic weight is assigned to each link to represent the relative connection strength of the two nodes at its ends in predicting the input-output relationship. The output, y_j, of any node j is given as

$$y_j = f\left(\sum_{i=1}^{m} W_i X_i + b_j\right), \quad (1)$$

where X_i is the input received at node j, W_i is the input connection pathway weight, m is the total number of inputs to node j, and b_j is the node threshold. The function f is called an activation function and determines the response of a node to the total input signal it receives. The commonly used activation function is the sigmoid transfer function, given by

$$f(x) = \frac{1}{1 + e^{-x}}. \quad (2)$$

The sigmoid function is continuous and differentiable everywhere, and a nonlinear process can be mapped with it [1]. The backpropagation algorithm is most often used to train feed-forward neural networks [2]. In this algorithm, each input pattern of the training dataset is passed through the network from the input layer to the output layer. The network-generated output is compared with the target output, and the error is computed as

$$E = \sum_{P} \sum_{p} (y_i - t_i)^2, \quad (3)$$

where t_i is a component of the desired output/target T, y_i is the ANN-generated output, p is the number of output nodes, and P is the number of training patterns. This error is propagated backward through each node, and the connection weights are updated correspondingly.

Study Area and Data
... the Nainital, Dehradun, and Hardwar districts. On account of its topography, the region has peculiar geographical features and is blessed with very good land and water resources. Though the whole economy of the region is based on agriculture, the total cultivable land is only 14 percent of the total geographical area of the region, with good potential for the conservation and utilization of natural resources. The hill region in particular is very rich in natural resources, especially forests, minerals, and surface water.
The watershed of the study area drains into the Henval river in the Tehri Garhwal district of Uttarakhand state and covers an area of 871 ha (8.71 km²). It is located between 78°22′28″ and 78°24′57″ E longitude and 30°17′19″ and 30°18′52″ N latitude. The elevation varies from 960 to 2000 m above mean sea level (MSL). The study spring is located at 78°24′34″ E and 30°18′47″ N at an elevation of 1844 m above MSL in a dense forest area. For the present study, weekly data of spring discharge, rainfall, evaporation, and temperature were collected for seven years, from 1999 to 2005. The study area was surveyed with GPS, and the location (latitude and longitude) of the Hill campus spring was recorded. The same locations were marked on the map in a GIS environment, as shown in Figure 2. The spring layer was overlaid on a DEM to obtain the height of the natural springs above MSL. This spring has a relatively small catchment area because it is near the watershed divide.

Model Development and Methodology
The output of the model is the spring discharge at the current time step, Q_t. The spring discharge was mapped by considering, in addition to the current value of discharge, the discharge at previous time steps. Therefore, in addition to Q_t, other variables such as [...] and T_{t−1}, T_{t−2} were also considered for training the ANNs; the variables considered in the present study are given in Table 1. For the one-hidden-layer architectures, 239 weeks were used for training, 55 weeks for testing, and 55 weeks for validation; for the two-hidden-layer architectures, 234 weeks were used for training, 54 weeks for testing, and 54 weeks for validation, for the best possible model selection.
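The construction of lagged inputs and the three-way split can be sketched as follows. This is only an illustration under stated assumptions: the series are synthetic placeholders, the lag choices in `lags` are hypothetical, and the split is assumed to be taken in time order, which the text does not state. Only the 239/55/55 partition sizes come from the description above.

```python
def lagged_rows(series_by_name, lags_by_name, target_name):
    """Build (inputs, target) pairs from weekly series.

    series_by_name maps a variable name to its list of weekly values;
    lags_by_name maps a variable name to the lags (in weeks) used as inputs.
    """
    max_lag = max(lag for lags in lags_by_name.values() for lag in lags)
    n_weeks = len(series_by_name[target_name])
    rows = []
    for t in range(max_lag, n_weeks):
        inputs = [
            series_by_name[name][t - lag]
            for name, lags in lags_by_name.items()
            for lag in lags
        ]
        rows.append((inputs, series_by_name[target_name][t]))
    return rows


def split_rows(rows, n_train, n_test, n_valid):
    """Split rows, in time order, into training, testing, and validation sets."""
    if n_train + n_test + n_valid > len(rows):
        raise ValueError("not enough rows for the requested split")
    train = rows[:n_train]
    test = rows[n_train:n_train + n_test]
    valid = rows[n_train + n_test:n_train + n_test + n_valid]
    return train, test, valid


# Synthetic placeholder data (not the study's measurements).
weeks = 351
data = {
    "R": [float(w % 7) for w in range(weeks)],      # rainfall
    "T": [15.0 + (w % 10) for w in range(weeks)],   # temperature
    "Q": [1.0 + 0.01 * w for w in range(weeks)],    # spring discharge
}
lags = {"R": [0, 1], "T": [1, 2], "Q": [1]}  # hypothetical lag choice
rows = lagged_rows(data, lags, "Q")
train, test, valid = split_rows(rows, 239, 55, 55)
```

The first two weeks are dropped because the largest lag is two weeks, so 351 synthetic weeks yield 349 usable rows.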
In the present study, feed-forward ANN models trained with the quick-propagation, batch backpropagation, and Levenberg-Marquardt algorithms were used for the simulation of the spring discharge. The input-output datasets were first normalized by the maximum value of each series, which reduces the individual variables to the range 0 to 1 and avoids the saturation effect that can arise with the sigmoid activation function.
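A minimal sketch of the normalization step is given below. It assumes each series is non-negative, so that dividing by the series maximum gives values between 0 and 1; the function names and the sample values are made up for the example.

```python
def normalize_by_max(series):
    """Scale a non-negative series into [0, 1] by dividing by its maximum."""
    peak = max(series)
    if peak <= 0:
        raise ValueError("series maximum must be positive")
    return [value / peak for value in series], peak


def denormalize(scaled, peak):
    """Map scaled values back to the original units."""
    return [value * peak for value in scaled]


discharge = [2.0, 5.0, 10.0, 4.0]  # made-up values
scaled, peak = normalize_by_max(discharge)
restored = denormalize(scaled, peak)
```

Keeping the maximum (`peak`) alongside the scaled series allows network outputs to be converted back to discharge units after simulation.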
The sigmoid function was the activation function used in the present study. In the quick-propagation algorithm, a constant quick-propagation coefficient of 1.75 and a learning rate of 0.8 were selected by trial and error for better optimization. The weights were updated after presenting each pattern from the learning dataset, rather than once per iteration. In the batch backpropagation algorithm, a constant learning rate (η) of 0.15 was used in the model development, with a momentum rate (α) in the range 0.6 to 0.9; moreover, a local-minima avoidance concept was used in the Levenberg-Marquardt algorithm.
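The role of the learning rate and momentum rate in batch backpropagation can be illustrated with a small sketch. It assumes the commonly used update form, step = -η × gradient + α × previous step, and applies it to a one-weight least-squares problem with made-up data rather than to a full network. The values η = 0.15 and α = 0.6 are taken from the text; the function names and the toy problem are hypothetical.

```python
def batch_gradient(w, xs, ys):
    """Gradient of E = sum((w*x - y)^2), accumulated over the whole batch."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


def train_with_momentum(xs, ys, eta=0.15, alpha=0.6, epochs=200):
    """Fit y = w*x by batch gradient descent with a momentum term."""
    w, step = 0.0, 0.0
    for _ in range(epochs):
        grad = batch_gradient(w, xs, ys)
        # New step = gradient-descent step plus a fraction of the last step.
        step = -eta * grad + alpha * step
        w += step
    return w


# Toy data generated from y = 2x, so the fitted weight should approach 2.
xs = [0.1, 0.4, 0.7, 1.0]
ys = [0.2, 0.8, 1.4, 2.0]
w = train_with_momentum(xs, ys)
w_high_momentum = train_with_momentum(xs, ys, alpha=0.9)
```

On this toy problem both momentum settings converge toward the same weight; the larger momentum rate oscillates for longer before settling.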
The number of input nodes in the input layer was taken equal to the number of input variables. Since no guideline is yet available on the number of hidden nodes in the hidden layer (Vemuri, 1992) [9], the number of hidden nodes was initially varied from the number of input nodes to double the number of input nodes (Hipel et al., 1994) [5]. Because there is only one output, only one node was used in the output layer.
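The trial range for the hidden layer can be enumerated with a few lines of code. The sketch below lists single-hidden-layer layouts with between n and 2n hidden nodes for n inputs; the selection helper and its `score` argument are hypothetical stand-ins for whatever training-and-evaluation routine is used in the trials.

```python
def candidate_architectures(n_inputs, n_outputs=1):
    """List (inputs, hidden, outputs) layouts with n to 2n hidden nodes."""
    return [
        (n_inputs, n_hidden, n_outputs)
        for n_hidden in range(n_inputs, 2 * n_inputs + 1)
    ]


def select_best(candidates, score):
    """Return the candidate with the highest score.

    score is a user-supplied function (hypothetical here) that trains a
    network with the given layout and returns a performance measure.
    """
    return max(candidates, key=score)


candidates = candidate_architectures(11)
# Dummy score for demonstration only: prefers layouts close to 14 hidden nodes.
best = select_best(candidates, lambda layout: -abs(layout[1] - 14))
```

For 11 inputs this gives twelve candidate layouts, from 11-11-1 to 11-22-1.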

Development of Linear Multiple Regression Models
In developing the linear multiple regression (LMR) models, the spring discharge at time t, Q_t, is used as the criterion variable, and rainfall, evaporation, temperature, and past spring discharge are used as predictor variables. The LMR models can be represented as follows: [...] where the β_i represent the regression coefficients to be determined, the R_i represent rainfall, the E_i represent evaporation, the T_i represent temperature, the Q_i represent spring discharge, and t is the index representing time.
Model LMR-1: [...] (5)
Model LMR-2: [...]
Graphical representations of the developed LMR models, along with the corresponding scatter plots, are shown in Figures 6 and 7.
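As an illustration of how regression coefficients of this kind can be estimated, the sketch below fits a linear multiple regression by ordinary least squares using the normal equations. It is not the fitted LMR-1 or LMR-2 model: the two predictors (current rainfall and previous discharge), the data, and the generating coefficients are all made up for the example, and the function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_lmr(predictor_rows, targets):
    """Least-squares coefficients [b0, b1, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in predictor_rows]  # leading 1 = intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_lmr(coefficients, row):
    """Evaluate the fitted linear model for one row of predictors."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Made-up rows of [rainfall_t, discharge_(t-1)], with targets generated from
# Q_t = 0.5 + 0.2 * R_t + 0.7 * Q_(t-1) so the fit can be checked.
rows = [[0.0, 1.0], [2.0, 1.5], [5.0, 1.2], [1.0, 2.0], [3.0, 0.8]]
targets = [0.5 + 0.2 * r + 0.7 * q for r, q in rows]
coefficients = fit_lmr(rows, targets)
```

Because the synthetic targets are noise-free, the fitted coefficients recover the generating values; with measured data the same routine returns the least-squares estimates.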

Results and Discussion
It can be seen from Table 1 that the correlation coefficient (R) is very high, that is, more than 93%, for all the ANN models during the training, testing, and validation phases, except ANN1 and ANN6. There is no significant decrease in R values during validation when compared with the training phase. The best-performing models in terms of R were ANN2 among the one-hidden-layer models and ANN8 among the two-hidden-layer models. The R values for ANN2 are 0.990, 0.970, and 0.983 during training, testing, and validation, respectively. The increase in R from testing to validation indicates the good generalization capability of the ANN model.

Inputs  Architecture   Hidden layers  R (train)  DC (train)  R (test)  DC (test)  R (valid)  DC (valid)
...     ...            One            0.990      0.979       0.970     0.920      0.983      0.964
11      [11-14-1]      One            0.984      0.967       0.981     0.963      0.965      0.922
15      [15-15-1]      One            0.996      0.993       0.977     0.952      0.974      0.944
...     ...            Two            0.988      0.975       0.975     0.941      0.985      0.970
15      [15-11-14-1]   Two            0.998      0.996       0.979     0.948      0.979      0.950

Hence, ANN2 and ANN8 are the best-performing models for the study spring. Finally, model ANN8 was selected on the basis of overall performance for the spring discharge simulation with the quick-propagation algorithm.
It can also be seen from Tables 2 and 3 that the correlation coefficient (R) is very high, that is, around 95%, for most of the ANN models during the training, testing, and validation phases, except ANN1 and ANN6, in both the batch backpropagation algorithm and the Levenberg-Marquardt algorithm. There is no significant decrease in R values during validation when compared with the training phase. With the batch backpropagation algorithm, the best-performing models in terms of R were ANN5 among the one-hidden-layer models and ANN10 among the two-hidden-layer models; with the Levenberg-Marquardt algorithm, they were ANN3 among the one-hidden-layer models and ANN10 among the two-hidden-layer models. The R values of the batch backpropagation ANN models are 0.985, 0.958, and 0.989 for ANN5 and 0.986, 0.957, and 0.983 for ANN10 during training, testing, and validation, respectively. The R values of the Levenberg-Marquardt ANN models [...] (Levenberg-Marquardt algorithm) were selected on the basis of overall performance as the best models (Figures 4 and 5).
[Table row fragment: Two 0.985 0.967 0.970 0.935 0.168 0.934]
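The two statistics used to compare the models can be computed as in the sketch below, which assumes the usual definitions of the Pearson correlation coefficient (R) and the Nash-Sutcliffe efficiency (DC). The observed and estimated values are made up; the estimated series carries a constant bias to show how the two measures differ.

```python
import math


def correlation_coefficient(observed, estimated):
    """Pearson correlation coefficient R between two equal-length series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_o * var_e)


def nash_sutcliffe(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 - (squared error) / (variance of observed)."""
    mean_o = sum(observed) / len(observed)
    residual = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    spread = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / spread


observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.5, 2.5, 3.5, 4.5]  # made-up series with a constant bias of 0.5
r = correlation_coefficient(observed, estimated)
dc = nash_sutcliffe(observed, estimated)
```

In this example R remains at 1 because the two series move together exactly, while DC drops to 0.8 because it penalizes the bias. This is one reason for reporting both statistics side by side.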
The aim of the current study is to present a representative algorithm, with a particular ANN model, for the spring discharge simulation of the current spring. In this context, all the presented models from the three algorithms are representative for the spring discharge simulation, but the performance of model ANN8 with the quick-propagation algorithm is quite good among all the best presented models. Hence, the ANN8 model with the quick-propagation algorithm is finally selected for the simulation of spring discharge in the study. Comparative plots of the observed and estimated spring discharges, with their corresponding scatter plots, are presented for the best representative model for the study location during training, testing, and validation in Figures 3, 4, and 5. The figures show very little mismatch between the observed and estimated spring discharges during all three phases. The scatter plots also show good correlation and regression values and little shift from the ideal line. Table 4 compares the performance of the best five ANN models with that of the linear multiple regression (LMR) models. The performance indicators used throughout the study are the correlation coefficient (R) and the coefficient of efficiency (R²). In Tables 1, 2, and 3, the correlation coefficient of the ANN models varies from 90.0% to as high as 99.8% for the mixed data (training, testing, and validation), whereas the correlation coefficient of the LMR models varies from 97.0% to 97.6%. The coefficient of efficiency of the ANN models varies from 90.1% to as high as 99.8%, while that of the LMR models varies from 94.1% to 94.8%. From the comparison in Table 4, it is clear that the ANN models perform better than the LMR models for spring discharge.

Figure 2: Location of the study area.

Table 1: Selection of the best model from the batch backpropagation algorithm.

Table 2: Selection of the best model from the quick-propagation algorithm.

Table 3: Selection of the best model from the Levenberg-Marquardt algorithm.

Table 4: Comparison of the best ANN and LMR models.