Rainfall-runoff simulation in hydrology using artificial intelligence represents nonlinear relationships by means of neural networks. In this study, a hybrid network, presented as a feedforward modular neural network (FFMNN), was developed to predict the daily rainfall-runoff of the Roodan watershed in the southern part of Iran. The FFMNN has three layers: input, hidden, and output. The hidden layer contains two neural experts (modules). Hydrometeorological data of the catchment were collected over 21 years. A heuristic method was used to develop the MNN for exploring daily flow generalization. Two training algorithms, namely, backpropagation with momentum and Levenberg-Marquardt, were used. Sigmoid and linear transfer functions were employed to explore the network's optimum behavior. Cross-validation and predictive uncertainty assessments were carried out to guard against overfitting and overparameterization, respectively. Results showed that the FFMNN could satisfactorily predict stream flow during the testing period. The Nash-Sutcliffe coefficient, coefficient of determination, and root mean square error obtained using the MNN during the training and test periods were 0.85, 0.85, and 39.4 and 0.57, 0.58, and 32.2, respectively. The predictive uncertainties for the two periods were 0.39 and 0.44, respectively. Overall, the study showed that the FFMNN can give promising predictions of rainfall-runoff relations.
A hydrologic model can be categorized as (i) mathematical, (ii) physical, or (iii) analog. A physical model is a small-scale representation of a real phenomenon. An analog model uses the observation of one process to represent another, physically similar natural process. A mathematical model comprises an explicit series of numerical-logical steps and equations that transform numerical inputs into numerical outputs [
In this research, a feedforward modular neural network (FFMNN), a new generation of neural networks, is proposed for the development of rainfall-runoff modeling in the Roodan watershed. A short literature review of hybrid structures such as the modular neural network revealed some interesting studies. For instance, Wu [
This study, on the other hand, performs pattern recognition via a feedforward modular neural network (FFMNN) for the Roodan catchment, situated in the southern part of Iran. To the best of the authors' knowledge, no similar studies using an MNN have ever been done for this catchment. The MNN in this study was developed to simulate daily flow via training, cross-validation, and testing.
A feedforward modular neural network (FFMNN) is a special class of multilayer perceptron (MLP) and is often defined as an extension of the MLP. This also means that its hypotheses and training rules are the same as those of the MLP [
Generally, an MNN can learn pattern recognition and speed up training times. The general representation of the FFMNN is shown in Figure
General construction of feedforward modular network.
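The modular structure described above can be sketched in code. The following is a minimal, hypothetical NumPy illustration, not the authors' implementation: the hidden layer is split into two parallel expert modules whose outputs are concatenated and fed to a single linear output neuron; all layer sizes and initializations here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ModularFeedforward:
    """Toy FFMNN: one hidden layer split into two expert modules."""

    def __init__(self, n_in, n_mod1, n_mod2, n_out=1):
        # One weight matrix per hidden module, plus one for the output layer.
        self.W1 = rng.normal(0, 0.1, (n_in, n_mod1))
        self.b1 = np.zeros(n_mod1)
        self.W2 = rng.normal(0, 0.1, (n_in, n_mod2))
        self.b2 = np.zeros(n_mod2)
        self.Wo = rng.normal(0, 0.1, (n_mod1 + n_mod2, n_out))
        self.bo = np.zeros(n_out)

    def forward(self, x):
        h1 = sigmoid(x @ self.W1 + self.b1)    # expert module 1
        h2 = sigmoid(x @ self.W2 + self.b2)    # expert module 2
        h = np.concatenate([h1, h2], axis=-1)  # merge the two modules
        return h @ self.Wo + self.bo           # linear output neuron

net = ModularFeedforward(n_in=4, n_mod1=6, n_mod2=3)
y = net.forward(np.ones(4))
print(y.shape)  # (1,)
```

Splitting the hidden layer into modules lets each expert specialize on part of the input pattern while the output layer combines their responses.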
The study area, the Roodan watershed, is located in the southern part of Iran between the Hormozgan and Kerman provinces. The catchment area is 10570 km^{2}, lying between northern latitudes 26°57′ and 28°31′ and eastern longitudes 56°47′ and 57°54′ (Figure
Roodan watershed in south of Iran (a); satellite image for the reservoir of Esteghlal (Minab) dam (b).
Generally, an artificial neural network such as the FFMNN functions by learning the relationships among variables during training and then extending them to test conditions [
The modular feedforward network selected was first introduced by Deshmukh and Ghatol [
Applied transfer functions.
Transfer function  Description 

Sigmoid  Sigmoid (0/1) 
Linear sigmoid  Piecewise linear (0/1) 
Linear  Adds a bias and scales 
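The three transfer functions listed in the table can be written out as follows. This is a hedged sketch: the names follow the table, but the exact slope and offset of the piecewise-linear ("linear sigmoid") variant in the original modeling software are assumptions.

```python
import numpy as np

def sigmoid(x):
    """Smooth squashing of the input to the (0, 1) interval."""
    return 1.0 / (1.0 + np.exp(-x))

def linear_sigmoid(x):
    """Piecewise-linear approximation of the sigmoid, clipped to [0, 1]."""
    return np.clip(0.25 * x + 0.5, 0.0, 1.0)

def linear(x, scale=1.0, bias=0.0):
    """Adds a bias and scales the input (identity-style output unit)."""
    return scale * x + bias

print(sigmoid(0.0))         # 0.5
print(linear_sigmoid(0.0))  # 0.5
print(linear(2.0))          # 2.0
```

All three agree at the origin; they differ in how they saturate, which is why the study combines them when searching for the network's optimum behavior.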
Generally, the learning rule determines the relative significance of the input weights to a processing element. The most popular training algorithm is backpropagation with momentum, which is derived from the gradient descent rule [
Applied training algorithms.
Name  Description 

Backpropagation with momentum  Gradient and weight change 
Levenberg-Marquardt  Improved second-order gradient method
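The backpropagation-with-momentum update in the table can be illustrated with a minimal sketch, assuming the standard formulation: the new weight change blends the current gradient step with the previous change, which speeds training across flat regions of the error surface. The learning rate and momentum values below are illustrative, not the study's tuned values.

```python
def momentum_step(w, grad, prev_dw, lr=0.1, momentum=0.9):
    """One gradient-descent-with-momentum update for a single weight."""
    dw = -lr * grad + momentum * prev_dw  # gradient term + inertia term
    return w + dw, dw

# Toy quadratic error E(w) = w**2, so the gradient is dE/dw = 2*w.
w, dw = 5.0, 0.0
for _ in range(100):
    w, dw = momentum_step(w, 2.0 * w, dw)
print(abs(w) < 0.1)  # True: the weight has converged near the minimum at 0
```

Levenberg-Marquardt, by contrast, approximates second-order curvature from the Jacobian of the errors and typically converges in far fewer iterations, at a higher per-iteration cost.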
In neural networks, the data have to be standardized according to the training algorithm, and the data sets then need to be divided for training and testing. The data should be normalized (standardized) so that all variables receive equal consideration during the training stage. In this study, since sigmoid and linear functions were used, the data were scaled separately for the sigmoid to lie between 0 and 1; this method was suggested by Zadeh et al. [
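Scaling to [0, 1] to match the sigmoid output range is usually done with a min-max transform. A small sketch follows; the assumption (common practice, though the paper does not spell it out) is that the scaling bounds are fitted on the training set only and reused for the test set.

```python
import numpy as np

def fit_minmax(train):
    """Column-wise minimum and maximum, fitted on the training set only."""
    return train.min(axis=0), train.max(axis=0)

def scale(x, lo, hi):
    """Map values into [0, 1] using the fitted bounds."""
    return (x - lo) / (hi - lo)

def unscale(x, lo, hi):
    """Invert the scaling to recover flows in physical units."""
    return x * (hi - lo) + lo

train = np.array([[2.0], [6.0], [10.0]])
lo, hi = fit_minmax(train)
print(scale(train, lo, hi).ravel())  # [0.  0.5 1. ]
print(unscale(0.5, lo, hi))          # [6.]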
Three data sets were involved in the development procedure: the training set, the cross-validation set, and the test (validation) set. The training data set was first applied to train each FFMNN configuration. The cross-validation data set was then applied to decide when to stop training so as to prevent overfitting [
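The cross-validation stopping rule just described can be sketched as follows: training halts once the cross-validation error stops improving for a given number of epochs. The patience value is an illustrative choice, not one reported in the study.

```python
def early_stopping(cv_errors, patience=3):
    """Return the epoch at which training should have stopped."""
    best, best_epoch = float("inf"), 0
    for epoch, err in enumerate(cv_errors):
        if err < best:
            best, best_epoch = err, epoch       # still improving
        elif epoch - best_epoch >= patience:
            break                               # no improvement: stop
    return best_epoch

# CV error falls, then rises as the network starts to overfit.
print(early_stopping([0.9, 0.5, 0.3, 0.35, 0.4, 0.45]))  # 2
```

The weights saved at the returned epoch are then evaluated once on the held-out test set.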
Several architectures (topologies) were developed in this study to find the optimum generalization for the training algorithms, transfer functions, and number of neurons. In the first step, a series of configurations were tested by combining training algorithms and transfer functions in the hidden and output layers. Table
Developed configurations of MNNs for Roodan watershed.
Architecture  Hidden layer 1  Output layer  

Module  Transfer function  Training algorithm  Transfer function  Training algorithm  
1  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Sigmoid  *Levenberg-Marquardt or momentum
Module 2  Sigmoid
2  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Linear sigmoid  *Levenberg-Marquardt or momentum
Module 2  Sigmoid
3  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Linear  *Levenberg-Marquardt or momentum
Module 2  Sigmoid
4  Module 1  Linear sigmoid  *Levenberg-Marquardt or momentum  Sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear sigmoid
5  Module 1  Linear sigmoid  *Levenberg-Marquardt or momentum  Linear sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear sigmoid
6  Module 1  Linear sigmoid  *Levenberg-Marquardt or momentum  Linear  *Levenberg-Marquardt or momentum
Module 2  Linear sigmoid
7  Module 1  Linear  *Levenberg-Marquardt or momentum  Sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear
8  Module 1  Linear  *Levenberg-Marquardt or momentum  Linear sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear
9  Module 1  Linear  *Levenberg-Marquardt or momentum  Linear  *Levenberg-Marquardt or momentum
Module 2  Linear
10  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear sigmoid
11  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Linear sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear sigmoid
12  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Linear  *Levenberg-Marquardt or momentum
Module 2  Linear sigmoid
13  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear
14  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Linear sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear
15  Module 1  Sigmoid  *Levenberg-Marquardt or momentum  Linear  *Levenberg-Marquardt or momentum
Module 2  Linear
16  Module 1  Linear sigmoid  *Levenberg-Marquardt or momentum  Sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear
17  Module 1  Linear sigmoid  *Levenberg-Marquardt or momentum  Linear sigmoid  *Levenberg-Marquardt or momentum
Module 2  Linear
18  Module 1  Linear sigmoid  *Levenberg-Marquardt or momentum  Linear  *Levenberg-Marquardt or momentum
Module 2  Linear
The development of the FFMNN was initiated by first choosing the data sets for model learning and validation. This was followed by determining the input and output variables and scaling the data. After that, the network topology and the number of cells for the hidden layer were set. Then, the initial FFMNN was trained and tested to find the optimum results. This FFMNN had three layers: the input, hidden, and output layers. The hidden layer had two modules (neural experts).
The number of cells in the input layer corresponds to the dimension of the input data vector, and the same relationship holds between the output layer and its vector. Normally, one hidden layer is enough for rainfall-runoff modeling with ANNs. However, finding the optimal network architecture is a task that is highly dependent on the number of hidden layers. In this study, the layers were evaluated layer by layer using the heuristic approach, a common method suggested by Bowden et al. [
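The heuristic (trial-and-error) search described above can be sketched as a simple grid evaluation. This is an illustrative stand-in, not the authors' procedure: the error function here is a toy surface replacing what would actually be a full train-and-cross-validate cycle for each candidate topology, and the candidate ranges are assumptions.

```python
def heuristic_search(candidates_m1, candidates_m2, evaluate):
    """Try neuron counts for both hidden modules; keep the best topology."""
    best = (None, None, float("inf"))
    for n1 in candidates_m1:
        for n2 in candidates_m2:
            err = evaluate(n1, n2)  # would train + cross-validate one FFMNN
            if err < best[2]:
                best = (n1, n2, err)
    return best

# Toy error surface with its minimum near (38, 26), echoing the final topology.
toy = lambda n1, n2: (n1 - 38) ** 2 + (n2 - 26) ** 2
print(heuristic_search(range(10, 60, 4), range(10, 40, 4), toy))  # (38, 26, 0)
```

In practice each `evaluate` call is expensive, which is why the study also varied learning rate and momentum only after fixing promising topologies.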
By using the forward approach to combine the input variables, the FFMNN architectures were trained to capture the dynamic, complex, and nonlinear rainfall-runoff mechanism in the Roodan watershed, in harmony with the transfer functions and normalized data. At this stage, the FFMNN was developed by combining input data, transfer functions, and training algorithms in the hidden and output layers (Table
The accuracy of a hydrological model can be evaluated using many approaches, for example, the methods proposed by the World Meteorological Organization (WMO), which can be broadly divided into graphical evaluation and numerical assessment [
linear scale plot of the predicted and measured data,
double mass plots of the estimated and real data.
The numerical assessment can be carried out in two forms as well, that is, absolute goodness-of-fit and relative goodness-of-fit [
(a) The coefficient of determination is presented as
(b) The Nash-Sutcliffe (NS) coefficient is presented as
(c) The root mean square error (RMSE) is presented as
Generally, the predictive uncertainty (PU) of the ANN is assessed using the noise-to-signal ratio index [
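Since the equation bodies are not reproduced here, the indices above can be computed in their standard forms, which is an assumption: Nash-Sutcliffe efficiency, coefficient of determination, RMSE, and a noise-to-signal ratio (RMSE relative to the spread of the observations) as the predictive-uncertainty index.

```python
import numpy as np

def nash_sutcliffe(obs, sim):
    """1 minus the ratio of residual variance to observed variance."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, sim):
    """Root mean square error in the units of the flow data."""
    return np.sqrt(np.mean((obs - sim) ** 2))

def r_squared(obs, sim):
    """Squared linear correlation between observed and simulated flows."""
    return np.corrcoef(obs, sim)[0, 1] ** 2

def predictive_uncertainty(obs, sim):
    """Noise-to-signal ratio: RMSE relative to the observed spread."""
    return rmse(obs, sim) / np.std(obs)

obs = np.array([1.0, 3.0, 5.0, 7.0])
sim = np.array([1.5, 2.5, 5.5, 6.5])
print(round(nash_sutcliffe(obs, sim), 3))  # 0.95
print(rmse(obs, sim))                      # 0.5
```

An NS of 1 indicates a perfect fit, 0 means the model is no better than the observed mean, and a smaller PU indicates lower predictive uncertainty.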
The optimum FFMNN was found via a challenging heuristic method that considered (i) training of different topologies; (ii) combinations of input variables; (iii) increasing and decreasing the number of neurons in the hidden layer for both modules; and (iv) exploration of the learning rate and momentum term to achieve better generalization through trial and error. Figure
Training and cross-validation MSE curves for the FFMNN.
Table
Optimum architecture of FFMNN in Roodan watershed.
Input variable*  Hidden layer 1  Output layer  

Module  Transfer function  Learning rule  Transfer function  Learning rule  

Module (1): 38 cells  Sigmoid  Levenberg-Marquardt  Linear: 1 cell  Momentum
Module (2): 26 cells  Linear sigmoid 
In this study, suitable momentum and step-size (learning-rate) values were found through trial and error. The momentum term speeds up training across very flat regions of the error surface. An appropriate learning rate increases the possibility of escaping local minima and reaching the global minimum [
The NS coefficients for the calibration and validation periods were 85% and 57%, respectively, for the optimal FFMNN. The calibration period gave good performance, while that of the validation period was moderate, as defined by Tombul and Oĝul [
Examining indices for accuracy of FFMNN model.
Index  Calibration  Validation 

Nash and Sutcliffe coefficient % (NS)  85  57 
Coefficient of determination % (R^{2})  85  58
Root mean square error  39.4  32.2 
The daily stream flow in m^{3}/s (CMS) was evaluated through graphical visualization and statistical analysis to give good insight when reviewing observed and simulated daily stream flows. Figures
Measured and simulated stream flow (CMS) for training period.
Measured and simulated stream flow (CMS) for test period.
Figures
Cumulative daily stream flow m^{3}/s (CMS) for calibration period.
Cumulative daily stream flow m^{3}/s (CMS) for validation period.
Generally, the daily cumulative flow trends of the observed and simulated daily flows are similar for the training period, though there is a slight, consistent underestimation. In addition, there is an overestimation for the test period except in early 2005. The dissimilarities between observed and predicted flows for both periods may be due to the capability of the network and the significance of nonlinearity in rainfall-runoff relationships for large-scale watersheds with an arid climate.
The percentile absolute errors between observed and simulated flows are shown in Table
Percentile of absolute error value between observed and simulated flow (CMS).
Percentile %  5  10  25  50  75  90  95
Validation  0.75  0.8  1  1.27
Calibration  0.05  0.09  0.34  0.87  2  6.2  19.7
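A percentile table of absolute errors such as the one above is produced by sorting the absolute differences between observed and simulated flows and reading them off at the chosen percentile levels. The flow values below are illustrative, not the study's data.

```python
import numpy as np

# Hypothetical observed and simulated daily flows in CMS.
obs = np.array([10.0, 12.0, 30.0, 5.0, 44.0, 8.0, 15.0])
sim = np.array([11.0, 10.0, 33.0, 5.5, 40.0, 8.2, 14.0])

abs_err = np.abs(obs - sim)
levels = [5, 10, 25, 50, 75, 90, 95]
table = np.percentile(abs_err, levels)  # one value per percentile level
for p, v in zip(levels, table):
    print(f"{p:>2}th percentile: {v:.2f}")
```

Reading the 50th percentile (median) gives a robust summary of typical error, less sensitive to the flood peaks than the RMSE.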
The predictive uncertainty obtained for training and testing periods was 0.39 and 0.44, respectively (Figure
Uncertainty prediction of training and test period for FFMNN.
This study proposed a feedforward modular neural network for rainfall-runoff prediction in a large catchment with an arid climate. The FFMNN was developed through training, cross-validation, and testing. Levenberg-Marquardt and backpropagation with momentum were used as training algorithms. Sigmoid and linear transfer functions were applied to compute the neuron outputs. The developed FFMNN gave good and fair predictions for the training and test periods, respectively. The absolute errors at the 50th percentile in the test period were less than those in the training period. The predictive uncertainty obtained was satisfactory for both periods. To conclude, feedforward modular neural networks are promising as a new generation of neural networks for flow prediction.
The authors declare that there is no conflict of interests regarding the publication of this paper.
The authors appreciate the cooperation and help given by the Department of Hydraulic and Hydrology and Centre of Information and Communication Technology (CICT) of Universiti Teknologi Malaysia; consultant engineers of Ab Rah Saz Shargh Corporation in Iran; and the Regional Water Organization, Agricultural Organization, and Natural Resources Organization of the Hormozgan province, Iran.