Application of Metaheuristic Algorithms and ANN Model for Univariate Water Level Forecasting

With the rapid development of machine learning (ML) models


Introduction
The river water level (WL) is commonly measured to determine the volume of water flowing through the river. It is also used to understand flood or drought scenarios. As a result, a timely and accurate forecast of WL is critical for water resource planning and disaster risk reduction [1,2]. WL prediction is an essential responsibility of hydrologists, engineers, and relevant authorities in developing workable conceptual designs of water infrastructure and drought management measures and in assessing river/lake/reservoir behaviour for operational purposes [3,4].
Iraq, a country in an arid to semiarid environment, is particularly susceptible to the effects of global warming. In general, the country is suffering from a water shortage, which is anticipated to worsen in the future due to various issues, including climate change, the growth of the oil industry, urbanisation, and rapid population growth [5]. Iraq's primary freshwater sources are the Euphrates and Tigris Rivers. From 2009 to 2014, both rivers suffered from significant water scarcity. This trend is expected to worsen due to climate change and the policies of upstream nations such as Iran and Turkey, which have been building and operating new dams along the rivers' routes. In addition, after 2003, terrorism targeted several barrages and dams in Iraq, adversely affecting river water management [6].
Artificial intelligence (AI) models have proven to be reliable tools for capturing nonlinear patterns, so researchers have used them in various hydrological studies [7]. AI models have shown exceptional performance in forecasting hydrological factors in recent years. These models can handle a substantial quantity of data and can be applied to numerous climatic parameters and other hydrological boundary factors [8]. According to the published literature, multiple AI models have been built for water level modelling, including the artificial neural network (ANN) [9], adaptive neuro-fuzzy inference system (ANFIS) [10], support vector machine (SVM) [11], and random forests (RFs) [12]. The advantages and disadvantages of these main ML models have been covered for different topics in hydrology, such as drought [13] and water quality [14].
One of the most significant drawbacks of prior studies is that some rely on a trial-and-error procedure, which is time-consuming, while others do not improve the data through preprocessing.
Among the various AI approaches, ANN can be viewed as an effective technique for solving nonlinear problems and producing precise predictions [15,16]. However, ANN training may converge to a local rather than a global minimum, resulting in a suboptimal solution, or may fail to use the proper network structure or hyperparameters. To avoid these disadvantages, different approaches, such as metaheuristic algorithms, have been combined with the ANN model, and various hybrid models have been suggested [15]. Hybridisation has emerged as a promising technique for mitigating the apparent disadvantages of standalone approaches while also improving forecasting accuracy [17].
Numerous fields of engineering face the traditional optimisation challenge of finding the optimum solution within a wide and uncertain space. When an analytical solution is impractical or time-consuming, numerical techniques can be helpful. However, they cannot ensure a globally optimum outcome due to the high likelihood of settling for a local minimum. Many metaheuristic algorithms are inspired primarily by nature. In addition to creating new metaheuristic algorithms, hybridising algorithms is another tactic for enhancing algorithm performance [29]. Additionally, the no free lunch theorem (NFLT) [30] states that no universally applicable algorithm can efficiently handle every possible optimisation scenario. In other words, an optimisation method may perform very well on some problems but quite poorly on others. This has led the scientific community to propose numerous strategies for resolving optimisation challenges.
A variety of metaheuristic algorithms have been used in hydrology to estimate the hyperparameters of machine learning (ML) techniques instead of the trial-and-error approach [17]. One example is the slime mould algorithm (SMA), developed by Li et al. [31], which has been used to solve various optimisation problems, including optimal power flow problems [32] and water demand forecasting [33]. Also, the marine predator algorithm (MPA) was introduced by Faramarzi et al. [34]. It is a population-based metaheuristic algorithm [35], which has been applied to multiple optimisation topics, such as power resources [36] and estimation of greenhouse gas emissions [37]. Additionally, particle swarm optimisation (PSO) has been employed in multiple hydrology areas, for example, WL [38] and streamflow [39]. Moreover, the constriction coefficient-based particle swarm optimisation and chaotic gravitational search algorithm (CPSOCGSA) was proposed by Rather and Bala [40] and has been used in drought prediction [41]. The CPSOCGSA algorithm was proposed under the strategy of hybridising existing algorithms. In contrast, the MPA and SMA algorithms were developed as new nature-inspired optimisation methods.
In the same context, data preprocessing has also been emphasised in the literature to improve time series quality and find optimal predictors. In recent years, data cleansing has become increasingly crucial. This has led to the implementation of various signal pretreatment strategies to decrease the impact of noise in water level time series, for example, wavelet transform (WT) [42] and singular spectrum analysis (SSA) [15]. Another crucial part of data pretreatment is selecting the optimal input data for a univariate scenario, for example, using mutual information (MI) [43]. A nonlinear statistical dependence measure such as MI is appropriate for choosing inputs to ANN models [44].
Hajirahimi and Khashei [17] recently surveyed numerous hybridisations of hybrid structures for time series forecasting. Findings from this survey highlight the significance of pretreatment methods and optimisation algorithms in the hybridisation process. The hybridisation of hybrid models has been suggested as a novel idea to achieve high-accuracy prediction, where two or more combined classes are hybridised instead of using the usual individual prediction techniques. One of these procedures that has been used effectively is the hybridisation of parameter optimisation-based with preprocessing-based hybrid models (HOPH). The survey also noted that there is still room for improvement in data pretreatment approaches and optimisation algorithms. Accordingly, knowledge gaps and promising new research directions related to the hybridisation of hybrid models need to be investigated. Also, Mohammed et al. [18] reviewed the watershed-level prediction articles published from 2014 to 2021 and recommended employing all three data preprocessing approaches to improve original data quality and select the optimal predictors. They also recommended using SSA for denoising data and utilising the HOPH technique to forecast WL data, because it can optimise both the model and the data; there is still room for improvement in WL forecasting. This study, therefore, considers a novel hybrid methodology called HOPH to predict the WL of the Tigris River (upstream of the Al-Kut barrage) by utilising a set of preprocessing techniques, an ANN model, and metaheuristic algorithms. The significance of the study is that the Kut Barrage manages and provides freshwater from the Tigris River to the Dujaila and Al-Gharraf branches by raising the WL upstream of the barrage. These two branches deliver water that supports the growth and prosperity of several cities in the south of Iraq, which are already under the stress of a water shortage.
The following steps will be taken to accomplish these goals: (1) use the SSA approach to enhance the raw data quality and the MI method to choose the optimal predictor (lags) scenario during the preprocessing stage; (2) incorporate the MPA algorithm with the ANN technique to forecast monthly WL data; (3) evaluate the MPA-ANN algorithm's performance against the hybrid SMA-ANN, PSO-ANN, and CPSOCGSA-ANN; (4) use the HOPH strategy for estimating monthly water levels depending on multiple time lags; and (5) extend the predicting range and decrease the level of uncertainty in monthly water level simulation results by testing various recent optimisation algorithms (i.e., two recent algorithms and a hybridisation of two existing ones). To the authors' knowledge, this is the first time this hybrid technique has been employed to simulate the water level. It is also the first time that water levels upstream of the Al-Kut barrage have been predicted based on several lags. This is crucial for the local authority because it is responsible for managing and providing freshwater for all the cities in the south of Iraq, which are already under water stress.

Case Study and Data Used
Al-Kut city is the centre of Wasit Province, Iraq. It is located at an important site on the Tigris River. In terms of spatial location and relationships with nearby regions, two river branches (Al-Gharaf and Al-Dujaili) branch out close to the north of the city, as indicated in Figure 1. These two branches deliver freshwater to different cities in Wasit and Thi-Qar provinces for residential, irrigation, commercial, and industrial purposes [6].
Historical monthly water level time series (metre, m) of the Tigris River in Al-Kut city (upstream of the Al-Kut barrage) were collected from the Directorate of Water Resources in Wasit Province for the period 2011-2020. Figure 2 illustrates the monthly raw WL time series data.

Methodology
The suggested methodology for forecasting monthly WL data falls under four headings (Figure 3): (1) data preprocessing, (2) MPA algorithm, (3) ANN model, and (4) model performance criteria.
3.1. Data Preprocessing. According to Maier and Dandy [45], for an ANN model to function properly, the data must be pretreated into an appropriate format. These strategies ensure that every input receives equal attention in the training phase. This research conducts three stages: normalisation, cleaning, and selection of the optimal predictors. Tabachnick and Fidell [46] stated that univariate outliers could be mitigated by first transforming the variable and then changing the outliers' scores if any remain. To reduce multicollinearity between predictors, raw water level data were normalised using the natural logarithm method [46]. The cleaning approach aims to improve the value of the regression coefficient and reduce the error scale by treating the outliers and denoising the data [15]. This research used the box-whisker technique to identify outliers beyond the range of ±1.5 × IQR (IQR = Q3 − Q1, where Q3 is the third quartile and Q1 is the first quartile) [41]. This approach was carried out using the SPSS 24 statistical package. Signal preprocessing with the singular spectrum analysis (SSA) approach, which splits the raw dataset into multiple components, is one of the finest ways of denoising it.
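The normalisation and outlier-screening steps described above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study: the function names, the sample values, and the quartile interpolation rule (`method="inclusive"`) are assumptions made for illustration only.

```python
import math
import statistics

def log_normalise(levels):
    """Natural-log transform of strictly positive water levels."""
    return [math.log(v) for v in levels]

def iqr_outliers(values, k=1.5):
    """Box-whisker fences Q1 - k*IQR and Q3 + k*IQR, plus the values outside them."""
    # The quartile rule is an assumption; statistical packages differ slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers

# Hypothetical monthly levels in metres (illustrative values, not the study's data).
levels = [14.2, 14.5, 14.1, 14.8, 15.0, 14.6, 14.3, 14.7, 14.4, 19.9]
lower, upper, outliers = iqr_outliers(log_normalise(levels))
```

In this toy series only the last reading falls outside the fences; any such value would then be treated before denoising.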
SSA is a comparatively practical technique for decomposing raw data into several principal components (PCs). Each PC characterises some measure of variation in the original data, with the first component accounting for the most variance and the last for the least. Selecting the PCs that account for the most variance and disregarding those that account for the least is one way in which SSA can be used to reduce noise in data [47]. This method can analyse linear and nonlinear time series with long, medium, and short sample sizes. It does not require statistical assumptions such as error normality, linearity, or series stationarity [48]. More information about the SSA approach can be found in Golyandina and Zhigljavsky [49].
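As a rough illustration of the decomposition described above, the following Python sketch implements basic SSA (embedding, singular value decomposition, and diagonal averaging) with NumPy. The window length, the synthetic series, and the number of retained components are assumptions, not the settings used in this study.

```python
import numpy as np

def ssa_components(series, window):
    """Split a 1-D series into additive SSA components, one per singular value."""
    x = np.asarray(series, dtype=float)
    k = x.size - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: the mean of each anti-diagonal gives one time step.
        comp = [elem[::-1].diagonal(d).mean() for d in range(-window + 1, k)]
        components.append(comp)
    return np.array(components)

# Synthetic seasonal signal plus noise (illustrative, not the study's data).
t = np.arange(120)
rng = np.random.default_rng(0)
noisy = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(t.size)
comps = ssa_components(noisy, window=12)
denoised = comps[:2].sum(axis=0)   # keep the leading components, drop the rest
```

Summing all components recovers the original series, so the discarded components are exactly the part treated as noise.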
Choosing relevant predictors is a crucial part of building a prediction model structure [50]. This research applied the mutual information (MI) approach to determine the ideal explanatory variables (lags). MI assesses the statistical relationship between the lagged components and the time series. This method helps choose the lagged components most strongly related to the series, i.e., those with higher MI [15].
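One simple way to score candidate lags is a histogram estimate of MI between the series and each lagged copy. The sketch below assumes this estimator, a bin count of 8, and a synthetic series; the study does not state which MI estimator or settings were used.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information (in nats) between two samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def lagged_mi(series, max_lag, bins=8):
    """MI between the series and its own past for lags 1..max_lag."""
    x = np.asarray(series, dtype=float)
    return {lag: mutual_information(x[:-lag], x[lag:], bins)
            for lag in range(1, max_lag + 1)}

# Synthetic seasonal series (illustrative, not the study's data).
rng = np.random.default_rng(1)
t = np.arange(120)
series = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(t.size)
mi = lagged_mi(series, max_lag=6)
```

The resulting dictionary can be plotted against the lag to inspect where the MI curve first reaches a minimum.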

Overview of the Marine Predator Algorithm (MPA) for ANN Optimisation. Faramarzi et al. [34] suggested MPA as one of the most recent optimisation algorithms inspired by nature. The MPA algorithm was inspired by the motion of marine predators such as sunfish and sharks. MPA has been employed to solve different optimisation problems, including the economic emission dispatch [51] and the estimation of photovoltaic module parameters [52].

3.2.1. Step One: Prey's Population Initialisation. The MPA begins by establishing an initial random solution group $X_0$ in accordance with equation (1). This sample of individuals is produced at random inside the search domain:
$$X_0 = X_{lb} + r\,(X_{ub} - X_{lb}), \quad (1)$$
where $X_{lb}$ is the lower bound of the variables, $X_{ub}$ is the upper bound of the variables, and $r$ is a random vector with elements in the range (0-1).
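A minimal sketch of this initialisation step, assuming NumPy and uniform sampling between the bounds. The bounds shown (two hidden-layer sizes and a learning rate) are hypothetical search ranges chosen for illustration, not those used in the study.

```python
import numpy as np

def init_population(n_agents, lower, upper, rng):
    """Sample n_agents positions uniformly between the lower and upper bounds."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    r = rng.random((n_agents, lower.size))   # random values in [0, 1)
    return lower + r * (upper - lower)

rng = np.random.default_rng(42)
# Hypothetical bounds: neurons in hidden layers 1 and 2, and the learning rate.
lower, upper = [1, 1, 0.01], [20, 20, 1.0]
prey = init_population(10, lower, upper, rng)
```

When a dimension encodes a neuron count, the continuous value would need to be rounded before building the network; that rounding is also an assumption here.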

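3.2.2. Step Two: Predator Matrix Creation. Predators and prey are both regarded as search agents in the MPA because each searches for its own food. The elite is the top predator among the search agents, which is normally more skilled than the other search agents. The elite matrix is updated depending on information about prey locations. The elite and prey matrices are built as follows:
$$\mathrm{Elite} = \left[X^{I}_{i,j}\right]_{n \times d}, \qquad \mathrm{Prey} = \left[X_{i,j}\right]_{n \times d}, \quad (2)$$
where $X^{I}$ is the top predator vector, replicated $n$ times to form the elite matrix, $n$ is the number of search agents, and $d$ is the number of dimensions.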

3.2.3. Step Three: MPA Optimisation Process. After creating the prey and elite matrices, the prey and predator locations are updated in three stages. These stages are determined by the velocity ratio between the prey and the predator: the high-velocity ratio, the unit velocity ratio, and the low-velocity ratio.

3.2.4. Phase One: High-Velocity Ratio. The predator moves faster than the prey during this phase. Additionally, as shown in equations (5) and (6), prey movements have their step size altered. In these equations, R denotes a constant number and P is a random vector whose elements all lie between 0 and 1. Brownian motion is represented by the random vector $R_B$.
The ⊗ symbol represents element-wise multiplication.
This stage occurs during the first third of the total number of iterations (i.e., $t < \frac{1}{3} t_{max}$).
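The phase-one update can be sketched in Python as below. The sketch follows the standard MPA formulation of Faramarzi et al. [34], in which the step is $R_B \otimes (\mathrm{Elite} - R_B \otimes \mathrm{Prey})$ and the prey moves by a constant times a uniform random vector times that step. The constant value 0.5 and all function and variable names are assumptions.

```python
import numpy as np

def phase_one(prey, elite, rng, const=0.5):
    """High-velocity-ratio update: prey positions change through Brownian steps."""
    r_b = rng.standard_normal(prey.shape)   # Brownian motion vector R_B
    p = rng.random(prey.shape)              # uniform random vector in [0, 1)
    step = r_b * (elite - r_b * prey)       # element-wise products
    return prey + const * p * step

rng = np.random.default_rng(0)
prey = rng.random((6, 3))
elite = np.tile(prey[0], (6, 1))            # top predator replicated for every agent
updated = phase_one(prey, elite, rng)
```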

3.2.5. Phase Two: Unit Velocity Ratio. This phase is meant to mimic the hunt for food or prey. Lévy flight represents the movement of the prey, while Brownian motion represents the predator's movement. This phase occurs during the second third of all iterations (i.e., $\frac{1}{3} t_{max} < t < \frac{2}{3} t_{max}$). The first 50% of the population is updated with Lévy-based steps, where $R_L$ is a vector of random numbers drawn from the Lévy distribution. For the remaining fifty percent of the population, equations (5) and (6) are used, where CF is the parameter that controls the step size of the predator's movement.
3.2.6. Phase Three: Low-Velocity Ratio. This stage is the final one in the optimisation process and models the predator's motion when it is quicker than the prey. It occurs in the last third of all iterations (i.e., $t > \frac{2}{3} t_{max}$), as expressed in equation (7).
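Phases two and three both rely on Lévy-distributed steps and on the control factor CF. The sketch below follows the standard MPA formulation of Faramarzi et al. [34] rather than a confirmed transcription of this paper's equations. The Lévy generator (Mantegna's method with beta = 1.5), the constant 0.5, the form $CF = (1 - t/t_{max})^{2t/t_{max}}$, and all names are assumptions.

```python
import math
import numpy as np

def levy(shape, rng, beta=1.5):
    """Levy-distributed steps via Mantegna's method (beta = 1.5 is an assumption)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.normal(0.0, sigma, shape)
    v = rng.standard_normal(shape)
    return u / np.abs(v) ** (1 / beta)

def control_factor(t, t_max):
    """CF decreases from 1 at t = 0 to 0 at t = t_max."""
    return (1 - t / t_max) ** (2 * t / t_max)

def phase_two(prey, elite, t, t_max, rng, const=0.5):
    """Unit-velocity-ratio update: Levy steps for one half, Brownian for the other."""
    half = prey.shape[0] // 2
    new = np.empty_like(prey)
    # First half of the population: prey moves with Levy steps.
    r_l = levy(prey[:half].shape, rng)
    p = rng.random(prey[:half].shape)
    new[:half] = prey[:half] + const * p * r_l * (elite[:half] - r_l * prey[:half])
    # Second half: predator moves with Brownian steps, scaled by CF.
    r_b = rng.standard_normal(prey[half:].shape)
    cf = control_factor(t, t_max)
    new[half:] = elite[half:] + const * cf * r_b * (r_b * elite[half:] - prey[half:])
    return new

def phase_three(prey, elite, t, t_max, rng, const=0.5):
    """Low-velocity-ratio update: the predator moves with Levy steps around the elite."""
    r_l = levy(prey.shape, rng)
    cf = control_factor(t, t_max)
    return elite + const * cf * r_l * (r_l * elite - prey)
```

A full optimiser would call `phase_two` while $\frac{1}{3} t_{max} < t < \frac{2}{3} t_{max}$ and `phase_three` afterwards, re-evaluating the fitness after each update.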

3.2.7. Step Four: Eddy Formation and FADs. It is also possible to include environmental effects in the simulation, such as the fish aggregation device (FAD) and eddy formation. In the expression for the FAD's impact, r is a random value in the range (0-1), r1 and r2 are random indices of the prey matrix, FADs is the probability of the FAD's effect, and U is a binary vector.
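The FAD effect can be sketched as follows, again based on the standard MPA formulation [34] rather than a confirmed transcription of this paper's equation. The probability value 0.2, the thresholding rule used to build the binary vector U, and the function and variable names are assumptions.

```python
import numpy as np

def fads_effect(prey, lower, upper, cf, rng, fads=0.2):
    """Perturb prey positions to help the search escape local optima."""
    n, d = prey.shape
    r = rng.random()                                   # random value in [0, 1)
    if r <= fads:
        # Long jump inside the search domain, masked by the binary vector U.
        u = (rng.random((n, d)) < fads).astype(float)  # thresholding rule is an assumption
        jump = lower + rng.random((n, d)) * (upper - lower)
        return prey + cf * jump * u
    # Otherwise move along the difference of two randomly chosen prey (indices r1, r2).
    r1 = rng.permutation(n)
    r2 = rng.permutation(n)
    return prey + (fads * (1 - r) + r) * (prey[r1] - prey[r2])

rng = np.random.default_rng(3)
prey = rng.random((5, 2))
moved = fads_effect(prey, np.zeros(2), np.ones(2), cf=0.5, rng=rng)
```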

3.2.8. Step Five: Marine Memory. Marine predators remember successful foraging positions well. To simulate this, the MPA saves the fitness values of the solutions after every iteration and compares them with the fitness values from subsequent iterations.
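A minimal sketch of this memory step for a minimisation problem is given below: for each agent, the position with the lower fitness is retained. The function name and array layout are assumptions.

```python
import numpy as np

def apply_memory(old_pos, old_fit, new_pos, new_fit):
    """Per agent, keep whichever of the old or new position has the lower fitness."""
    keep_old = old_fit < new_fit
    pos = np.where(keep_old[:, None], old_pos, new_pos)
    fit = np.where(keep_old, old_fit, new_fit)
    return pos, fit

old_pos = np.array([[0.0, 0.0], [1.0, 1.0]])
new_pos = np.array([[9.0, 9.0], [2.0, 2.0]])
pos, fit = apply_memory(old_pos, np.array([1.0, 5.0]), new_pos, np.array([2.0, 3.0]))
```

Here the first agent keeps its old position (fitness 1.0 beats 2.0), while the second adopts the new one (3.0 beats 5.0).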

Artificial Neural Network (ANN).
ANN is one of the most common ML models and has been used successfully in various science and engineering applications. A major advantage of the ANN model is its capability to simulate nonlinear relationships. In recent years, the multilayer feedforward neural network (ML-FF-NN) has been shown to have strong predictive ability in a variety of hydrology-related domains. The ML-FF-NN architecture consists of at least three layers: the input, the hidden/middle, and the output [53-56]. Thomas et al. [57] explored whether utilising ML-FF-NN with two hidden layers improves generalisation compared to employing just one hidden layer. According to the study, two-hidden-layer networks generalised better in nine out of ten cases. Additionally, ANNs with two hidden layers have been shown to be effective in capturing the nonlinear relationship between estimated and actual values in several research studies, such as Zubaidi et al. [15], Tortajada et al. [58], and Farzad and El-Shafie [59]. The input layer receives all of the parameters the user enters. The calculations then take place in the hidden layers, and the final output vector is calculated at the output layer [60].
Accordingly, in this study, the ANN uses four layers (including two hidden layers). The nodes in the input layer correspond to the lagged water levels, and the output layer corresponds to the water level (target) (Figure 4).
The learning algorithm's primary function is to fine-tune the network's parameters, such as its weights and biases [61]. This adjustment ensures that the prediction errors remain within tolerable limits. For this reason, the fitness function is often referred to as the error signal and is expressed via the mean squared error (MSE) [60]. The Levenberg-Marquardt (LM) algorithm is often used for ML-FF-NN [50,62], and the linear and tan-sigmoid activation functions were adopted in the output and hidden layers, respectively.
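The network structure described above (two tan-sigmoid hidden layers, a linear output, and an MSE fitness) can be sketched as a forward pass in NumPy. Training with the Levenberg-Marquardt algorithm is not shown; the layer sizes, weight initialisation, and function names are assumptions for illustration.

```python
import numpy as np

def init_network(n_in, n_h1, n_h2, rng):
    """Small random weights and zero biases for a network with two hidden layers."""
    sizes = [n_in, n_h1, n_h2, 1]
    return [(0.1 * rng.standard_normal((a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Tan-sigmoid hidden layers followed by a linear output layer."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = np.tanh(x @ w1 + b1)
    h2 = np.tanh(h1 @ w2 + b2)
    return (h2 @ w3 + b3).ravel()

def mse(pred, target):
    """Mean squared error, used as the fitness (error) signal."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
params = init_network(n_in=4, n_h1=5, n_h2=3, rng=rng)   # hypothetical layer sizes
x = rng.random((7, 4))                                    # 7 samples, 4 lagged inputs
pred = forward(params, x)
```

In a hybrid scheme, a metaheuristic would propose the hidden-layer sizes and learning rate, and the MSE of the trained network would serve as the fitness of that proposal.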
In addition, ANN modelling involves a number of significant difficulties that require additional study [60], such as the number of neurons in each hidden layer and the learning rate coefficient. This research determined the ANN hyperparameters using recent metaheuristic algorithms (i.e., MPA, CPSOCGSA, PSO, and SMA).
The time series was divided into three sets: the training set (70%, or 82 data points), the testing set (15%, or 17 data points), and the validation set (15%, or 17 data points), following Zubaidi et al. [15].
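The split sizes quoted above follow from 116 samples (82 + 17 + 17). The sketch below reproduces those counts; the chronological ordering of the three sets and the rounding rule are assumptions, since the text gives only the proportions and counts.

```python
def chronological_split(samples, test_frac=0.15, val_frac=0.15):
    """Split samples in time order into training, testing, and validation sets."""
    n = len(samples)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    n_train = n - n_test - n_val          # the remainder goes to training
    train = samples[:n_train]
    test = samples[n_train:n_train + n_test]
    val = samples[n_train + n_test:]
    return train, test, val

samples = list(range(116))                # placeholder for 116 lagged samples
train, test, val = chronological_split(samples)
```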

Model Performance Criteria. Prediction errors are crucial for choosing the right models and for providing information that can be used to suggest changes to current models that will lower forecast deviations in the future [63]. The model performance evaluation metrics chosen include the mean absolute error (MAE), root mean squared error (RMSE), mean bias error (MBE), mean absolute relative error (MARE), coefficient of determination ($R^2$), and scatter index (SI) in equations (9)-(14), respectively. In addition, the residual analysis plot test and the Taylor diagram test were utilised. In these equations, N is the size of the data.
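The six criteria can be sketched in Python using their commonly used definitions. These definitions are assumptions rather than a transcription of equations (9)-(14): the bias is taken as forecast minus observation, $R^2$ as the squared Pearson correlation, and SI as RMSE divided by the mean observation.

```python
import math

def evaluate(observed, forecast):
    """Common definitions of MAE, RMSE, MBE, MARE, R2, and SI (assumed forms)."""
    n = len(observed)
    errors = [f - o for o, f in zip(observed, forecast)]
    o_mean = sum(observed) / n
    f_mean = sum(forecast) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    cov = sum((o - o_mean) * (f - f_mean) for o, f in zip(observed, forecast))
    var_o = sum((o - o_mean) ** 2 for o in observed)
    var_f = sum((f - f_mean) ** 2 for f in forecast)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": rmse,
        "MBE": sum(errors) / n,
        "MARE": sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n,
        "R2": cov ** 2 / (var_o * var_f),
        "SI": rmse / o_mean,
    }

# Illustrative values only: every forecast is 0.1 above the observation.
scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 3.1, 4.1])
```

With a constant offset like this, MAE, RMSE, and MBE all equal the offset, while $R^2$ stays at 1, which shows why bias and correlation measures are reported together.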

Results and Discussion
4.1. Input Data Analysis. First, the water level time series was normalised (the raw water level data were free of outliers and remained so after normalisation). Figure 5 illustrates the normalised WL dataset. The normalised time series was then decomposed using SSA into several components to obtain a noise-free time series. Figure 6 depicts the normalised data (top row), the improved time series data (second row), and the noise signals (third and fourth rows).
In addition, as shown in Figure 7, the MI approach was utilised to select the ideal input (lags) scenario for the forecasting model. According to the literature, the time lag is chosen at the first minimum of the average mutual information (AMI) [69]. Based on the AMI figure, four monthly lags ($\mathrm{Lag}_{t-1}$ to $\mathrm{Lag}_{t-4}$) of historical WL data were utilised to estimate future WL data.
Table 1 displays the correlation coefficients between the target (i.e., future WL) and the independent components (lags of WL) for the raw and pretreated data. The table shows that the data preprocessing approach considerably enhanced data quality, for example, raising the correlation coefficient for $\mathrm{Lag}_{t-1}$ from 0.648 to 0.938. The improvement comes from the preprocessing, in which normalisation reduces the variance and SSA removes noise. After that, the data were divided according to Section 3.3.

Application of Hybrid Algorithm-ANN Methods.
The MPA, SMA, CPSOCGSA, and PSO algorithms were utilised to improve the ANN technique by locating its optimum hyperparameters (using the MATLAB toolbox). All algorithms, using population sizes of 10, 20, 30, 40, and 50, searched for the optimum number of hidden nodes and the optimum learning rate coefficient of the ANN technique. To decrease uncertainty and increase the forecasting range, every swarm size was run 5 times; for example, see Figure 8 for the MPA-ANN technique. The fourth MPA-ANN run is the best for the swarm size of 10 since it has the lowest error. It was selected and combined with the best run for each of the other swarm sizes.
As shown in Figures 9(a)-9(d), a swarm size of 40 provides the optimal solution for the CPSOCGSA-ANN, SMA-ANN, and MPA-ANN algorithms, whereas a swarm size of 50 delivers the optimal solution for the PSO-ANN algorithm. Analysing the fitness function values for each algorithm in detail reveals that the MSE for the MPA-ANN algorithm is 0.005701 (after 70 iterations). In contrast, the SMA-ANN and CPSOCGSA-ANN algorithms did not improve beyond MSE values of 0.006308 and 0.005756, respectively. The PSO-ANN technique only achieves its best MSE of 0.005722 after 196 iterations. Although this research focuses on accuracy, run times for each algorithm were recorded. Run times differed between algorithms for each swarm size; for example, with a swarm size of 10, the MPA-ANN algorithm took about 12 minutes and 58 seconds, while the CPSOCGSA-ANN, SMA-ANN, and PSO-ANN algorithms took about 6 minutes and 37 seconds, 11 minutes and 54 seconds, and 9 minutes and 50 seconds, respectively. Based on the above results, Table 2 summarises the hyperparameters of the ANN techniques for the optimal swarm size of every optimisation algorithm.

Evaluating and Comparing the Techniques' Performance.
In accordance with the methodology of Tao et al. [2] and Ghorbani et al. [22], the hyperparameters in Table 2 were used to configure four ANN models. Each ANN method was run multiple times to identify the best network that consistently solves the problem. Five statistical metrics were used to evaluate the methods' efficacy (see Section 3.4 for more information). Table 3 presents the statistical criteria for every technique. The MPA-ANN, CPSOCGSA-ANN, and PSO-ANN approaches offered $R^2$ values equal to or greater than 0.85, which are good findings according to Dawson et al. [66]. In contrast, SMA-ANN yielded an $R^2$ of less than 0.85. The MPA-ANN model outperforms the other three models, with an $R^2$ of 0.94. Furthermore, MPA-ANN outperforms the other methods in the MAE, MARE, RMSE, and MBE tests; for example, the MBE values of MPA-ANN, PSO-ANN, SMA-ANN, and CPSOCGSA-ANN are 0.0006, 0.0062, 0.0111, and 0.0057, respectively. This table highlights that the MPA-ANN approach is the best one for predicting the WL time series in the validation phase. The level of agreement between observed and predicted behaviour is summarised graphically in the Taylor diagram (Figure 10), taking into consideration the root mean square error difference (RMSD), standard deviation (SD), and correlation coefficient (R) [2,70]. The reference point on the Taylor diagram's X-axis depicts the measured water level. A technique closer to the observed (reference) point is considered better. The diagram thus provides an effective evaluation of the comparative performance of various models. According to Figure 10, the MPA-ANN model achieved a better R and lower RMSD and SD relative to the measured point. The outcomes in Figure 10 confirm those of Table 3 and reveal the superiority of the MPA-ANN model in predicting the WL data.

Moreover, an error analysis was performed to check the prediction models' goodness of fit. Figure 11 shows the error scatterplots against the number of samples for WL data. Three important patterns can be inferred from the figure: (1) the average error for the MPA-ANN model was much nearer to zero than for the other models, (2) the error distribution does not follow any noticeable trend, and (3) the error density follows a regular distribution across all data. Additionally, with MPA-ANN, the margin of error was only (−0.009 to 0.013) m, compared with (−0.005 to 0.016) m, (−0.003 to 0.023) m, and (−0.013 to 0.020) m for the CPSOCGSA-ANN, SMA-ANN, and PSO-ANN models, respectively. The result of the error analysis is thus consistent with the previous results.
Overall, the MPA-ANN model outperforms the other hybrid models. Accordingly, SI is employed to check this model's efficiency and durability. According to the boundaries in Section 3.4, the MPA-ANN has excellent results with SI = 0.002. Additionally, the MPA-ANN model was further supported by the residual analysis. The findings illustrate that the residuals of the MPA-ANN model are normally distributed based on the significance values of the Shapiro-Wilk and Kolmogorov-Smirnov tests. The data are normally distributed when the significance (p value) > 0.05 [71].
Faramarzi et al. [34] designed the MPA to reach the global solution by incorporating multiple methods and strategies during the optimisation process. Different foraging strategies in the biological relationship between predators and prey had a major influence on the design of MPA. As a result, the Brownian and Lévy flight (LF) distributions were adopted to exhibit a balanced explorer-exploiter tendency and significantly enhance search ability. This allowed the MPA approach to accurately identify the global optimum solutions to the optimisation problems investigated here. The outcomes of the present research support the hypothesis of the HOPH technique. They are also consistent with the previous literature in hydrological fields, such as Tikhamarine et al. [72] and Wang et al. [73].
These are the most notable results of this study: (1) The findings emphasise the potential utility of SSA in enhancing raw data quality by removing the noise from the time series and of the mutual information technique in determining the ideal model input scenario. (2) MPA has proven to be a trustworthy algorithm when combined with the ANN method for estimating WL data compared with the CPSOCGSA, PSO, and SMA algorithms. (3) Multiple statistical criteria analyses (i.e., MAE, RMSE, $R^2$, MBE, graphical tests, and residual analysis) showed that the proposed methodology (i.e., HOPH) accurately predicted the WL time series. (4) Four metaheuristic algorithms were integrated with the ANN model; each algorithm was run with five swarm sizes, and each swarm size was repeated five times, leading to an increase in the forecasting range and lower uncertainty. (5) The research findings offer valuable scientific information that helps decision-makers forecast WL with low uncertainty.

Conclusion
This study uses a novel hybrid model combining data pretreatment with a recent optimisation algorithm (MPA) and an ANN model to simulate the monthly WL of the Tigris River at Al-Kut city. The ability of the MPA algorithm to enhance ANN model performance was compared with that of the CPSOCGSA, SMA, and PSO algorithms. Figure 3 summarises this methodology, and the main conclusions are as follows: (i) The results show that data pretreatment techniques successfully enhanced data quality (i.e., denoising the time series by SSA) and chose the optimal model input scenario (i.e., selecting lags by MI). (ii) Based on several statistical criteria, MPA-ANN tends to be superior to the other hybrid techniques. Generally, the proposed methodology provides good to excellent performance in forecasting monthly WL. (iii) These outcomes can provide local government stakeholders and managers with valuable information that can improve the water sector's irrigation system administration, service, and resource management. (iv) For future work, it is advised to conduct more studies on HOPH prediction models (especially the ANN model) because preprocessing approaches and the specification of the hyperparameters of soft computing models have much room for improvement.

Figure 3: Flowchart depicting the steps required to simulate future WL data.

Figure 4: Structure of the ANN.

Figure 5: Normalised and cleaned monthly WL data.

Figure 6: Time series of water levels after normalisation and the three SSA components.

Figure 8: Performance of the MPA-ANN method over 5 trials for every swarm size.

Figure 11: Error distributions for the suggested hybrid techniques.
The observed and simulated WL data are $O_i$ and $F_i$, respectively. The average value of the observed WL data is $\bar{O}$, and the mean value of the estimated WL data is $\bar{F}$.

Table 1: The coefficient of correlation between WL and lags of WL. Correlation is significant at the 0.01 level (2-tailed).

Table 2: Hyperparameters of the suggested hybrid models. $N_1$ and $N_2$ are the numbers of neurons in the first and second hidden layers, respectively, and Lr is the learning rate.

Table 3: Performance evaluation for the validation data phase.