Application of Extreme Learning Machine Algorithm for Drought Forecasting

Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan; Department of Statistics, Federal Urdu University of Arts, Science and Technology Islamabad, Islamabad, Pakistan; Department of Mathematics, College of Sciences and Arts (Muhyil), King Khalid University, Muhyil 61421, Saudi Arabia; Department of Mathematics and Computer, College of Sciences, Ibb University, Ibb 70270, Yemen; College of Statistical and Actuarial Sciences, University of the Punjab, Lahore, Pakistan; Mathematics Department, College of Humanities and Science, Prince Sattam Bin Abdulaziz University, Al Aflaj, Saudi Arabia; Administration Department, Administrative Science College, Thamar University, Thamar, Yemen


Introduction
Drought is a recurrent natural climatic phenomenon that occurs in most parts of the world. It arises from a lack of precipitation over an extended period of time in a particular region [1,2]. Like other natural hazards, drought arises from climatic fluctuations, but it evolves gradually and is sometimes described as a creeping phenomenon [3,4]. Generally, the [...] or end [9,10]. Prolonged droughts adversely impact the economic, agricultural, and social sectors. These massive drought impacts are due to sudden and widespread climate changes [11]. Drought can lead to devastating economic effects, with worldwide losses of around $9 billion per annum; the US livestock industry faced a $400 million loss during a severe drought in 2002 [12]. A comprehensive early warning system for drought is necessary to reduce its devastating impact. However, only a few studies have been conducted to mitigate this stochastic natural hazard [13][14][15]. Recently, numerous drought indices have been developed to identify and monitor droughts and introduce mitigation policies [16][17][18][19]. Reference [20] proposed a new drought indicator, the Normalized Ecosystem Drought Index (NEDI), to observe dryness conditions in the pattern of a transitional ecosystem. It is expected that dryness conditions can be quantified better by using NEDI.
Numerically expressed drought indices are more understandable than raw rainfall data [1,2,21]. Drought indices are a valuable tool for detecting the onset and termination of drought, which is necessary for recovery planning, mitigation, and decision-making [22,23]. Drought indices aim to quantify how drought conditions evolve and to classify the severity of drought events. These indices have made it easier to model droughts using stochastic time series, neural network algorithms, and water balance models. The most commonly used drought indices are the Palmer Drought Severity Index (PDSI) [24], Surface Water Supply Index (SWSI) [25], Standardized Precipitation Index (SPI) [26], Effective Drought Index (EDI), and Standardized Precipitation Evapotranspiration Index (SPEI) [27]. Ali et al. [28] proposed a multiscalar drought index named the Standardized Precipitation Temperature Index (SPTI). These drought indices are calculated using different meteorological variables [29] and have been used to characterize, estimate, and forecast drought conditions. The current long-range drought forecasts have minimal reliability [30]. Existing conventional stochastic models are inadequate for accurate drought predictions [31]. Recently developed machine learning (ML) models have extensive applications in climatology, including the Naïve Bayes classifier, Bayesian networks [32], support vector machine (SVM), wavelet gene expression programming [33], maximum entropy, and artificial neural networks (ANNs). Several studies have affirmed that ML models perform comparatively better than conventional stochastic and dynamic models for drought estimation [34,35]. ANN models are loosely inspired by the human brain and can be classified according to their neuron structure, number of hidden layers, and activation functions. Many researchers have successfully applied multilayer perceptron (MLP) neural networks for drought estimation and forecasting [36,37].
The MLP is capable of accurately forecasting soil temperature in semi-humid and arid regions [38]. Aghelpour et al. [39] improved agricultural drought modeling by coupling the dragonfly optimization algorithm with SVM. Furthermore, [40] efficiently modeled RDI using hybrid support vector regression (SVR) coupled with the firefly algorithm (FA), whale optimization algorithm (WOA), and wavelet analysis (WA). The results showed that hybrid and coupled SVR techniques improved drought forecasting. Although ML models have an outstanding reputation in estimation, prediction, and forecasting, many have slow computing times [41]. Among the class of ANN algorithms, the extreme learning machine (ELM) is widely used in various fields and has gained popularity in climatology and engineering [42][43][44][45][46][47]. Mouatadid and Adamowski [48] efficiently forecasted urban water demand for Montreal (Canada) using ELM.

Data and Study Area.
The application of this research is based on nine meteorological stations scattered around Pakistan.
The topographic map of the study region and the distribution of the selected meteorological stations are shown in Figure 1. The study area is situated in the southeastern part of Asia and lies between 23.8° and 37°N latitude and 60.9° and 75.37°E longitude. The region is classified into clusters comprising different meteorological stations with diverse spatial characteristics [61]. Hence, these meteorological stations were selected to cover the maximum climatic variability. In addition, the study area encompasses five major river basins: the Ravi, Chenab, Sutlej, Jhelum, and Indus. These rivers are the backbone of the country's agriculture industry and hydropower projects.
For this research, time series data of monthly precipitation and minimum and maximum air temperatures were collected from the Karachi Data Processing Center (KDPC) through the Pakistan Meteorological Department (PMD). The data span January 1951 to December 2016. The full-length data were split into two parts: January 1951 to December 2013 is the training data set, and the remaining three years, January 2014 to December 2016, are the test data set. Climatological forecasts need to be accurate for future hazard mitigation, and long-range climatological forecasts compromise accuracy. Here, the errors and irregularities were detected and removed by the KDPC itself. Additionally, missing data were filled by generating values using cumulative distributions over lead periods.
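The chronological split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `month_labels` and `split_by_year` are hypothetical, and the series is a placeholder standing in for a station record. Only the date ranges come from the text.

```python
# Hypothetical helpers; only the date ranges (1951-2013 train, 2014-2016 test)
# are taken from the text.

def month_labels(start_year, end_year):
    """Return (year, month) labels from January start_year to December end_year."""
    return [(y, m) for y in range(start_year, end_year + 1) for m in range(1, 13)]


def split_by_year(labels, values, last_train_year):
    """Chronological split: months up to December last_train_year form the training set."""
    train = [v for (y, _), v in zip(labels, values) if y <= last_train_year]
    test = [v for (y, _), v in zip(labels, values) if y > last_train_year]
    return train, test


labels = month_labels(1951, 2016)
values = list(range(len(labels)))  # placeholder series, one value per month
train, test = split_by_year(labels, values, last_train_year=2013)
print(len(labels), len(train), len(test))
```

Splitting by date rather than at random keeps the test months strictly after the training months, which is what an out-of-sample forecast evaluation requires.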

Standardized Precipitation Temperature Index (SPTI).
The Standardized Drought Indices (SDIs) have extensive applications in drought monitoring. SDIs are standardized and spatially invariant tools for monitoring and assessing drought characteristics. In the literature, various authors have offered numerous methods for SDIs. Examples include the Standardized Precipitation Index (SPI) [26], Standardized Precipitation Evapotranspiration Index (SPEI) [27], and Standardized Precipitation Temperature Index (SPTI) [28]. Precipitation and temperature are two essential climatological indicators that reveal the vital dynamics of climate and hydrology. Therefore, a standardized drought index based on these two meteorological variables is more beneficial for drought monitoring and forecasting, and the SPTI has been chosen as the SDI for monitoring and forecasting drought. The mathematical calculation of SPTI is quite similar to that of SPI; a more detailed discussion can be found in [41]. SPTI is a multiscalar drought index and can be calculated for different time scales. Negative and positive values of the index indicate drought and wet conditions, respectively. These drought conditions are classified in Table 1 [62,63]. SPTI is a modified form of the De Martonne Aridity Index (DAI) (de Martonne, 1926). The mathematical properties of SPTI are very similar to those of SPI, an extensively used index for drought prediction in many parts of the world. For SPTI, we first need to calculate DAI based on the monthly total precipitation and average monthly temperature. The next step is to fit an appropriate distribution to calculate a cumulative probability for standardization. Many researchers have used the Gamma distribution for standardization.
The index values depend on the fitted distribution, and no single distribution is appropriate for all stations and all time scales. Therefore, 32 candidate distributions have been fitted to DAI at different time scales. The Bayesian Information Criterion (BIC) has been used as the criterion to assess the appropriateness of each distribution, with lower values indicating a better fit.
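The three steps described above (compute DAI, select a distribution by BIC, standardize through the fitted CDF) can be illustrated with a small sketch. Several things here are assumptions rather than the authors' procedure: DAI is taken as P / (T + 10), only two candidates (normal and lognormal) stand in for the 32 distributions, the data are made up, and all function names are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev


def dai(precip_mm, temp_c):
    """Aridity ratio P / (T + 10); assumes temperature above -10 degrees C."""
    return precip_mm / (temp_c + 10.0)


def fit_normal(x):
    """Maximum likelihood normal fit: returns (cdf, log-likelihood, n_params)."""
    dist = NormalDist(fmean(x), pstdev(x))
    loglik = sum(math.log(dist.pdf(v)) for v in x)
    return dist.cdf, loglik, 2


def fit_lognormal(x):
    """Maximum likelihood lognormal fit; requires strictly positive data."""
    logs = [math.log(v) for v in x]
    dist = NormalDist(fmean(logs), pstdev(logs))
    # The density of x is the normal density of log(x) divided by x.
    loglik = sum(math.log(dist.pdf(lv)) - lv for lv in logs)
    return (lambda v: dist.cdf(math.log(v))), loglik, 2


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better fit."""
    return n_params * math.log(n_obs) - 2.0 * loglik


def standardize(x, candidates):
    """Pick the lowest-BIC candidate and map its CDF values to standard normal scores."""
    fits = {name: fit(x) for name, fit in candidates.items()}
    scores = {name: bic(ll, k, len(x)) for name, (_, ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    cdf = fits[best][0]
    z = NormalDist()
    # Clamp probabilities away from 0 and 1 so the inverse CDF stays finite.
    index = [z.inv_cdf(min(max(cdf(v), 1e-6), 1 - 1e-6)) for v in x]
    return best, scores, index


# Made-up monthly precipitation (mm) and mean temperature (degrees C).
precip = [12.0, 30.5, 48.0, 95.2, 20.1, 8.4, 150.3, 60.0, 25.7, 15.9, 40.2, 70.8]
temp = [4.0, 6.5, 11.0, 16.2, 21.0, 25.5, 27.0, 26.1, 22.4, 16.0, 9.8, 5.2]
series = [dai(p, t) for p, t in zip(precip, temp)]
candidates = {"normal": fit_normal, "lognormal": fit_lognormal}
best, scores, spti_like = standardize(series, candidates)
print(best, [round(v, 2) for v in spti_like])
```

Months with zero precipitation give a DAI of zero, which the lognormal candidate cannot handle; a real workflow would need distributions or a zero-handling rule suited to such months.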

Candidate Algorithms.
An artificial neural network (ANN) is a computational paradigm. It is a data-driven technique, introduced in the 1950s, in which information passes through a biologically inspired structure of neurons arranged in multiple layers. Unlike other stochastic models, it does not impose any constraints on the input variables used to train the model. These algorithms learn from existing relationships among the observations of input and auxiliary variables. ANNs can manage high-dimensional and high-frequency complex datasets [58]. ANN algorithms have broad applications in mathematics, engineering, medicine, economics, neurology, and hydrology [64][65][66][67]. Kuligowski and Barros [68] claimed that weather prediction could be improved using ANN algorithms.
This class of algorithms can be helpful in climatology for forecasting natural hazards like drought. The multilayer perceptron (MLP) is a widely used, fully connected feedforward artificial neural network. It consists of layers of multiple nodes: an input layer, one or more hidden layers (usually two hidden layers with multiple hidden nodes) with a nonlinear activation function, and an output layer. The neuron structure and estimation accuracy of MLP make it prominent among ANN algorithms. Error backpropagation is a supervised learning technique used to train MLP. Another algorithm used for drought prediction is the stochastic ARIMA process [69].

Seasonal Autoregressive and Integrated Moving Average Model (SARIMA).
Yule [70] pioneered autoregressive (AR) models, in which the time series being analyzed is a linear function of its previous lagged values. Slutzky [71] modeled time series as a function of past residual terms, known as the moving average (MA) model. Wold [72] merged the AR and MA specifications into the generalized ARMA specification, which can model all stationary time series by choosing the appropriate order of the "AR" and "MA" terms. Time series data generally have trends (are non-stationary). A non-stationary time series can be made stationary by appropriate differencing. A series that is transformed from non-stationary to stationary by differencing is known as an integrated series. ARIMA modeling follows a systematic approach of identification, estimation, and diagnostic checking to reach an appropriate model. Many hydrologic and meteorological time series have inherent seasonal components [73]. Such data can be efficiently modeled with the seasonal ARIMA model, which requires only a few parameters to be estimated [74]. The seasonal ARIMA model is described as ARIMA(p, d, q)(P, D, Q)s, where (p, d, q) is the non-seasonal component of the ARIMA specification and (P, D, Q)s is the seasonal component. In the general seasonal ARIMA specification, "p" is the order of the non-seasonal autoregressive terms and "q" is the number of non-seasonal MA terms to be included in the model. Similarly, "P" is the number of seasonal autoregressive terms and "Q" is the number of seasonal MA terms; ∇^d is the non-seasonal difference operator applied "d" times to make the series stationary, ∇_s^D is the seasonal difference operator applied "D" times to obtain an integrated stationary series, and "s" is the length of the season. The mathematical details of ARIMA specifications can be found in [75].
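For reference, one common textbook way of writing the multiplicative seasonal ARIMA model in backshift notation is shown below. It is offered as a sketch consistent with the symbols defined above, not as a transcription of the authors' equation; the backshift operator B, the coefficient symbols, the series y_t, the white-noise term ε_t, and the sign convention of the MA polynomials are all notational assumptions.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,\nabla^{d}\,\nabla_{s}^{D}\,y_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t,
\qquad \nabla = 1 - B, \quad \nabla_{s} = 1 - B^{s},

\phi_p(B) = 1 - \sum_{i=1}^{p} \phi_i B^{i}, \quad
\Phi_P(B^{s}) = 1 - \sum_{i=1}^{P} \Phi_i B^{is}, \quad
\theta_q(B) = 1 - \sum_{j=1}^{q} \theta_j B^{j}, \quad
\Theta_Q(B^{s}) = 1 - \sum_{j=1}^{Q} \Theta_j B^{js}.
```

With monthly data, s = 12, so the seasonal operators relate each month to the same month in previous years.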
The development of an ARIMA specification involves identification, estimation, and diagnostic checking. By following these steps, a parsimonious ARIMA specification can be selected for estimating and forecasting a time series.
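To make the role of the non-seasonal and seasonal difference operators concrete, the following sketch applies them to a toy monthly series with a linear trend and a fixed 12-month pattern. The `difference` helper and the toy data are illustrative assumptions, not part of the study.

```python
def difference(series, lag=1, times=1):
    """Apply the lag-`lag` difference operator `times` times."""
    out = list(series)
    for _ in range(times):
        out = [out[i] - out[i - lag] for i in range(lag, len(out))]
    return out


# Toy monthly series: linear trend plus a repeating 12-month pattern.
season = [0, 1, 3, 6, 8, 9, 8, 6, 3, 1, 0, -1]
y = [0.5 * t + season[t % 12] for t in range(48)]

d_seasonal = difference(y, lag=12, times=1)      # seasonal differencing, D = 1, s = 12
d_both = difference(d_seasonal, lag=1, times=1)  # then non-seasonal differencing, d = 1

print(d_seasonal[:3], d_both[:3])
```

Seasonal differencing removes the repeating pattern and leaves only the yearly increment of the trend; the additional first difference removes what remains, leaving a stationary series. Each differencing pass shortens the series by its lag.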

Extreme Learning Machine (ELM).
The extreme learning machine (ELM) is a modern single hidden layer feedforward neural network (SLFN) algorithm proposed by [76]. ELM operates in a similar manner to feedforward back-propagation ANN (FFBP-ANN) and least-squares support vector regression (LSSVR) models. It has shown its candidacy among the ANN algorithms for solving complex linear and nonlinear regression problems. It contains a single hidden layer of multiple hidden nodes. Most ANN-based methods have specific limitations such as slow computation, many learning epochs, larger biases, and many tuning parameters (weights). ELM was developed to overcome these weaknesses and has gained popularity in the class of ANN algorithms [77].
Studies have revealed that even with randomly generated hidden node weights, ELM retains the universal approximation capability of SLFNs [46,47,78]. In ELM, the input weights are assigned randomly, and the output weights are obtained by the least-squares method using a generalized inverse [76]. If the hidden node input weights and biases are chosen randomly, an SLFN can be considered a linear system. The output weights, which connect the hidden layer to the output layer, are determined analytically through the generalized inverse of the hidden layer output matrix. ELM attains optimal generalization performance as long as the chosen number of hidden nodes is sufficiently high. In our simulation of ELM with the sigmoid activation function, the number of hidden nodes is selected automatically to attain optimal prediction and forecast performance.
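The idea that random hidden weights turn training into a linear least-squares problem can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the class `TinyELM` and its parameters are hypothetical, the output weights are obtained from ridge-regularized normal equations as a numerically convenient stand-in for the generalized inverse, and the target is a toy sine curve.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class TinyELM:
    """Single hidden layer network: random fixed hidden weights, solved output weights."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-6, seed=0):
        rng = random.Random(seed)
        # Input weights and biases are drawn once and never trained.
        self.w = [[rng.uniform(-4, 4) for _ in range(n_inputs)] for _ in range(n_hidden)]
        self.b = [rng.uniform(-4, 4) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = None

    def hidden(self, x):
        h = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
             for w, b in zip(self.w, self.b)]
        return h + [1.0]  # constant column acts as an output bias

    def fit(self, X, y):
        H = [self.hidden(x) for x in X]
        k = len(H[0])
        # Normal equations (H'H + ridge * I) beta = H'y.
        hth = [[sum(row[i] * row[j] for row in H) + (self.ridge if i == j else 0.0)
                for j in range(k)] for i in range(k)]
        hty = [sum(row[i] * t for row, t in zip(H, y)) for i in range(k)]
        self.beta = solve(hth, hty)
        return self

    def predict(self, X):
        return [sum(bj * hj for bj, hj in zip(self.beta, self.hidden(x))) for x in X]


X = [[i / 49.0] for i in range(50)]
y = [math.sin(2.0 * math.pi * x[0]) for x in X]
model = TinyELM(n_inputs=1, n_hidden=12).fit(X, y)
fitted = model.predict(X)
rmse_elm = math.sqrt(sum((f - t) ** 2 for f, t in zip(fitted, y)) / len(y))
mean_y = sum(y) / len(y)
rmse_mean = math.sqrt(sum((mean_y - t) ** 2 for t in y) / len(y))
print(round(rmse_elm, 4), round(rmse_mean, 4))
```

No iterative weight updates occur anywhere in `fit`; the only training step is a single linear solve, which is where the speed advantage over backpropagation comes from.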

Model Evaluation Metrics.
Assessing model performance requires quantifying the link between observed and predicted hydrological patterns. The most basic assessment is a visual inspection of the observed and predicted or forecasted time series. For the quantitative evaluation of algorithms, around 20 performance metrics are available for selecting a hydrological model [82]. It has been observed that the choice of an appropriate model changes significantly if precision-based metrics are used instead of error-based metrics [83]. Numerous accuracy criteria have been developed, but each has inherent pros and cons, and no single metric is universally accepted as a threshold [84]. In this study, several error-based performance metrics have been used for cross-validation, including root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The Kling-Gupta efficiency (KGE) and Willmott index of agreement (WI) are also well suited to assessing the performance of stochastic, machine learning, and hydrologic models [85,86]. Another way to assess an algorithm's prediction performance is to calculate the simple correlation coefficient between observed and predicted values of SPTI as a closeness measure. The same performance metrics are used to assess the forecast ability of the candidate algorithms. The RMSE measures the deviation of the estimated or predicted values "D̂" from the actual or observed values "D" of the drought index, computed over "T" predictions. Since RMSE is strongly affected by outliers, a measure that is more robust to extreme values is also needed. Mean absolute error (MAE) is less influenced by extreme values than RMSE [87]. Equation (7) describes the mathematical structure of the MAE.
Another accuracy measure is the mean absolute percentage error (MAPE), a unit-free tool for assessing an algorithm's prediction and forecast ability. Unlike the other performance metrics, MAPE is scale-independent. These performance metrics are extensively used in climatology. The Kling-Gupta efficiency index was developed to assess model performance by comparing estimated and observed time series data [88].
In the KGE index, "r," α, and β denote the correlation coefficient, the standard deviation ratio, and the ratio of means of the observed and predicted values of SPTI, respectively.
Willmott [89] proposed the Willmott index of agreement (WI) as a standardized measure of the degree of model prediction error.
A model with the minimum values of RMSE, MAE, and MAPE, the maximum value of the KGE index, and a WI value close to "1" will be selected and proposed as an adequate algorithm for estimating existing drought phenomena and forecasting future drought episodes.
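The metrics named above can be written compactly as follows. The formulas are the commonly used definitions (KGE as one minus the Euclidean distance of r, α, and β from one; WI as in Willmott's original index of agreement) and are assumed here rather than copied from the paper's equations; the function names and the four-point example are illustrative only.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pred):
    return math.sqrt(_mean([(p - o) ** 2 for o, p in zip(obs, pred)]))


def mae(obs, pred):
    return _mean([abs(p - o) for o, p in zip(obs, pred)])


def mape(obs, pred):
    """Percentage error; undefined when an observed value is exactly zero."""
    return 100.0 * _mean([abs((o - p) / o) for o, p in zip(obs, pred)])


def pearson_r(obs, pred):
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def kge(obs, pred):
    """Kling-Gupta efficiency from r, the SD ratio (alpha), and the mean ratio (beta)."""
    mo, mp = _mean(obs), _mean(pred)
    sd_o = math.sqrt(_mean([(o - mo) ** 2 for o in obs]))
    sd_p = math.sqrt(_mean([(p - mp) ** 2 for p in pred]))
    r, alpha, beta = pearson_r(obs, pred), sd_p / sd_o, mp / mo
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def willmott(obs, pred):
    """Willmott index of agreement; 1 indicates perfect agreement."""
    mo = _mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]  # constant bias of +0.5
for name, fn in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("KGE", kge), ("WI", willmott)]:
    print(name, round(fn(obs, pred), 4))
```

In this example the predictions track the observations perfectly apart from a constant bias, so r and α equal one and the KGE penalty comes entirely from β. Note that MAPE and the β term of KGE divide by observed values or their mean, so both become unstable when these are close to zero, as can happen with a standardized index.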

Results
The descriptive statistics of the meteorological and climatic variables are summarized using five-number summary statistics. The minimum (Min.), first quartile (Q1), median, mean, and third quartile (Q3) are reported in Table 2. These results indicate that the annual and seasonal meteorological characteristics of the selected meteorological stations are quite diverse.
Muzaffarabad has the highest mean monthly precipitation (125.65 mm), and Kalat the lowest (15.35 mm). Sialkot has the highest maximum rainfall in a month (917.6 mm), and Chhor has the lowest maximum rainfall in a month (11.47 mm), while the minimum rainfall at all selected locations was zero (0 mm). The precipitation source at these stations varies; for example, heavy rainfall occurs at some stations in the monsoon season (June-Sep). Precipitation declines sharply after September and remains low until December, when the western depression begins. The western depression causes rainfall in the winter season (Dec-Mar). These statistics exhibit the dry and wet season cycles at a few stations. Temperature is another climatic variable used to calculate SPTI, so similar descriptive statistics for minimum and maximum temperature are reported in Table 3. Results show that the monthly minimum and maximum temperatures are clearly distinguishable across the stations.
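Summary statistics of the kind reported above can be computed with a few standard-library calls. The sketch below uses made-up monthly rainfall values and a hypothetical `summarize` helper; the quartile convention (Python's default "exclusive" method) is an assumption and may differ from the one used for the tables.

```python
from statistics import fmean, median, quantiles


def summarize(values):
    """Minimum, quartiles, median, mean, and maximum of a sample."""
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' quartile method
    return {
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "mean": fmean(values),
        "q3": q3,
        "max": max(values),
    }


# Made-up monthly rainfall (mm) with dry months and a monsoon peak.
rain = [0.0, 0.0, 3.2, 10.5, 22.1, 48.0, 120.4, 95.3, 30.0, 5.1, 0.0, 1.4]
summary = summarize(rain)
print(summary)
```

A mean well above the median, as in this toy sample, is the typical signature of rainfall data dominated by a few very wet months.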

Estimation of SPTI.
At the very early stage of the computational analysis, we first prepared time series data of the SPTI index for all the stations following the guidelines in [90]. As described in Section 2.2, the CDFs of the appropriate probability functions are standardized for all the stations. Table 4 shows the BIC values calculated by fitting all the candidate distributions to SPTI-1 at the selected meteorological stations; the distribution with the lowest BIC value is chosen as the appropriate fitted distribution.
Furthermore, at the Sialkot, Muzaffarabad, and Kalat stations, the BIC values of the "four-parameter Beta" distribution are the lowest among the candidate distributions (Sialkot, −722.18; Muzaffarabad, −584.52; Kalat, −664.62). Only at the Chhor station does the "Johnson SU" distribution give a better fit (Chhor, −445.58). We have observed that the "Three Parameter Weibull" distribution with the lowest BIC [...] (Table 4). For the Astore station, the histogram with the appropriate probability function, the associated quantile plot, and the temporal behavior of the standardized time series data of SPTI-1 are presented in Figure 3.
For brevity, plots for the other stations are omitted. The red spikes indicate the drought severity and the conditional dependence structure among drought episodes. The selected probability distributions with their respective BIC values for all the time scales (1, 3, 6, 9, and 12) are presented in Table 5. Finally, standardized time series data of SPTI for the selected time scales at all the selected meteorological stations have been prepared using the appropriate probability distributions.

ELM and Its Comparative Assessment.
In this research, we have compared the performance of ELM with MLP and ARIMA in two phases. For each station selected for the current study, the full-length data were divided into two independent parts, the training set and the validation set (test data). For most stations, precipitation and temperature records were available from 1951 to 2016. In the training phase, 63 years of monthly precipitation and minimum and maximum air temperature data from January 1951 to December 2013 are used to train the candidate algorithms. In the testing phase, the remaining 36 months of data (2014-2016) are used as test data to validate the forecasting results. Simulations for ELM and MLP are carried out using the R package nnfor [92], and the forecast package for the R language [93] is used for ARIMA. The optimum ARIMA specifications are given in Table 6. These optimum specifications were attained by running all possible ARIMA models, including all possible seasonal and non-seasonal lag values of the input time series. The ELM algorithm was trained using 23 input layer nodes and a single hidden layer with 100 hidden nodes. The algorithm is repeated 20 times, and the estimated outcomes are combined using the median operator. ELM assigns random weights to each hidden node. Furthermore, it assigns start weights to input layer nodes and generalized weights to hidden layer nodes. These weights are estimated using the least absolute shrinkage and selection operator (LASSO) to keep the model parsimonious.
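Combining repeated runs with the median operator, as described above, can be sketched in a few lines. The `median_combine` helper and the three toy member forecasts are illustrative assumptions; in the study, 20 repetitions are combined.

```python
from statistics import median


def median_combine(member_forecasts):
    """Combine ensemble member forecasts step by step using the median."""
    return [median(step) for step in zip(*member_forecasts)]


# Three hypothetical members, each forecasting three steps ahead.
members = [
    [0.1, 0.4, -0.2],
    [0.3, 0.5, -0.1],
    [5.0, 0.6, -0.3],  # first value is an outlying run
]
combined = median_combine(members)
print(combined)
```

Because each network starts from different random weights, individual runs can occasionally be poor; the median is insensitive to such outlying runs, whereas a mean would be pulled toward them.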
The parametric network structure still forms a large-dimensional matrix, which is not feasible to present numerically in tabular form. Furthermore, MLP is trained using two hidden layers containing 10 and 5 hidden nodes, respectively, to get optimum results. The complexity of the MLP algorithm increases drastically as the number of hidden layers or the number of nodes in the hidden layers increases. Hence, the structure of MLP is finalized with 23 input layer nodes, two hidden layers with 10 and 5 nodes, respectively, and a single output layer.
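A rough count of the weights implied by the two architectures helps explain why they are too numerous to tabulate. The sketch below assumes fully connected layers with one bias per node and a single output node, which the text does not state explicitly; `dense_params` is a hypothetical helper.

```python
def dense_params(layer_sizes, bias=True):
    """Number of weights in a fully connected network with the given layer sizes."""
    extra = 1 if bias else 0
    return sum((n_in + extra) * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


mlp_total = dense_params([23, 10, 5, 1])  # all of these are trained by backpropagation
elm_total = dense_params([23, 100, 1])    # includes the random, untrained hidden weights
elm_solved = dense_params([100, 1])       # only the output weights are estimated
print(mlp_total, elm_total, elm_solved)
```

Under these assumptions the ELM carries more weights overall than the MLP, yet only the output layer is estimated, and that estimation is a linear problem rather than an iterative one.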
This parsimonious structure still forms a large matrix of user-defined parameters. Due to this complexity, the estimated results for user-defined parameters are omitted. The performance of the ELM, MLP, and ARIMA algorithms was assessed using performance assessment metrics, including RMSE, MAE, MAPE, KGE, and the Willmott index of agreement. Table 7 provides numerical results of these metrics using the training data sets for the ELM, MLP, and ARIMA models at the selected meteorological stations with 1-, 3-, 6-, 9-, and 12-month time scales. Results indicate that ELM performs better than the MLP and ARIMA models. For the assessment of candidate models, the numerical results of all statistical metrics for SPTI-1 are illustrated with [...]. Similarly, the WI values at the Astore station are 0.999, 0.748, and 0.664 for ELM, MLP, and ARIMA, respectively. The KGE index endorses ELM's superior performance at all the stations by providing the maximum values compared with MLP and ARIMA. Similarly, Willmott's index of agreement (WI) consistently provides the highest values for the ELM algorithm. KGE and WI are considered the most appropriate metrics for the performance assessment of meteorological and hydrological models. The KGE is calculated using the correlation coefficient, the ratio of variations, and the ratio of averages of predicted and observed series using equation (9). The WI values for the ELM model at all the selected stations are close to 1, which endorses ELM as the best performing model. The superior performance of ELM continued for the other time scales at the selected stations. Overall, results for the training phase show that the ELM model has good agreement at all selected stations.
As the time scale of the drought index increases, the performance of the proposed algorithm improves. Overall, the ELM algorithm showed superior performance to its competing algorithms (MLP and ARIMA). A comparison of all the performance assessment metrics shows that the ELM algorithm is the adequate model for the estimation and forecasting of drought indices (see Table 7). The consistency and co-movement of the observed and estimated values of SPTI are further assessed using Karl Pearson's product-moment correlation coefficient. Table 8 shows the correlation coefficients between the observed and predicted values of SPTI using the ELM, MLP, and ARIMA models for the training data. For SPTI-1 at the Astore station, the correlations for ELM, MLP, and ARIMA are 0.87, 0.59, and 0.56, respectively, indicating better agreement of ELM with the observed SPTI-1 than the other candidate algorithms. At the Chitral station, the correlation values are 0.93, 0.74, and 0.69. For SPTI-3 at the Astore station, these results are 0.96, 0.88, and 0.86, respectively, and at the Chitral station, the correlation values for ELM, MLP, and ARIMA are 0.97, 0.93, and 0.90. Usually, climatic and meteorological studies involve high-frequency datasets that require fast algorithms. So, speed is a notable characteristic for determining the suitability of an algorithm. The algorithm selection for climatic studies depends on speed and relative efficiency.
ELM has the distinction of being the fastest algorithm in the ANN class for complex datasets. ELM training and testing were almost 32 times faster than ANN, indicating ELM's advantage over other ANN algorithms [58]. The relationship between the observed and predicted values of SPTI using ELM and the other algorithms for the Astore station is shown in Figure 4 as a line graph, which depicts significantly less deviation between the observed values of SPTI and the values predicted using ELM.
MLP and ARIMA were unable to capture all the shocks in the historical values of SPTI, and their departure from the observed values was significant. Here, the ELM algorithm reflects [...]. Scatter plots of the observed and predicted time series are another way to assess the prediction performance of probabilistic, machine learning, and ANN algorithms. Figures 8-10 show the scatter plots of the observed and predicted values of SPTI-6 using the ELM, MLP, and ARIMA models at all the selected meteorological stations. In the scatter plots, the values predicted by the ELM model show a strong correlation with the observed values of SPTI. This graphical presentation also endorses the superior performance of ELM. We can observe that the ELM algorithm is more accurate and can potentially predict drought conditions in any climatic zone. Figure 11 presents the Taylor diagrams for the Astore station with 1-, 3-, 6-, 9-, and 12-month time scales for the training phase. Taylor diagrams are a comprehensive and precise way to represent the estimation and forecast ability of a model. The similarity between the predicted and actual values of SPTI is evaluated in terms of correlation (as a measure of closeness), and the variation is assessed by the standard deviation (SD) and the RMSE. For SPTI-1, the correlation of modeled data using the ELM algorithm with actual observations was about 0.9, followed by MLP and ARIMA with about 0.6 each. As the time scale increases, the prediction performance of the algorithms improves significantly. For SPTI-3, the ELM algorithm is significantly closer to the actual values, with a correlation of about 0.97 compared with MLP (0.9) and ARIMA (0.85). The Taylor diagram exhibits the superior performance of the ELM algorithm for estimating SPTI at the 1-, 3-, 6-, 9-, and 12-month time scales (see Figure 11). Figure 12 illustrates the violin plots related to the training phase of ELM and the other candidate models.
The red dot represents the mean value, a thick white bar represents the interquartile range, and a thin blue line represents the distribution of the whole data set.
These are the components of the [...] and observed data are nearly identical. These graphical illustrations affirmed that the proposed ELM model is better at estimating the actual values of SPTI at all selected meteorological stations. After calibrating and validating the algorithms on the training datasets, the generalization capability of the proposed algorithm was assessed in the next step. Finally, the out-of-sample forecast of SPTI for all the time scales (1-12) is carried out for 36 months from 2014 to 2016. These forecasts are considered sufficient for drought preparedness and mitigation policies. The same performance metrics have been used to analyze the difference between the observed and forecasted values of SPTI for different time scales at all selected meteorological stations. Numerical results for these performance metrics are given in Table 9. At the Astore station, the KGE values for the ELM, MLP, and ARIMA models are 0.988, 0.771, and 0.671, respectively. The KGE for the ELM model is significantly higher than that of its counterparts for all the time scales at all the selected meteorological stations, which clearly endorses the better forecast capability of the ELM model. Similarly, the WI and MAE results confirmed that the ELM model outperformed the MLP and ARIMA models. RMSE and MAPE endorsed ELM as a better forecasting model for most stations. The relationship between the observed and forecasted values of SPTI using the ELM, MLP, and ARIMA models for all the time scales at the Astore station is illustrated in Figure 13 for the 36-month test phase from January 2014 to December 2016. The ELM forecasts exhibited a significantly smaller deviation from the observed values of SPTI. Although MLP has shown a reasonably good forecast performance compared to the stochastic seasonal ARIMA model, the ELM model evidently shows superior forecast performance.
In order to check the appropriateness of the ELM model, scatter plots were prepared using time series data of the observed and forecasted SPTI at the Astore station for all time scales (Figure 14). These scatter plots show a significant difference in the forecast performance of ELM and the other models. The scatter plots depict the correlation, goodness-of-fit, and extent of agreement between the observed and forecasted SPTI. The ELM model also clearly outperformed the MLP and ARIMA models in the testing phase for all the selected time scales.
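The three quantities summarized by the Taylor diagrams discussed earlier in this section (standard deviation, correlation, and RMSE) are linked by a simple geometric identity, which is why they can share one plot. The sketch below assumes the centered form of the RMSE, as in the usual Taylor diagram construction; the helper name and the toy values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def taylor_stats(obs, pred):
    """Return SD of obs, SD of pred, their correlation, and the centered RMSE."""
    mo, mp = fmean(obs), fmean(pred)
    so, sp = pstdev(obs), pstdev(pred)
    r = fmean([(o - mo) * (p - mp) for o, p in zip(obs, pred)]) / (so * sp)
    crmse = math.sqrt(fmean([((p - mp) - (o - mo)) ** 2 for o, p in zip(obs, pred)]))
    return so, sp, r, crmse


# Made-up observed and predicted index values.
obs = [-1.2, -0.3, 0.4, 1.1, 0.0, -0.8]
pred = [-1.0, -0.1, 0.2, 0.9, 0.3, -0.5]
so, sp, r, crmse = taylor_stats(obs, pred)

# Law-of-cosines identity underlying the diagram.
identity_gap = crmse ** 2 - (so ** 2 + sp ** 2 - 2.0 * so * sp * r)
print(round(so, 3), round(sp, 3), round(r, 3), round(crmse, 3))
```

A model plots close to the observed reference point when its standard deviation matches that of the observations and its correlation is near one, which by the identity forces the centered RMSE toward zero.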

Discussion.
Drought is a multifaceted and commonly occurring hazard in several parts of the world. Its impacts are prevalent in the agricultural, socio-economic, and energy sectors. However, precise drought monitoring and estimation techniques can help decrease the vulnerability of society to drought. The primary objective of the current study was to test the appropriateness and usefulness of the ELM model relative to another ANN model (MLP) and a stochastic model (ARIMA) for drought forecasting. The prediction and forecast performance of the models is assessed using numerous performance metrics, including RMSE, MAE, MAPE, KGE, WI, and Karl Pearson's correlation coefficient. The quantitative assessment revealed that both ANN models (ELM and MLP) performed better than the stochastic model (ARIMA), and among the ANN models, ELM performed best, producing the smallest RMSE, MAE, and MAPE values and the largest KGE, WI, and correlation coefficient values for almost all the meteorological stations.
Furthermore, ELM shows better agreement with SPTI than its counterparts in both the training and test phases at all climatic stations. The efficiency of ELM relative to the other models is evident from the performance metrics. A similar forecast performance continued for higher-order time scales, consistent with earlier studies [30,58]. The computational time consumed by drought modeling algorithms also needs to be optimized. Usually, large datasets are used as input for real-time drought modeling, which affects the computational performance of different models in terms of time.
The ELM model is significantly faster than its counterparts. Evaluating all the numerical results of the performance metrics and the different graphical illustrations, it can be concluded that the ELM model attains the most accurate drought forecasting performance during the training and test phases. The study revealed that ELM is the most appropriate, reliable, and efficient algorithm for drought prediction and forecasting. This study suggests that ELM can be used as an early warning drought forecasting tool for developing drought mitigation policies.
Figure 14: Scatter plots of the observed and forecasted values of SPTI using ELM, MLP, and ARIMA models at Astore station with all lead time scales for the test phase. The least-squares regression line is also included.