Development of a Three-Stage Hybrid Model by Utilizing a Two-Stage Signal Decomposition Methodology and Machine Learning Approach to Predict Monthly Runoff at Swat River Basin, Pakistan

Laboratory for Operation and Control of Cascaded Hydropower Station, China Three Gorges University, Yichang 44302, China; Center of Excellence in Water Resources Engineering, University of Engineering and Technology, Lahore 54000, Pakistan; College of Hydraulic & Environmental Engineering, China Three Gorges University, Yichang 44302, China; College of Environmental Science and Engineering, Hunan University, Changsha 410082, China


Introduction
Managing water resources is crucial for several reasons: developing future water bodies, exploiting hydropower efficiently for power generation and irrigation, preventing disputes, and protecting existing water bodies from overexploitation and pollution [1,2]. Hydrological runoff prediction is an important part of water resources management, and accurate runoff prediction is a practical basis for planning. Runoff and rainfall prediction depend on nonlinear factors including precipitation, uneven flow, topography, anthropogenic activities, and evaporation, which make runoff prediction a great challenge [3,4].
Process-driven and data-driven are two approaches for runoff prediction. The data-driven approaches are becoming more popular for producing accurate predictions owing to their rapid development, increasing computational power, and lower information requirements compared with the process-driven approaches [5]. However, these artificial intelligence (AI)-based models exhibit drawbacks such as the sensitivity of the support vector machine (SVM) to parameter selection and the overfitting problems faced by artificial neural networks (ANNs) [6]. Furthermore, preprocessing of the input and/or output data is required for these models to handle nonstationary data [7]. The direct use of the input signal in AI-based models may not deliver acceptable results; however, the model performance can be improved by applying a preprocessing technique [8]. Appropriate data preprocessing techniques are required to eliminate noise and extract the trend from hydrological time series [9].
To overcome the inadequacies of the data-driven models and to obtain more reliable and accurate predictions, hybrid models have been proposed for the prediction of hydrological time series [10,11]. Hybrid techniques that employ data-driven approaches can capture more complete and accurate information about the different parameters, with the additional benefit of improved prediction accuracy. Moreover, these techniques can detect the periodicity, volatility, and random nature of runoff processes [12].
Statistical models are widely employed for modeling runoff time series [13]. The main disadvantages of these models are that they require the runoff data to be stationary and linear. They also require a specific length of time series data for a robust prediction result [14]. Therefore, the modeling capability of statistical models is limited by their linear nature when predicting runoff time series, which are highly nonlinear and nonstationary [15,16]. Machine learning (ML) models are appropriate for the nonstationarity and nonlinearity exhibited by runoff time series, overcome the constraints of the statistical time series models, and achieve better performance and accuracy than the conventional statistical models [17,18]. A comprehensive evaluation of machine learning techniques is provided in [19,20] and [21]. SVM and the multilayer perceptron (MLP) are popular ML techniques in the field of runoff prediction. SVM is an efficient and robust algorithm that has been applied in numerous runoff prediction studies owing to its good performance. Moreover, SVM offers excellent generalization ability and promising results compared with other machine learning methods for hydrological runoff prediction [22][23][24][25].
SVM does not suffer from the problem of local minima and requires less computation time than ANN; therefore, it is less prone to overfitting and poor prediction results than ANN [26][27][28]. SVM seeks the best compromise between learning capability and model complexity, based on limited information, to obtain the best result [29,30]. Furthermore, global optimization can be used to tune the parameters of SVM, which results in better prediction performance than ANN [31]. The MLP is an advanced version of ANN and is more popular among hydrologists than other ANNs [32]. Many studies have used MLP for runoff prediction [33][34][35][36]. The previous research mentioned above highlights the superior performance of SVM for runoff prediction; therefore, in the present study, SVM has been chosen as the final stage to accomplish the task of runoff prediction.
In the field of runoff prediction, hybrid methods coupled with ML techniques have been proposed to improve the prediction accuracy and to manage data better [37,38]. Hybrid ML methods in the runoff prediction field offer the advantages of automated and timely performance evaluation and management of the ensemble algorithms [32]. The research in [39,40] reviews the applications of hybrid ML methods for rainfall-runoff prediction.
Decomposition techniques such as ensemble empirical mode decomposition (EEMD) and variational mode decomposition (VMD) can be applied as data preprocessing tools to study the nonlinear and nonstationary characteristics of runoff series [17,41,42]. Time series decomposition has been used successfully to improve the performance of ML methods for runoff modeling. The decomposition method decomposes the original time series into several individual components; afterward, the ML models are employed for prediction [17]. The components obtained from an effective decomposition method are much easier to evaluate than the original time series [43]. Many researchers have analyzed hydrological time series with the wavelet transform (WT) because of its excellent multiresolution performance in the frequency and time domains [44,45]. The WT can be viewed as a Fourier transform with an adjustable window that requires the signal to be stable within the window. Consequently, WT is subject to the restrictions of the Fourier transform. Although WT provides high resolution in both the time domain and the frequency domain, some false harmonic waves are produced during WT owing to certain limitations of the method. Therefore, the selection of the WT basis functions is crucial because it significantly influences the wavelet decomposition. Empirical mode decomposition (EMD) was proposed to overcome the limitations of the WT [46]. EMD separates the trend components or multiscale fluctuations of the signal, smoothing the signal and generating intrinsic mode functions (IMFs) along with a residual. The EMD technique represents the nonlinearity and nonstationarity of the original series more accurately than the WT technique. Therefore, EMD is considered a more efficient way to process complex signals than the WT.
The hydrological time series in classical hydrology can be considered a combination of random, periodic, and trend components. In the case of a perfect decomposition, the high-frequency and low-frequency components and the residual obtained through the EMD technique can be approximated as the random and periodic components and the trend, respectively [47,48]. EMD has been applied successfully in hydrological research [49,50], but it exhibits problems such as mode mixing of the IMFs and the orthogonality effect, which affect the precision of prediction and the performance of EMD. Therefore, EEMD was developed to lessen the impact of mode mixing faced by EMD [51,52]. Nevertheless, mode mixing for some signals and the end effect still exist in EEMD-based techniques [53]. Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is an advanced technique that overcomes the issues faced by EMD and EEMD, namely, mode mixing and computational complexity, respectively. By adding adaptive white noise at each step, CEEMDAN can achieve a reconstruction error close to zero while requiring fewer integration times [43]. However, CEEMDAN is also unable to resolve all of these issues: residual noise remains in the modes, and signal information appears later than in EEMD, with some spurious modes in the initial decomposition stages [54].
VMD is another adaptive and nonrecursive signal analysis technique which, unlike empirical mode techniques, decomposes the original series into multiple modes and updates them concurrently [55]. VMD is more robust to noise and sampling, with outstanding performance in frequency search and separation. VMD can mitigate the mode mixing problem and extract the time-frequency features precisely by yielding narrow-banded modes [56]. VMD is a comparatively new technique in hydrological applications [17], and relatively few studies exist on the application of VMD to runoff prediction.
A single-layer hybrid model consisting of a decomposition technique and a machine learning method is one of the most frequently employed methods for analyzing time series. These hybrid models with a single-layer decomposition technique can enhance the predictive performance for nonlinear time series to some extent but cannot completely capture the nonlinearity and nonstationarity of the original signals. Consequently, hybrid models based on a two-layer decomposition methodology are employed to overcome the limitations of the single-layer decomposition technique [43]. Therefore, this study proposes a three-stage hybrid model based on CEEMDAN, VMD, and SVM and examines its applicability to the runoff time series. The first decomposition stage employs the CEEMDAN technique and decomposes the runoff series into random, periodic, and trend components, with the aim of improving the prediction of the nonlinear and nonstationary monthly runoff series. VMD is proposed as an additional decomposition technique to diminish the stochastic behavior, noise, and trends of the data. Finally, the SVM algorithm predicts the monthly runoff data series. The main objectives of the present research are as follows: (1) the development of an ML and signal decomposition-based hybrid model using the hydrological runoff data of the Swat River, Pakistan. The rest of the paper is arranged as follows. The modeling techniques and the proposed approach are described in Section 2. The results and discussion are presented in Section 3, while Section 4 concludes the research work. The research will be useful for prediction and planning purposes and will provide new directions in the field of hydrology.

Decomposition Techniques
2.1.1. CEEMDAN. In CEEMDAN, the information regarding noise is shared between all workers, as opposed to EEMD, to efficiently solve the mode mixing issue of EMD [57]. The CEEMDAN technique achieves a near-zero reconstruction error by adding a finite number of adaptive white noises at every stage, with a lower average number of integration times. This enables CEEMDAN to avoid the mode mixing and computational complexity issues [43]. The CEEMDAN process proceeds as follows:
Step 1: create noisy realizations x_i(t) of the original time series by adding white noise (equation (1)).
Step 2: obtain the first IMF of each x_i(t) and take the average over all realizations; this step is similar to EMD.
Step 3: obtain the second and the remaining IMFs by decomposing the residual with added noise, where E_1(·) denotes the first IMF decomposed from a signal; the k-th IMF and the corresponding residual are calculated in the same way.
Step 4: collect the IMFs obtained in this way and compute the final residual.
Discrete Dynamics in Nature and Society
Based on VMD, this study introduces a second decomposition of IMF1, owing to the unpredictability and the highest frequency of IMF1.
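The four CEEMDAN steps above can be written compactly as follows. This is a sketch of the commonly used CEEMDAN formulation and not necessarily the authors' exact numbered equations. The symbols x(t) (original series), w_i(t) (the i-th white-noise realization), ε_k (noise amplitude at stage k), I (number of realizations), r_k(t) (residual after stage k), E_k(·) (operator returning the k-th EMD mode), and R(t) (final residual) are assumed notation.

```latex
\begin{align}
% Step 1: noisy realizations of the original series
x_i(t) &= x(t) + \varepsilon_0\, w_i(t), \qquad i = 1, \dots, I \\
% Step 2: first IMF as the ensemble average of the first EMD modes
\mathrm{IMF}_1(t) &= \frac{1}{I} \sum_{i=1}^{I} E_1\big(x_i(t)\big),
\qquad r_1(t) = x(t) - \mathrm{IMF}_1(t) \\
% Step 3: second IMF from the noise-perturbed first residual
\mathrm{IMF}_2(t) &= \frac{1}{I} \sum_{i=1}^{I} E_1\big(r_1(t) + \varepsilon_1 E_1(w_i(t))\big) \\
% Step 3 (general k): k-th residual and (k+1)-th IMF
r_k(t) &= r_{k-1}(t) - \mathrm{IMF}_k(t),
\qquad \mathrm{IMF}_{k+1}(t) = \frac{1}{I} \sum_{i=1}^{I} E_1\big(r_k(t) + \varepsilon_k E_k(w_i(t))\big) \\
% Step 4: the series is recovered from all K IMFs and the final residual
x(t) &= \sum_{k=1}^{K} \mathrm{IMF}_k(t) + R(t)
\end{align}
```

The last line is what makes the reconstruction error close to zero: the IMFs and the final residual add back to the original series.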

Variational Mode Decomposition.
VMD is a quasi-orthogonal and adaptive decomposition method in which the modes are obtained nonrecursively [55]. It estimates the modes concurrently and determines the relevant bands adaptively [53]. The VMD can be expressed as the constrained variational problem [55]

$$\min_{\{X_k\},\{\omega_k\}} \left\{ \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * X_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{subject to} \quad \sum_{k} X_k(t) = f(t),$$

where f is the input signal, {X_k} = {X_1, X_2, ..., X_K} and {ω_k} = {ω_1, ω_2, ..., ω_K} denote all modes and their central frequencies, and δ(t) and ∗ represent the Dirac distribution and convolution, respectively. A quadratic penalty term and Lagrangian multipliers are used to convert this constrained optimization problem into an unconstrained one [58]. The unconstrained problem can be solved using different approaches; it is handled in two alternating stages, (i) minimization with respect to X_k and (ii) minimization with respect to ω_k, where n denotes the iteration number.
The VMD technique relies on three fundamental concepts: Wiener filtering, frequency mixing and heterodyne demodulation, and the analytic signal and Hilbert transform. The original signal is decomposed into IMFs that reproduce the original signal with different sparsity features. In contrast to the original decomposition techniques, VMD relies on the alternating direction method of multipliers (ADMM) for the reconstruction process [53]. VMD uses the variational principle to obtain the IMFs by minimizing the sum of the estimated bandwidths of the IMFs, which makes this technique different from EMD. The bandwidth and center frequency of the IMFs are revised in the course of solving the variational model. The signal band is segmented adaptively in the frequency domain, and the IMFs obtained are narrow-banded [31]. The number of intrinsic modes determines how well the data are resolved for an acceptable prediction model of a time series; therefore, determining the number of intrinsic modes is vital in the VMD process.
The characteristics of the original time series cannot be fully captured if fewer intrinsic mode components than required are chosen. Conversely, an excess of intrinsic modes may degrade performance because the errors of the individual prediction units accumulate in the aggregation stage [59]. Nevertheless, the IMFs generated by the VMD process are usually smoother than the mode functions obtained by other techniques such as EEMD and wavelet transforms [60]. This lessens the accumulation of error over time. Another important aspect of VMD is the selection of several parameters, which requires a trial-and-error approach [3].
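For reference, the unconstrained form of the VMD problem and the two minimization stages are usually written as below. This is a sketch of the commonly used formulation, not necessarily the authors' numbered equations. The symbols α (quadratic penalty weight, which plays the role of a bandwidth constraint), λ (Lagrangian multiplier), τ (multiplier update step), and the hat notation for Fourier transforms are assumed notation.

```latex
\begin{align}
% Augmented Lagrangian: bandwidth term + quadratic penalty + multiplier term
L\big(\{X_k\},\{\omega_k\},\lambda\big) &=
\alpha \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * X_k(t) \right] e^{-j\omega_k t} \right\|_2^2
+ \left\| f(t) - \sum_{k} X_k(t) \right\|_2^2
+ \left\langle \lambda(t),\, f(t) - \sum_{k} X_k(t) \right\rangle \\
% (i) mode update: a Wiener-filter-like expression in the frequency domain
\hat{X}_k^{\,n+1}(\omega) &=
\frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{X}_i(\omega) + \hat{\lambda}(\omega)/2}
     {1 + 2\alpha\,(\omega - \omega_k)^2} \\
% (ii) center-frequency update: center of gravity of the mode's power spectrum
\omega_k^{\,n+1} &=
\frac{\int_0^{\infty} \omega\, \big|\hat{X}_k(\omega)\big|^2 \, d\omega}
     {\int_0^{\infty} \big|\hat{X}_k(\omega)\big|^2 \, d\omega} \\
% multiplier update (no update when \tau = 0)
\hat{\lambda}^{\,n+1}(\omega) &= \hat{\lambda}^{\,n}(\omega)
+ \tau \Big( \hat{f}(\omega) - \sum_{k} \hat{X}_k^{\,n+1}(\omega) \Big)
\end{align}
```

A larger α narrows each mode's band, which is why the modes come out narrow-banded. With τ = 0 the multiplier stays fixed, so exact reconstruction is not enforced and some noise can be left out of the modes.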

Support Vector Machine.
SVM is a nonlinear search algorithm [61] used to minimize the expected errors in ML and to reduce the issue of overfitting [62]. After training on past data, SVM predicts a future quantity in time [32]. With properly determined kernel functions and support vectors, SVM performs more efficiently than ANNs [24]. SVM works by constructing a hyperplane that maximizes the separation between the samples and minimizes the distance from the samples to the hyperplane [31].
SVMs were developed for binary classification but are also applicable to regression problems through the introduction of a loss function. In its basic form, the SVM algorithm deals only with linear problems. For a nonlinear system, a nonlinear mapping is used to map the input vector x into a high-dimensional feature space z; afterward, linear regression is performed in this space. Support vector regression (SVR), here with the radial basis function kernel [63], is the form in which SVM is applied to regression [24]. SVR, based on the structural risk minimization theory and the Vapnik-Chervonenkis dimension model, is a feasible method to deal with prediction problems [26,64]. Equation (13) gives the standard form of the SVR model; the coefficients w and b are estimated by minimizing the risk function R(C).
Three parameters dominate the accuracy of the SVR model when the quality and span of the training samples are fixed: ε controls the width of the epsilon tube in the training loss function, σ controls the width of the Gaussian kernel function, and C is the regularization parameter, which controls the empirical risk degree of SVR [65][66][67].
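The roles of ε, σ, and C can be made concrete with a small sketch. It assumes one common form of the SVR risk, C times the mean ε-insensitive loss plus half the squared norm of w, and a Gaussian kernel exp(-||x - z||^2 / (2σ^2)). The function names are illustrative and are not taken from the paper or from any library.

```python
import math


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss is zero inside the epsilon tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def rbf_kernel(x, z, sigma):
    """Gaussian (RBF) kernel; sigma sets how quickly similarity decays with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def regularized_risk(y_true, y_pred, w, C, epsilon):
    """Assumed risk form: C * mean epsilon-insensitive loss + 0.5 * ||w||^2."""
    losses = [epsilon_insensitive_loss(t, p, epsilon) for t, p in zip(y_true, y_pred)]
    empirical = sum(losses) / len(losses)
    return C * empirical + 0.5 * sum(wi * wi for wi in w)


# A prediction inside the tube costs nothing; one outside costs its excess distance.
inside = epsilon_insensitive_loss(1.0, 1.05, epsilon=0.1)
outside = epsilon_insensitive_loss(1.0, 1.5, epsilon=0.1)
# Identical points have kernel value 1; similarity decays with distance.
same = rbf_kernel([0.0, 0.0], [0.0, 0.0], sigma=1.0)
apart = rbf_kernel([0.0], [1.0], sigma=1.0)
```

A wider tube (larger ε) ignores more small errors, a larger σ makes distant samples look more similar, and a larger C weights the training error more heavily relative to model flatness.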

Multilayer Perceptron.
The MLP is the most widely used type of ANN for modeling hydrological runoff data [68]. MLP belongs to the class of feedforward neural networks and can approximate both integrable and continuous functions. It consists of neurons arranged in groups in the form of layers. The input nodes of an MLP are all in one layer, while the hidden portion has one or more hidden layers. The selection of layers depends on the problem being considered, and there are no specific rules for this selection [69]. Many algorithms [70][71][72][73] have been proposed for finding the optimum structure of the network, but none of these methods guarantees the optimal solution of the parameters. Figure 1 illustrates a simple structure of the MLP network.
In MLP, the number of nodes in the input layer corresponds to the length of the input data, while the number of neurons in the output layer corresponds to the length of the output data [74]. The calculations in the MLP network are performed successively from the input layer to the output layer. Nodes at the same level are computed simultaneously and do not interfere with each other during the process [75]. The value of each node is obtained from the weighted sum of all nodes in the preceding layer and can be calculated as

$$x_{ij} = f\left(W_i \cdot x_{i-1} + b_{i-1}\right),$$

where f is the activation function, W_i denotes the weight vector, x_{i-1} is the value vector of all neurons in layer i − 1, x_{ij} is the value of neuron j in layer i, and b_{i-1} is the bias of layer i − 1. Both linear and nonlinear functions are used as activation functions. With a linear activation function, the MLP is essentially equivalent to a single-layer perceptron. The most commonly used nonlinear activation is the sigmoid function [74], described by Equation (3):

$$f(x) = \frac{1}{1 + e^{-x}}.$$

The loss function is expressed through the distance ‖z − h‖ between the actual value z and the output value h, where ‖·‖ denotes the distance norm. The backpropagation (BP) algorithm is usually used to adjust the parameters of the MLP and serves to minimize the loss function. The gradient descent (GD) algorithm is the simplest and most commonly employed parameter adjustment algorithm. The stochastic gradient descent (SGD) algorithm is another useful algorithm for adjusting the parameters of the MLP [74]. The SGD algorithm performs well in the optimization process; however, it exhibits a slow convergence rate. Additionally, gradient descent may get stuck at saddle points of the loss function [76]. Several alternative approaches exist to address these issues when updating the parameters of the neural network.
These adaptive approaches diagonally scale the gradient through an approximation of the curvature of the function [77]. The most widely used optimizer in deep learning is Adam (adaptive moment estimation), which is well suited to nonstationary objectives and can be chosen without the need for other optimizers [78].
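The layer-by-layer calculation and an adaptive parameter update can be sketched in a few lines of plain Python. This is an illustrative toy: the 2-2-1 layout, the squared-error loss, and the Adam hyperparameter defaults (the widely used values 0.001, 0.9, 0.999, 1e-8) are assumptions made for the example and are not settings reported in this paper.

```python
import math


def sigmoid(z):
    """Sigmoid activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: each neuron applies the activation to its weighted input sum plus bias."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(inputs, layers):
    """Propagate values successively from the input layer to the output layer."""
    values = inputs
    for weights, biases, activation in layers:
        values = dense(values, weights, biases, activation)
    return values


def squared_error(target, output):
    """Squared Euclidean distance between target and output (assumed loss form)."""
    return sum((t - o) ** 2 for t, o in zip(target, output))


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# Toy 2-2-1 network: zero hidden weights make each sigmoid unit output 0.5.
layers = [([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], sigmoid),
          ([[1.0, 1.0]], [0.0], lambda z: z)]
output = mlp_forward([1.0, 2.0], layers)
loss = squared_error([1.5], output)
new_param, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
```

Because Adam divides the averaged gradient by the root of the averaged squared gradient, the first step has a size close to the learning rate regardless of the gradient's scale, which is one reason it copes with noisy objectives.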

Quantitative Performance Indices.
In this research, several statistical indices are used to assess the agreement between the observed and the predicted runoff data. The root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), mean squared error (MSE), and the coefficient of determination (R²) were applied to evaluate the reliability of the prediction models.

RMSE.
RMSE measures the deviation between the predicted and the observed values and represents the extent of dispersion of a dataset; a smaller value represents better performance of the algorithm:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(N_i - O_i\right)^2}$$

MSE.
MSE is the average of the squared differences between the predicted and the observed values; a smaller value represents better performance of the algorithm:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(N_i - O_i\right)^2$$

Coefficient of Determination (R²).
R² summarizes the error by evaluating the linear correlation between the observed and the predicted data, with values ranging between 0 and 100%. In expressions (11)-(18), N_i and O_i denote the ith predicted and actual values of runoff, respectively, whereas n is the total number of predictions.
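The five indices can be computed directly from paired observed and predicted values. The sketch below uses the usual textbook definitions; treating R² as the squared Pearson correlation is an assumption based on the description of R² as a measure of linear correlation, and the function names and sample values are illustrative only.

```python
import math


def mse(observed, predicted):
    """Mean of squared differences between predicted and observed values."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Square root of the MSE, in the same units as the runoff."""
    return math.sqrt(mse(observed, predicted))


def mae(observed, predicted):
    """Mean of absolute differences."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error; observed values must be nonzero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values (assumed form)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Made-up illustrative values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
scores = {
    "MSE": mse(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
    "R2": r_squared(observed, predicted),
}
```

Lower RMSE, MSE, MAE, and MAPE and a higher R² all indicate a closer match between prediction and observation.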

Proposed Hybrid Modelling.
As explained above, hydrological runoff exhibits nonstationary and nonlinear characteristics [79,80]. These properties of runoff result in the undesirable performance of many prediction models, along with poor generalization due to the requirements of many pseudo-variations, which also affects the accurate knowledge of data variations [81]. Therefore, this paper proposes a three-stage hybrid model, CEEMDAN-VMD-SVM (CVS), that couples the ML method with signal decomposition techniques for reliable and accurate runoff prediction. The flowcharts representing the main steps of the proposed methodology and of the seven other models developed for comparison with the proposed CVS model are given in Figure 2. The main steps of the CVS model are as follows:
Step 1: the Pearson correlation coefficient method was applied to the original runoff time series and its lagged values to determine the appropriate input variable, and the time-lagged series having the highest correlation coefficient with the original runoff series was chosen as the initial input for the decomposition model.
Step 2: the CEEMDAN technique was applied to the lagged runoff time series obtained in Step 1, decomposing the series into subcomponents (IMFs and a residual) with different frequencies.
Step 3: the high-frequency component (IMF1) produced by CEEMDAN was further decomposed by VMD.
Step 4: the SVM algorithm was applied to construct a prediction model for the whole dataset containing the extracted IMF components and the runoff data signal, making a prediction for each component accordingly.
Step 5: to produce a collective output, the predicted results of all extracted IMFs obtained by the SVM algorithm were reconstructed to produce the final prediction result of the original runoff series.
Step 6: finally, the statistical performance metrics were used to evaluate the results in the training and testing periods.
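The flow of Steps 2 to 5 can be sketched as a small function that takes the decomposition and prediction routines as arguments. The three toy callables below are hypothetical stand-ins chosen so the example runs with the standard library alone; they are not CEEMDAN, VMD, or SVM, and the function name `cvs_forecast` is illustrative.

```python
def cvs_forecast(series, first_stage, second_stage, fit_predict):
    """Two-stage decomposition followed by per-component prediction and summation.

    first_stage(series)  -> list of components, highest-frequency component first
    second_stage(comp)   -> list of sub-components of that component
    fit_predict(comp)    -> predicted values for a single component
    """
    components = first_stage(series)                 # Step 2: first decomposition
    sub_components = second_stage(components[0])     # Step 3: re-decompose the first component
    to_predict = sub_components + components[1:]
    predictions = [fit_predict(c) for c in to_predict]  # Step 4: one model per component
    # Step 5: reconstruct by summing the component predictions at each time step.
    return [sum(values) for values in zip(*predictions)]


# Hypothetical stand-ins, NOT the real algorithms:
def toy_first_stage(series):
    """Split into a zero-mean fluctuation and a constant 'trend'."""
    mean = sum(series) / len(series)
    return [[x - mean for x in series], [mean] * len(series)]


def toy_second_stage(component):
    """Split a component into two equal halves."""
    return [[0.5 * x for x in component], [0.5 * x for x in component]]


def echo(component):
    """A 'perfect' predictor that simply returns its input."""
    return list(component)


result = cvs_forecast([3.0, 5.0, 4.0], toy_first_stage, toy_second_stage, echo)
```

With a perfect per-component predictor the summed output reproduces the input series, which shows why the final accuracy depends on how predictable each component is and on how errors add up across components.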

Case Study.
The present research uses the runoff data of the Swat River basin, collected from the Water and Power Development Authority (WAPDA), Pakistan, for prediction. The Swat River is a perennial river located in the northern part of Khyber-Pakhtunkhwa Province, Pakistan (Figure 3). It originates in the Hindu Kush mountains and flows through the Kalam valley to Madyan and the lower areas of the Swat valley up to Chakdara. The river flows into the Kabul River and has a total length of 240 km.
The Swat River serves the purposes of irrigation and power generation and provides a natural habitat for fishes and birds. The catchment area of the river is generally hilly, with altitudes ranging from approximately 360 m to 4,500 m from south to north. The catchment of the Swat River basin lies between longitudes 70°59′ east and 72°47′ east and latitudes 34°00′ north and 35°56′ north [82].

Data Selection.
The monthly runoff data of the Swat River from 1961 to 2015 were taken at the Chakdara hydrological station in the Swat River catchment. The data are available on a daily basis; monthly data were obtained by averaging the daily data for each month. The monthly runoff data series selected for prediction is shown in Figure 4.
For developing the CVS hybrid model, the runoff data are divided into training (approximately 80% of the whole dataset) and testing (approximately 20% of the whole dataset) datasets to predict the 1-month-ahead runoff. The research work was carried out using a 64-bit Windows 10 operating system on a 3.70 GHz Intel(R) Core i7-10510U CPU with 16 GB memory. The analyses were performed using MATLAB R2015a software and Python 3.6 with the pandas and NumPy packages. The optimal parameters were selected by trial and error, considering the best results. SVM and MLP networks were developed with Keras using the TensorFlow backend. The MLP network in all MLP-based models was developed with two hidden layers having 64 and 32 hidden neurons, respectively, with the sigmoid activation function, while the output layer has 1 neuron to predict runoff. Moreover, different learning rates were selected for each MLP-based model. Owing to the nonstationary and noisy nature of the runoff time series, we applied the adaptive moment estimation (Adam) optimizer for efficient stochastic optimization [83]. For SVR-based models, the radial basis function (RBF) was selected as the kernel for all models, with different values of C and σ for each model. In the case of CEEMDAN, the standard deviation of the noise was selected as 0.2, the number of realizations allowed was chosen as 500, and the maximum number of sifting iterations allowed was taken as 5000.
The values of the different parameters of CEEMDAN are taken from [57], which also explains the detailed parameter selection procedure for CEEMDAN. The selected parameters for VMD include a moderate bandwidth constraint, alpha = 2000; uniform initialization of omegas, init = 1; convergence tolerance, tol = 1e-7; and noise tolerance, tau = 0, while the number of modes, K, was chosen through correlation analysis of the frequency modes generated by CEEMDAN.
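The data split and the reported decomposition settings can be collected in one place, as in the sketch below. It assumes a chronological (unshuffled) split and 55 complete years of monthly values (660 months); both are assumptions, since the text gives only the approximate 80/20 proportions. The dictionary keys are illustrative names, not the argument names of any particular CEEMDAN or VMD library.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series into leading training and trailing testing parts, without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Settings as reported in the text; key names are hypothetical.
ceemdan_settings = {
    "noise_std": 0.2,                # standard deviation of the added noise
    "realizations": 500,             # number of noise realizations
    "max_sifting_iterations": 5000,  # cap on sifting iterations
}
vmd_settings = {
    "alpha": 2000,  # moderate bandwidth constraint
    "tau": 0,       # noise tolerance
    "init": 1,      # uniform initialization of the center frequencies
    "tol": 1e-7,    # convergence tolerance
}

# Placeholder indices standing in for 1961-2015 monthly runoff (assumed 660 values).
months = list(range(55 * 12))
train, test = chronological_split(months)
```

Keeping the test block at the end of the record means the model is always evaluated on months that come after everything it was trained on.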

Analysis.
In developing ML-based hydrological models, the selection of suitable input variables is one of the most important steps [84]. The autocorrelation function (ACF) identifies an appropriate input dataset for the model, corresponding to the runoff at the output, by applying a lag time to the original runoff time series [3,17,85]. Therefore, to determine a suitable input dataset for the hybrid model in the present study, the ACF was applied to the runoff time series with monthly time lags over one year. As evident from Figure 5, Q12 shows the highest correlation; therefore, the Q12 dataset was selected as the input for runoff prediction.
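The lag selection can be sketched as a search over lags 1 to 12 for the highest correlation between the series and its lagged copy. The sketch uses the Pearson correlation of the overlapping parts, which differs slightly in normalization from the textbook ACF estimator, and it runs on a synthetic 12-month seasonal signal rather than the Swat River data. The function names are illustrative.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def best_lag(series, max_lag=12):
    """Return the lag whose lagged copy correlates most strongly with the series."""
    scores = {}
    for lag in range(1, max_lag + 1):
        # Pair each value with the value observed `lag` steps earlier.
        scores[lag] = pearson(series[lag:], series[:-lag])
    return max(scores, key=scores.get), scores


# Synthetic seasonal signal with a 12-step period (illustrative only).
synthetic = [10.0 + 5.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(120)]
lag, scores = best_lag(synthetic)
```

For a series dominated by an annual cycle, the 12-month lag lines each month up with the same month of the previous year, which is why it tends to give the highest correlation.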
By employing CEEMDAN as a preprocessing technique, the selected runoff data series after the time lag (input signal) was decomposed into a sequence of eight independent IMFs and a residual; that is, eight quasi-stable components and one trend component were obtained from the decomposition of the nonstationary runoff data series (Figure 6). Denoising of the time series is not required since CEEMDAN has good antinoise features [43]. It is evident that the IMF1 component has the highest frequency and shows strong nonlinearity and significant fluctuations. However, the remaining IMFs (IMF2-IMF8) and the residual show stable and regular fluctuations, with the frequency decreasing gradually as the wavelength increases. The secondary decomposition of IMF1 was carried out by VMD because of the highly oscillatory fluctuations in IMF1. The trial and error method was used to select several parameters in the VMD technique [3]. The value of the K parameter can also be determined in ensemble decomposition techniques by correlation analysis [31]. In the present study, the K value was likewise obtained by correlation analysis of the intrinsic modes produced by CEEMDAN. The correlation coefficient between the input signal and the IMFs, including the residual, was calculated (Table 1). The third IMF shows a strong correlation with the input signal, and IMF3 was taken as the borderline for counting IMFs toward the value of K in VMD. IMFs 1 and 2 showed less correlation than IMF3 and were together counted as one value of K for the VMD decomposition, while the remaining IMFs 3-9, including the residual, were counted as seven values of K. Hence, we obtained K = 8 for VMD. Figure 7 shows the decomposition results of IMF1 by VMD.
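The counting rule for K described above amounts to simple arithmetic: the components before the borderline count once as a group, and every component from the borderline onward counts individually. A minimal sketch follows; the function name is hypothetical, and the rule is read from the description in the text rather than from a published algorithm.

```python
def vmd_mode_count(n_components, borderline):
    """Count K: components before the borderline count once, the rest count individually.

    n_components: total number of first-stage components (IMFs plus residual)
    borderline:   1-based index of the borderline component
    """
    grouped = 1 if borderline > 1 else 0          # e.g. IMF1 and IMF2 together
    individual = n_components - borderline + 1    # e.g. IMF3 ... IMF9
    return grouped + individual


# Nine components (eight IMFs plus the residual) with IMF3 as the borderline.
k = vmd_mode_count(n_components=9, borderline=3)
```

With nine components and IMF3 as the borderline this gives 1 + 7 = 8, matching the value of K used for VMD in this study.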
The VMD produces smoother intrinsic modes than other decomposition techniques [60], which is also verified by the decomposition result of IMF1 (Figure 7). The decomposed time series components obtained after applying CEEMDAN and VMD, along with the runoff data series, were used as inputs to SVR for training and validation. SVR was used to predict VF1-VF8; afterward, the prediction results of IMF1 were combined with those of the remaining components (IMF2-IMF8 and the residual) produced by CEEMDAN to obtain the prediction results of the runoff time series of the Swat River. The performance of the CVS model during the training and testing periods is evaluated and compared with the CEEMDAN-VMD-MLP, CEEMDAN-SVM, VMD-SVM, CEEMDAN-MLP, VMD-MLP, SVM, and MLP models to verify the effectiveness of the proposed model. The results are presented in Figures 8-14 and Tables 2 and 3.
Boxplots (Figures 12 and 13) show the quartile-based range of the predicted and original (observed) runoff, while the whiskers show the variability outside the 25th to 75th percentiles. The testing phase shows more skewness and dispersion in prediction than the training phase. From the above figures showing the training and testing results, it is evident that the CVS model simulates runoff with more stable behavior than all the other models, indicating the superior capability of the CVS model in nonlinear runoff modeling. Moreover, the CVS model mimics the runoff better than the other models in both the training and testing phases, and overall, the hybrid approach performs better than the individual models. Furthermore, the CVS model shows better prediction in the training period than in the testing period. As per the results (Tables 2 and 3) ... Figure 14.
The correlations between the original and the predicted runoffs for the standalone models (SVM and MLP) are lower than those for the hybrid models (Figure 14). The CVS model shows the highest correlation for the training (R² = 0.9856) and testing (R² = 0.9804) periods, while MLP showed the lowest correlation during the training (R² = 0.8263) and testing (R² = 0.8050) periods. The performance of standalone ... The figures and tables show that the three-stage CEEMDAN-VMD-MLP model also performs better than the two-stage models (CEEMDAN-MLP and VMD-MLP); however, its performance is inferior to that of the two-stage VMD-SVM hybrid model in error reduction and accuracy. Furthermore, all the hybrid models perform better than the standalone models with direct prediction. The results highlight that the decomposition-based ensemble models are better than standalone ML models since the decomposition approach breaks the complex input signal into simpler subcomponents that are easier to predict and analyze. It can also be concluded (Tables 2 and 3) that the VMD technique is superior to the CEEMDAN technique since the relevant VMD-based ... Figure 15 shows the performance of the different models in predicting the extreme values of observed runoff during the training and testing periods.

Extreme Value Analysis.
The three-stage hybrid models show a superior capability to predict the extreme values of runoff, compared with all the other models, during the training and testing periods. The CVS model shows the best prediction results, while the MLP model shows the poorest results in predicting the extreme values of runoff. Furthermore, the two-stage hybrid models also perform relatively better than the standalone models. All the models show better performance ...
Although runoff is a complex process to predict, all the hybrid models generally performed well in all simulations. The results support the findings of [86][87][88], according to which it is practically impossible for a single model to predict complex hydrological runoff precisely because of the effects of external factors. The superiority of the proposed hybrid approach demonstrates the viability of the decomposition and ML-coupled hybrid approach for hydrological prediction and can also provide a feasible practical reference for similar prediction tasks. The CVS model can identify the intricate nonlinear connection between the original runoff data and the prediction with the best accuracy and performance. Nevertheless, the performance of the model is highly dependent on the reliability of the hydrological time series data, the mode selection by VMD, and the hyperparameter selection for the SVM algorithm. This study deals with monthly runoff prediction by utilizing decomposition-based ML models. However, it is also essential to explore the performance of the CEEMDAN-VMD-based ML models on a daily, weekly, and annual basis for effective management of river basins, reservoir operation and planning, and allocation of water resources. Furthermore, separating the hydrological data into normal, drought, and wet periods and examining the performance of the models in each period also provides an effective approach for runoff prediction.
Figure: Comparison of the coefficient of determination.
Moreover, this study has the limitation that it considers only runoff as a predictor for runoff modeling, without considering the runoff factors (groundwater flow, surface, and subsurface factors), hydrophysical factors (infiltration, evaporation, etc.), and anthropogenic factors. Consequently, the authors suggest implementing advanced techniques in the future to address the limitations of the present study and obtain more reliable hydrological runoff studies. It is expected that this research will provide new directions for the study of hydrological time series prediction, which will be useful for the scientific and technical communities.
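The period-wise evaluation suggested above (drought, normal, and wet periods, which also isolates the extreme values) could be sketched as follows. This is only an illustrative sketch and not the procedure used in this study: the 25th and 75th percentile thresholds, the use of RMSE as the error measure, and the function names `split_by_regime` and `rmse_by_regime` are all assumptions introduced here.

```python
from math import sqrt
from statistics import quantiles


def split_by_regime(observed, low_pct=25, high_pct=75):
    """Label each month 'dry', 'normal', or 'wet' from observed runoff.

    The percentile thresholds are illustrative assumptions, not values
    taken from the study.
    """
    # quantiles(..., n=100) returns the 1st to 99th percentile cut points.
    cuts = quantiles(observed, n=100, method="inclusive")
    low, high = cuts[low_pct - 1], cuts[high_pct - 1]
    labels = []
    for flow in observed:
        if flow < low:
            labels.append("dry")
        elif flow > high:
            labels.append("wet")
        else:
            labels.append("normal")
    return labels


def rmse_by_regime(observed, predicted, labels):
    """Return the root mean square error of the predictions per regime."""
    totals = {}  # regime -> (sum of squared errors, number of months)
    for obs, pred, regime in zip(observed, predicted, labels):
        sq_sum, count = totals.get(regime, (0.0, 0))
        totals[regime] = (sq_sum + (obs - pred) ** 2, count + 1)
    return {regime: sqrt(sq / n) for regime, (sq, n) in totals.items()}
```

Applying `rmse_by_regime` to the predictions of each model, with the labels derived once from the observed series, would allow the models to be compared separately on low flows, normal flows, and high flows rather than on a single aggregate score.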

Conclusions
This paper proposes a three-stage hybrid prediction model that links the robustness of CEEMDAN-VMD with the SVM algorithm to enhance prediction accuracy and achieve the lowest prediction error for hydrological runoff time series. Five hybrid models and two standalone models were also used as benchmarks for comparison. The models were developed using the runoff data of the Swat River, Pakistan. Four statistical performance measures were employed to assess the models. Considering the prediction accuracy and the error reduction, the following can be concluded regarding runoff time series prediction:
(i) Three-stage hybrid models (CVS and CEEMDAN-VMD-MLP), which couple a two-stage signal decomposition methodology (CEEMDAN-VMD) with ML techniques (MLP and SVM), perform better than the two-stage hybrid models (CEEMDAN-SVM, VMD-SVM, CEEMDAN-MLP, and VMD-MLP) and the standalone models (MLP and SVM).
(ii) The CVS model outperformed all the other models in both the training and testing periods. The uncertainties and prediction errors associated with the proposed CVS model were lower than those of the other models, which endorses the significance of the proposed model for hydrological runoff prediction.
(iii) Two-stage hybrid models, which combine a single-stage signal decomposition methodology with ML techniques, perform better than the standalone models.
(iv) ML techniques (SVM and MLP) are applicable to runoff time series prediction, and the SVM algorithm is superior to MLP.
(v) Both signal decomposition techniques (VMD and CEEMDAN) significantly improved the prediction results, showing that both are applicable to complex, noisy, and nonstationary runoff time series. VMD performed better than CEEMDAN in all cases.
(vi) Limitations: the quantity and quality of the available data play a significant role in the prediction task, and this requirement is not easy to meet. ML techniques are sensitive to parameter and hyperparameter selection. Furthermore, ML techniques lack physical relations and concepts, which adds complexity to the structuring of ML models.
(vii) Significance and future directions: the research is important for managing a study area whose runoff exhibits higher-order trends and noise. The error criteria determine the results of the performance evaluation, and the study judged performance using well-known performance measures; the superior results indicate the suitability of the CVS model for prediction purposes. Moreover, all the hybrid models performed better than the standalone models. Therefore, hybrid models combining decomposition techniques with ML methods can play a role in future prediction studies. The financial, societal, and ecological benefits of precise runoff prediction call for further improvements; therefore, future research will consider new approaches based on deep learning models to study the nonlinear connections among runoff, temperature, climate conditions, and precipitation.
Data Availability
The runoff data of the Swat River, Pakistan, used to support the findings of this study are included within the article. The data are also available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.