A Hybrid Approach by CEEMDAN-Improved PSO-LSTM Model for Network Traffic Prediction

As an important part of data management, network traffic evaluation and prediction can not only detect network anomalies but also indicate the future trends of the network. To predict network traffic more accurately, a novel hybrid model is established that integrates Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) with a long short-term memory (LSTM) neural network optimized by the improved particle swarm optimization (IPSO) algorithm. Firstly, an LSTM prediction model that captures the real-time mutation and dependence of network traffic is constructed, and the IPSO is applied to optimize its hyperparameters. Then, CEEMDAN is introduced to decompose the raw network traffic sequence into several modal components containing different information, which reduces the complexity of the sequence. Finally, experiments demonstrate the feasibility and effectiveness of the proposed method by comparing it with other deep neural architectures and regression models. The results show that the proposed CEEMDAN-IPSO-LSTM model achieves significantly better performance, with a reduced prediction error.

characteristics of network traffic, we establish the LSTM network traffic prediction model to extract the dynamic characteristics of network traffic. Then, the IPSO is utilized for hyperparameter optimization. In addition, the CEEMDAN method is employed to decompose the network traffic data into several simplified modes. Finally, we compare the prediction accuracy of different models to evaluate the prediction performance of the CEEMDAN-IPSO-LSTM neural network model.

The main contributions of our work are presented as follows: (1) CEEMDAN is introduced to decompose network traffic data into several components, which alleviates modal confusion and avoids large impacts on the original signal when white noise is added.

The rest of this paper is organized as follows. In Section 2, we review the related research on network traffic prediction. In Section 3, we construct a network traffic prediction model based on LSTM. In Section 4, we study the hyperparameter optimization of LSTM by the Improved Particle Swarm Optimization and the denoising of network traffic data by CEEMDAN. Section 5 presents our evaluation of the proposed method. Finally, the main conclusions and future work are given in Section 6.

and realize data flow decomposition [13]. In the network traffic model based on ARMA, the deployment of the multiscale fitting process can obtain high accuracy under any expiration delay, simplify the ARMA model, and enhance the integration effect of the ARMA framework in network traffic modeling [14].
However, ARMA is not suitable for long-term network traffic data with network anomalies, because ARMA modeling presumes that the analyzed data form a stationary random process. Most actual network traffic data are nonstationary [15], but they can be transformed into stationary data by finite differencing. Therefore, some scholars proposed the Autoregressive Integrated Moving Average (ARIMA) model [16].

Nonparametric model.
A nonparametric model is a model with no fixed structure and no fixed parameters. Common nonparametric models include the Support Vector Machine (SVM), k-Nearest Neighbor (kNN) [8], and the Artificial Neural Network (ANN). A nonparametric model can automatically fit a variety of functional forms without prior assumptions and trains well, which makes it suitable for prediction on large volumes of data.
Due to the real-time variability and dependence of network traffic, traditional network traffic prediction models have disadvantages such as weak generalization ability and limited prediction accuracy. Therefore, more and more researchers use nonparametric models to predict network traffic data. The Support Vector Regression (SVR) model and its variant MK-SVR were the first to be used for network traffic prediction [17][18][19]. They effectively predict the changing trend of network traffic data but do not consider the temporal correlation of time series data, which limits their prediction accuracy.
Methods based on artificial neural networks, such as the Convolutional Neural Network (CNN) [20], improve flow classification through autonomous feature learning [21]. The LSTM neural network and the Gated Recurrent Unit (GRU) neural network outperform existing SVM and ANN models in predicting network traffic and are more suitable for random nonlinear network traffic prediction [12]. The LSTM neural network was originally used for short-term flow prediction; it can better learn the abstract representation of nonlinear flow data and capture the long-term dependence inherent in continuous data, thus improving the accuracy of flow prediction [22]. In [23], an LSTM neural network is used for network traffic prediction, and the autocorrelation coefficient is added to the model to better describe the trend of network traffic change, which improves the accuracy of the prediction model. On this basis, an improved Particle Filter (PF) algorithm is used to optimize the LSTM model, which improves the training rate and overcomes the tendency of the traditional LSTM network to converge to a local optimum [24]. Experiments with many neural network methods for predicting network traffic data show that, on a real-time network data set, LSTM performs better than the Recurrent Neural Network (RNN), the Feed-forward Neural Network (FFN), and other classic methods. In large network traffic matrix prediction, the LSTM neural network models time series and their long-term dependencies more accurately than the traditional RNN and converges faster [25]. The variants of the LSTM neural network, namely the GRU neural network and the identity-RNN (IRNN), have performance comparable to LSTM [26]. The Minimal Gated Unit (MGU) overcomes the high computing cost of the LSTM network and achieves comparable prediction performance with less model training time [27].
In addition, LSTM neural networks have achieved good prediction results in financial data forecasting [28,29], metal price prediction [30], air quality index prediction [31], modular temperature prediction [32], and bridge health monitoring [33].
In summary, a single parametric or nonparametric model has its own problems and defects, while a hybrid prediction model can overcome the shortcomings of a single model by combining two or more models. A hybrid model mainly combines decomposition algorithms, optimization algorithms, and prediction algorithms in the data preprocessing, prediction, and result correction stages of network traffic prediction, respectively. Although combinatorial prediction has achieved good results in other studies [34,35], some problems remain, such as how to choose the prediction model and its parameters, how to integrate the prediction results reasonably, and how to choose the appropriate decomposition or optimization algorithm. For network traffic prediction, using a combined prediction model while overcoming the above problems is a research direction worthy of further study.

Network Traffic Prediction Based on LSTM
3.1. LSTM Neural Network Model. The LSTM neural network (hereinafter referred to as LSTM) is an improvement of the recurrent neural network that aims to overcome the defects of the recurrent neural network in processing long-term memory [36]. The LSTM introduces the concept of the cell state, which determines which states should be preserved and which should be forgotten. The basic principle of LSTM is shown in Figure 1.
As shown in Figure 1, x_t is the input at time t, h_{t-1} is the output of the hidden layer at time t-1, and C_{t-1} is the historical information at time t-1; f, i, and o are the forgetting gate, input gate, and output gate at time t, respectively, and g is the internal hidden state, namely, the transformed new information. The LSTM learns the parameters of these gates during training. C_t is the updated historical information at time t, and h_t is the output of the hidden layer at time t.
Firstly, the input x_t at time t and the output h_{t-1} of the hidden layer are copied four times, and different weights are randomly initialized for each copy, to calculate the forgetting gate f, input gate i, and output gate o, as well as the internal hidden state g. Their calculation methods are shown in formulas (1)-(4), where W is the parameter matrix from the input layer to the hidden layer, U is the self-recurrent parameter matrix from the hidden layer to the hidden layer, b is the bias parameter matrix, and σ is the sigmoid function, which keeps the output of the three gates between 0 and 1.
Secondly, the forgetting gate f and the input gate i control how much of the historical information C_{t-1} is forgotten and how much of the new information g is saved, so as to update the internal memory cell state C_t. The calculation method is shown in formula (5).
Finally, the output gate o controls how much of the information C_t in the internal memory cell is output to the hidden state h_t, and its calculation method is shown in formula (6).
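For reference, the gate computations described above can be written in the commonly used LSTM form below. This is a sketch of the standard formulation using the symbols defined in the text, not a reproduction of the paper's formulas (1)-(6); the per-gate subscripts on W, U, and b, the use of tanh for g and for the output, and the elementwise product symbol are assumptions taken from the standard LSTM definition.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
g_t &= \tanh\left(W_g x_t + U_g h_{t-1} + b_g\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh\left(C_t\right)
\end{aligned}
```

The first four lines correspond to the three gates and the internal hidden state, the fifth to the cell state update, and the sixth to the hidden layer output.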

Network Traffic Prediction Model Based on LSTM.
Network traffic data are modeled as a nonnegative N × T matrix X, where N is the number of nodes, T is the number of sampled time slots, and each column of the matrix contains the network traffic values at the different nodes in a specific time interval. Network traffic prediction obtains the predicted value at a future time from the historical time series. X(i, j) represents the scale of the N × T flow matrix, and x_{n,t} is the network traffic value in row n and column t. Network traffic prediction is thus defined as using a series of historical network traffic data (x_{n,t-1}, x_{n,t-2}, x_{n,t-3}, ..., x_{n,t-l}), where l is the number of historical time slots used, to predict the network traffic at the future time t. In the LSTM-based network traffic prediction model (Figure 2), to predict the network traffic of a certain node in time slot t, the input of the model is (x_{n,t-1}, x_{n,t-2}, x_{n,t-3}, ..., x_{n,t-l}), and the output is the predicted value x_t of the network traffic of that node in time slot t.
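The mapping from a historical series to model inputs and targets can be illustrated with a short sliding-window sketch. The function name `make_windows`, the parameter name `look_back`, and the sample values are illustrative assumptions; the windows here are kept in chronological order, whereas the text lists the history from the most recent value backwards.

```python
def make_windows(series, look_back):
    """Turn a 1-D traffic series into (history window, next value) pairs."""
    inputs, targets = [], []
    for t in range(look_back, len(series)):
        inputs.append(series[t - look_back:t])  # the look_back values before slot t
        targets.append(series[t])               # the value to predict in slot t
    return inputs, targets


# Hypothetical traffic values for a single node.
traffic = [10, 12, 15, 11, 9, 14]
X, y = make_windows(traffic, look_back=3)
```

Each row of `X` would be one input sequence for the LSTM, and the matching entry of `y` is the value it is trained to predict.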
In Figure 2, we summarize the process of network traffic prediction based on LSTM. It mainly includes network traffic data preparation, data preprocessing (data resampling and null filling), data normalization, data classification, prediction network building, network compilation, network evaluation, data prediction, and evaluation. The detailed contents of each process are as follows: (1) Network traffic data preparation and preprocessing.
To meet the time and frequency requirements (second, minute, hour, day, etc.) of network traffic data prediction, the original data need to be resampled, namely, the time series is converted from one frequency to another. To ensure an even time interval, data with uneven time intervals are converted to data with equal intervals. There are generally two methods of data resampling: downsampling and upsampling. The former converts high-frequency data into low-frequency data, while the latter converts low-frequency data into high-frequency data. In addition, if there are missing values in the resampled data sequence, they need to be filled. The commonly used methods include direct deletion, statistically based filling, and machine-learning-based filling. Direct deletion may discard important information in the data, and statistically based filling ignores the timing information of the data [37]. Therefore, this paper adopts a machine-learning-based filling method, K-Nearest Neighbor (KNN), to fill the missing values of network traffic data.
(2) Data normalization. The range standardization method is used to process the network traffic data so that the sample data values lie between 0 and 1. The calculation method of the range standardization method is shown in formula (7),
where X_max represents the maximum value of the network traffic data and X_min represents the minimum value of the network traffic data.

(3) Data division and validation. While keeping the time order of the network traffic data unchanged, the data are divided into a training set and a test set by fivefold cross-validation [38]; these sets are used for the training and prediction of the LSTM network traffic prediction model.
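Range standardization as described above is commonly written as x' = (x - X_min) / (X_max - X_min). The sketch below applies that form to a small list; the function names and sample values are illustrative assumptions, and the handling of a constant series (all zeros) is a choice made here rather than something the text specifies.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] using the minimum and maximum of the data."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Constant series: every value maps to 0.0 (assumption of this sketch).
        return [0.0 for _ in values]
    return [(v - x_min) / span for v in values]


def min_max_denormalize(scaled, x_min, x_max):
    """Map normalized values back to the original traffic scale."""
    return [s * (x_max - x_min) + x_min for s in scaled]


# Hypothetical traffic values.
traffic = [200.0, 450.0, 300.0, 700.0]
scaled = min_max_normalize(traffic)
restored = min_max_denormalize(scaled, min(traffic), max(traffic))
```

The inverse mapping is needed after prediction so that errors can be reported on the original traffic scale.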

Improved Particle Swarm Optimization. Particle Swarm Optimization (PSO) is a swarm intelligence optimization algorithm with simple rules and fast convergence [39,40]. It regards every individual as a particle with no volume and no mass that flies at a certain speed in an n-dimensional search space. The search is improved through cooperation and competition among the particles under the guidance of swarm intelligence. In an n-dimensional continuous search space, for the i-th (i = 1, 2, ..., m) particle, the n-dimensional position vector x_i = [x_i^1, x_i^2, ..., x_i^n]^T represents the current position of the particle in the search space, and the n-dimensional velocity vector v_i = [v_i^1, v_i^2, ..., v_i^n]^T represents the search direction of the particle. The optimal position experienced by the i-th particle (pbest) is denoted as p_i, and the optimal position experienced by the whole swarm (gbest) is denoted as p_g. The velocity and position of each particle are updated according to formulas (8) and (9).
In the PSO algorithm, the inertia weight ω preserves the inertia of the particle's motion, so that the particle tends to expand the search space and is able to explore new areas.
The ω value usually adopts the linear inertia weight method, that is, the ω value increases or decreases linearly with the number of iterations. Compared with a fixed ω value, the linear method improves the optimization ability and convergence speed of the PSO algorithm to some extent, but not sufficiently. The nonlinear inertia weight method can further improve the optimization ability and convergence speed of the PSO algorithm [41]. Therefore, the ω calculation in this paper is improved by using the nonlinear inertia weight method, as shown in formula (10).
In formula (10), ω_max and ω_min represent the maximum and minimum inertia weights, respectively, i is the current iteration number, and item_max is the maximum number of iterations.
In the PSO algorithm, c_1 and c_2 are used to adjust the step size of particle movement. In this paper, the sine function is used to improve the acceleration constants [29]. The calculation method is shown in formulas (11) and (12).
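To show where ω, c_1, and c_2 act, the standard PSO velocity and position update is given below. This is the textbook form and is offered as a sketch, not as a reproduction of the paper's formulas (8) and (9); the iteration superscript k and the random factors r_1 and r_2, drawn uniformly from [0, 1], are notational assumptions, while p_i and p_g denote the personal best and global best positions.

```latex
\begin{aligned}
v_i^{k+1} &= \omega\, v_i^{k} + c_1 r_1 \left(p_i - x_i^{k}\right) + c_2 r_2 \left(p_g - x_i^{k}\right) \\
x_i^{k+1} &= x_i^{k} + v_i^{k+1}
\end{aligned}
```

A larger ω keeps more of the previous velocity and favors exploration, while c_1 and c_2 scale the pull toward the personal and global best positions.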

LSTM Hyperparameter Optimization Based on Improved PSO. The selection of hyperparameters for the LSTM prediction model has an important influence on prediction accuracy. The current practice of selecting hyperparameters empirically is random, blind, and not generally applicable. Therefore, multiple hyperparameters are formed into a multidimensional solution space, and the optimal parameter combination is obtained by searching the solution space, which reduces the randomness and blindness of parameter selection. Because the hyperparameters range over a large space, an optimization algorithm with better performance is needed to obtain the global optimal solution quickly, so we introduce the improved particle swarm optimization algorithm (Improved PSO, IPSO) to optimize the LSTM model parameters. With its quick convergence, the IPSO makes the selection of model parameters more systematic and further improves the prediction accuracy of the model.
It is assumed that n hyperparameters of the LSTM network traffic prediction model are optimized, and each particle represents one set of hyperparameters in the solution space. Suppose that in the n-dimensional continuous search space there are m groups of hyperparameter combinations, and consider the i-th (i = 1, 2, ..., m) group. The n-dimensional current position vector x_i = [x_i^1, x_i^2, ..., x_i^n]^T represents the current values of this group of hyperparameters, and the n-dimensional velocity vector v_i = [v_i^1, v_i^2, ..., v_i^n]^T represents the search direction of this group of hyperparameters. The goal of network traffic prediction is to make the predicted value close to the actual value, that is, to make the error between the predicted value and the actual value as small as possible. Therefore, the Root Mean Square Error (RMSE) on the training data of the network traffic prediction model is selected as the objective function. Let fitness = RMSE; the objective is then to minimize RMSE. The RMSE calculation method is shown in formula (13).
Two important hyperparameters of the LSTM network traffic prediction model are optimized by IPSO: the time step size and the number of neurons in each layer. Single-layer and bilayer LSTM models are taken as the research objects for hyperparameter optimization. For the single-layer LSTM model, node denotes the number of neurons and lookback denotes the time step, so fitness = RMSE(node, lookback); for the bilayer LSTM model, fitness = RMSE(node1, node2, lookback).
According to the algorithm flow of IPSO, the process of IPSO optimized LSTM network traffic prediction model hyperparameter mainly includes six steps.
Step 1. The IPSO parameters are set. The particle swarm size is set to the number of hyperparameter combinations m. For each particle, the initial value and speed of the corresponding group of hyperparameters are set randomly within the allowed range. The maximum number of iterations item_max and the prediction error Pre_error are also set.
Step 2. The fitness of each particle is evaluated, that is, the fitness value of the objective function of each group of hyperparameters is calculated.
Step 3. The optimal objective function value P_i for each set of hyperparameters is set. For the i-th group of hyperparameters, its current objective function value current_fitness is compared with P_i. If it is less than P_i, then current_fitness is used as the best objective function value P_i for the i-th group of hyperparameters, namely, P_i = current_fitness.
Step 4. The global optimal value P_g is set. For the i-th group of hyperparameters, P_i is compared with P_g. If it is less than P_g, then P_i is taken as the current global optimal value P_g, namely, P_g = P_i.
Step 5. The search direction and value of each set of hyperparameters are updated according to formulas (8) and (9).
Step 6. The termination conditions are checked. If the set condition (the preset error or the maximum number of iterations) is not met, the procedure returns to Step 2 and continues.
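The six steps can be sketched as a small particle swarm loop. This is a minimal illustration under stated assumptions, not the paper's IPSO: it uses the standard velocity and position update, a linearly decreasing inertia weight and constant c1 and c2 in place of the nonlinear weight and sine-based constants of formulas (10)-(12), and a hypothetical quadratic `toy_fitness` standing in for the RMSE of a trained LSTM. All function and parameter names are illustrative.

```python
import random


def pso_minimize(fitness, bounds, swarm_size=10, iter_max=30, pre_error=1e-6,
                 w_max=0.9, w_min=0.4, c1=2.0, c2=2.0, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: random initial values and speeds within the allowed range.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_fit = [float("inf")] * swarm_size
    gbest, gbest_fit = pos[0][:], float("inf")
    for k in range(iter_max):
        # Steps 2 and 3: evaluate each particle and update its personal best.
        for i in range(swarm_size):
            current_fitness = fitness(pos[i])
            if current_fitness < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], current_fitness
        # Step 4: update the global best.
        for i in range(swarm_size):
            if pbest_fit[i] < gbest_fit:
                gbest, gbest_fit = pbest[i][:], pbest_fit[i]
        # Step 6: stop early once the preset error is reached.
        if gbest_fit <= pre_error:
            break
        # Step 5: update velocity and position (linear inertia weight, assumed).
        w = w_max - (w_max - w_min) * k / max(iter_max - 1, 1)
        for i in range(swarm_size):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # keep in range
    return gbest, gbest_fit


def toy_fitness(params):
    """Hypothetical stand-in for RMSE(node, lookback); minimum at (8, 1)."""
    node, lookback = params
    return (node - 8) ** 2 + (lookback - 1) ** 2


best, best_fit = pso_minimize(toy_fitness, bounds=[(1, 64), (1, 10)])
```

In an actual hyperparameter search, the fitness function would round each position to integers, train the LSTM with those values, and return the resulting RMSE, which makes every evaluation far more expensive than in this toy case.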

Network Traffic Data Decomposition by CEEMDAN.
The empirical mode decomposition (EMD) algorithm is a data processing method commonly used for nonstationary time series signals [42]. It can decompose a nonstationary signal into a series of intrinsic mode function (IMF) components with different time scales. However, modal confusion exists in this method. The Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) algorithm improves the EMD algorithm by adding a set of white noise with equal size and opposite signs before decomposing the data via EMD [43]. CEEMDAN both alleviates modal confusion and avoids large impacts on the original signal when the white noise is added. The main steps of CEEMDAN are as follows:
(1) Add a group of Gaussian white noise sequences ε_i(t) with opposite signs to the original sequence x(t), and obtain a new set of time series;
(2) Decompose each time series via EMD as in formula (15) and obtain n intrinsic mode function components, where c_ij is the j-th modal component obtained by EMD decomposition after adding white noise for the i-th time;
(3) Add different adaptive noises and repeat the operations of formulas (14) and (15) m times to obtain m groups of intrinsic modal components (IMF), in which the last component is the trend term (Res);
(4) Calculate the ensemble average of all components to obtain the final modal component group c_i(t).
The process of network traffic prediction based on IPSO-LSTM combined with CEEMDAN is shown in Figure 3. The process of data decomposition and prediction includes three main steps.

Network Traffic Prediction Algorithm Based on CEEMDAN-IPSO-LSTM.
Combining the IPSO hyperparameter optimization process and the CEEMDAN data decomposition with the LSTM network traffic prediction steps yields the CEEMDAN-IPSO-LSTM network traffic prediction algorithm. The pseudo-code of the algorithm is shown in Algorithm 1. Algorithm 1 first prepares the network traffic data and decomposes the raw data into several subsequences, and then divides each subsequence into a training set and a test set. Then, it uses the IPSO-LSTM network traffic model to obtain the optimal parameter combination. Finally, the optimal parameters are substituted into the LSTM model to complete the prediction of each subsequence, and the network traffic prediction result is output by superposing the subsequence prediction results.
The CEEMDAN-IPSO-LSTM network traffic prediction algorithm contains three processes: the time complexity of data decomposition is O(k^2), where k is the size of the predicted data set; the time complexity of the hyperparameter optimization process is O(n!); and the time complexity of the ...

Algorithm 1: CEEMDAN-IPSO-LSTM network traffic prediction algorithm.
(1) Network traffic data preparation and preprocessing
(2) Decompose the raw data into several different modal components and obtain the subsequences IMF1, IMF2, IMF3, ..., IMFn
(3) Divide each subsequence into a training set and a test set
(4) Construct the LSTM network traffic prediction model. Set partial parameters and fix the number n of the optimized parameters
(5) IPSO parameter initialization (particle swarm size m, solving space dimension d, the maximum number of iterations iter_max, learning factors φ1, φ2, weight ω)
(6) Initialize the values of the n-dimensional parameter combinations of m groups randomly in the solution space
(7) Initialize the global optimal parameter combination gbest_parameters, the partial optimal parameter combination pbest_parameters, and the best fitness function value Pg
(8) While the end condition is False
(9)     Apply the n-dimensional parameter combinations of m groups, respectively, to the LSTM network traffic prediction model for training, and calculate the current fitness function value;
(10)    Get the current best fitness value Pi and the corresponding parameter combination pbest_parameters;
(11)    if Pi < Pg;
(12)        Pg = Pi; //Update the best fitness value
(13)        gbest_parameters = pbest_parameters; //Update the global optimal parameter combination
(14)    end if;
(15)    for each parameter combination
(16)        Calculate the search direction and position of the new parameter combination according to equations (8) and (9)
(17)        Fix the updated parameter in the selected values;
(18)    end for;
(19) ...

In the running process of the algorithm, the parameter optimization process consumes the most time and has the highest computational complexity, but its time cost is acceptable because this process needs to run only once to obtain the optimal combination of hyperparameters. Once the hyperparameters are determined, the main time complexity lies in the prediction process. The time of the prediction process is mainly spent in training. Once the training is completed, the prediction can be finished by substituting the input data into the trained model.
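The overall structure of the algorithm, decomposing the series, predicting each subsequence, and superposing the results, can be sketched as follows. Everything in this block is a stand-in chosen for brevity: `stand_in_decompose` is a simple moving-average split used in place of CEEMDAN, and `persistence_forecast` repeats the last value in place of a trained IPSO-LSTM model. Only the decompose, predict, and sum structure reflects the text.

```python
def stand_in_decompose(series, window=3):
    """Split a series into [detail, trend]; the trend is a trailing moving average.

    Stand-in for CEEMDAN only: the two components sum back to the original.
    """
    trend = []
    for t in range(len(series)):
        chunk = series[max(0, t - window + 1):t + 1]
        trend.append(sum(chunk) / len(chunk))
    detail = [x - m for x, m in zip(series, trend)]
    return [detail, trend]


def persistence_forecast(component):
    """Stand-in for a trained per-component model: repeat the last value."""
    return component[-1]


def hybrid_forecast(series, decompose, forecast):
    """Decompose, forecast each component, and superpose the forecasts."""
    components = decompose(series)
    per_component = [forecast(c) for c in components]
    return sum(per_component), per_component


# Hypothetical traffic values.
traffic = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0]
prediction, parts = hybrid_forecast(traffic, stand_in_decompose, persistence_forecast)
```

Because the final prediction is a plain sum, each component can be given its own model and hyperparameters, at the cost of training one model per component.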

Experimental Environment Configuration and Parameter Setting.
This experiment uses the measured flow data BC-Oct89Ext provided by Bell Laboratory. The flow data are Ethernet data captured in the Bell Morristown study and contain one million packets. This paper selects some data segments of the BC-Oct89Ext flow data for model analysis.
For the prediction results of the network model, three error analysis indicators were used to verify the prediction accuracy, which were Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), respectively. MAE and MAPE calculation methods are shown in equations (18) and (19).
According to equation (13), the smaller the RMSE value, the smaller the average error between the prediction results and the actual data, the higher the prediction accuracy of the model, and the better its prediction performance. Similarly, it can be seen from equations (18) and (19) that the closer the MAE and MAPE values are to 0, the better the prediction effect of the model. Conversely, the greater the values, the greater the error and the worse the prediction effect of the model.
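The three indicators can be computed as in the sketch below, which uses their usual definitions. Whether the paper reports MAPE as a percentage or as a fraction is not stated in this section, so the factor of 100 is an assumption; the sample values are hypothetical, and MAPE is undefined when an actual value is zero.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean Absolute Error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (assumes no actual value is 0)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical actual and predicted traffic values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = {
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

RMSE penalizes large individual errors more heavily than MAE, while MAPE is scale-free, which is why the rankings of two models can differ between indicators.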

Data Processing
(1) Data resampling. As the original network traffic data in BC-Oct89Ext were collected multiple times per second with unequal time intervals, the data collected within each second were aggregated with the mean value method, and then the K-Nearest Neighbor (KNN) algorithm was used to fill the missing values. Figure 4 shows 1800 flow data points after resampling and missing-value processing.
(2) Data decomposition. It can be seen from Figure 4 that the network traffic data have obvious nonlinearity and nonstationarity, which makes prediction difficult. The original time series is therefore decomposed by the CEEMDAN method into several more predictable time subseries, and six groups of modal subsequences are obtained, ordered from high frequency to low frequency. The decomposition results are shown in Figure 5. It can be seen that, from IMF1 to Res, the fluctuation of the subsequences gradually flattens out and the frequency becomes lower and lower.
(3) Data division. The normalized data were divided into a training set and a test set according to simple cross-validation. The first 80% of the data were used as training data for LSTM network model training. The remaining 20% of the data were used as prediction data to verify the efficiency of the model.
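The 80/20 division in time order can be sketched as below. The function name and the use of an integer percentage are illustrative choices; the integers 0 to 1799 stand in for the 1800 resampled data points.

```python
def chronological_split(series, train_percent=80):
    """Split a series in time order: the earliest part trains, the rest tests."""
    cut = len(series) * train_percent // 100
    return series[:cut], series[cut:]


# Placeholder for the 1800 resampled traffic values.
samples = list(range(1800))
train, test = chronological_split(samples)
```

With 1800 points this yields 1440 training samples and 360 test samples, and no test sample precedes any training sample in time.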

Network Traffic Prediction Based on Basic LSTM
(1) Network definition. In this forecast, two network structures are adopted: a three-layer LSTM (one input layer, one hidden layer, and one output layer) and a four-layer LSTM (one input layer, two hidden layers, and one output layer). The specific connection mode of the three-layer LSTM is as follows: the timesteps of the LSTM in the first layer are 1, the input data dimension is 3, and the number of neurons is 64. The second layer, a hidden (dense) layer, takes the output of the first-layer LSTM as input; the third layer, the output layer, takes the output of the second hidden layer as input and connects to a fully connected layer. A one-dimensional vector of length 360 output from the fully connected layer is the final output result, which represents the predicted values of the next 360 data points. To prevent overfitting, a dropout layer was added between the first layer and the hidden layer for regularization. After many tests in this experiment, it was concluded that the training set had the highest accuracy when the dropout is 0.3.
Compared with the three-layer LSTM network, the four-layer LSTM network structure has one additional hidden layer. This hidden layer uses the results of the first layer as input for training and transmits its output to the next hidden layer. Its number of neurons is the same as that of the first layer. A dropout layer with dropout = 0.3 is added after both the first and second layers to prevent overfitting.
(2) Network compilation. LSTM network compilation uses the adaptive moment estimation (Adam) algorithm as the optimizer and the mean square error loss function as the objective function.
(3) Network fitting. The LSTM network was trained on 1440 samples, and 360 samples were used for testing. The number of training epochs is 50, look_back is set to 1, 5, and 10, respectively, and batch_size is 128.
(4) Network evaluation. When look_back takes 1, 5, and 10, respectively, and the number of hidden layers (LN) is 1 and 2, respectively, the loss data of the model training process is shown in Figure 6.
It can be seen from Table 1 ... and double-layer network, respectively, and the corresponding error values were calculated. The test results are shown in Figure 8.
It can be seen from Figure 8 that the number of hidden layers and the time step have a great impact on the fitting effect of LSTM. When a hidden layer is added, the prediction error changes, but whether it increases or decreases is not consistent across timesteps. When the time step is changed, that is, when the look_back value goes from small to large, the trend of the prediction error is also not consistent. For example, when the look_back value changes from 5 to 10, the prediction error of the single-layer LSTM model decreases, while the prediction error of the double-layer LSTM model increases. Therefore, for network traffic data, the prediction effect of a parameter combination set by the empirical method is unstable and cannot achieve the optimal prediction performance. Hence, the Improved Particle Swarm Optimization (IPSO) is adopted for model optimization, that is, the intelligent algorithm is used to efficiently obtain the parameter combination with the optimal prediction effect.

Parameter Optimization of LSTM Network Traffic Prediction Model Based on IPSO.
The IPSO algorithm was used to optimize the LSTM network traffic prediction model, and parameters were optimized for the single-layer LSTM and the double-layer LSTM, respectively. The change of the fitness value of the LSTM prediction model with the number of iterations during the experiment is shown in Figure 9.
It can be seen from Figure 9 that the final convergence value of fitness12 is smaller than those of fitness23 and fitness22, that fitness12 converges faster than fitness23, and that the final convergence value of fitness22 is only slightly smaller than that of fitness23. This shows that, for the long-term prediction of network traffic data, the single-hidden-layer LSTM optimized by the particle swarm algorithm reaches a slightly smaller fitness value than the two-hidden-layer LSTM optimized by the particle swarm algorithm and converges faster. Compared with setting the LSTM parameters by the empirical method, setting them by IPSO reduces the RMSE by 20%, which means that the IPSO algorithm can effectively find the optimal parameter combination for LSTM network traffic prediction and reduce the prediction error.
In addition, Figure 10 shows the changes in the node number and the time step size during the optimization of the IPSO-LSTM12 model, that is, the process by which the improved PSO algorithm determines the optimal parameter values of the LSTM network traffic model.
It can be seen from Figure 10 that the optimal parameters of the LSTM12 model are node1 = 8 and look_back = 1. Therefore, for the network traffic data used in this paper, the optimal configuration of the single-layer LSTM model is to set the number of neurons to 8 and the time step to 1.
The changes in the node number and the time step size during the optimization of the IPSO-LSTM23 model are shown in Figure 11.
It can be seen from Figure 11 that the optimal parameters of the LSTM23 model are node1 = 16, node2 = 4, and look_back = 1. Therefore, for the network traffic data used in this paper, the optimal configuration of the two-layer LSTM model is to set the number of neurons in the first layer to 16, the number of neurons in the second layer to 4, and the time step length to 1.
To evaluate the prediction performance of the LSTM model after parameter optimization by IPSO, network traffic data samples at 180 time points are used for verification. In this paper, four models are compared: the IPSO-optimized single-layer LSTM (IPSO-LSTM12), the two-parameter optimization model of the double-layer LSTM (IPSO-LSTM22), the three-parameter optimization model of the double-layer LSTM without dropout in training (IPSO-LSTM23-1), and the three-parameter optimization model of the double-layer LSTM with dropout in training (IPSO-LSTM23-2). Figure 12 shows the model prediction results for the last 180 test data points.
It can be seen from Figure 12 that the LSTM models with different parameter combinations all fit the data well, and that the single-layer dual-parameter optimization model IPSO-LSTM12 predicts better than the models with other parameter configurations. To compare the predictive performance of the four models more clearly, the predictive performance evaluation index values of the four models in Figure 12 are computed, and the results are shown in Table 2.
As can be seen from Table 2, compared with the single-layer LSTM12, the two-layer LSTM22 has slightly lower RMSE and MAE, while its MAPE is slightly higher. If only the RMSE or MAE evaluation index is considered, LSTM22 is better than LSTM12; if only the MAPE evaluation index is considered, LSTM12 is better than LSTM22. On the whole, the prediction errors of LSTM12 and LSTM22 are less than those of the other three prediction models, that is, the prediction effect of LSTM12 and LSTM22 on network traffic data is better than that of the other three models. The prediction error of LSTM23-2 is less than that of LSTM23-1, which indicates that optimizing the dropout parameter added in training reduces the prediction error of the model and improves its prediction performance.
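The disagreement between RMSE/MAE and MAPE can be reproduced with a small worked example. The sketch below uses the standard definitions of the three indices, which are assumed to match those used in this paper, and hypothetical traffic values that are not taken from Table 2. MAPE weights each error by the actual value, so small absolute errors on low-traffic points can dominate it.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values for illustration only.
actual = [10.0, 100.0, 200.0]
model_a = [10.0, 100.0, 188.0]   # one larger miss on the high-traffic point
model_b = [13.0, 103.0, 203.0]   # small misses everywhere, including the low-traffic point

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(rmse(actual, pred), 3), round(mae(actual, pred), 3),
          round(mape(actual, pred), 3))
```

Here model_b has the lower RMSE and MAE, while model_a has the lower MAPE, which mirrors the kind of split ranking described for LSTM22 and LSTM12.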

Network Traffic Prediction Based on CEEMDAN-IPSO-LSTM.
Through testing on a data set of 500 time points, the predictive performance of each IMF is shown in Figure 13. Figure 13 shows the prediction results and training loss of the eight IMFs; overall, the prediction effect is good. The predictions for IMF0 and IMF7 are somewhat poorer: their training loss stays high throughout the training process, and the loss of IMF0 is especially large. For the remaining IMFs, the LSTM predicts well. Despite this problem, the overall results are excellent once the predictions are integrated. After all IMFs have been predicted, the final prediction result is obtained by superimposing the predicted results of the individual IMFs. Figure 14 shows the forecasting flowchart of CEEMDAN-IPSO-LSTM.
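The "predict each component, then superimpose" step can be sketched as follows. This is only an illustration of the recombination logic under stated assumptions: the three hand-made components stand in for IMFs produced by CEEMDAN, and the naive `persistence_forecast` stands in for the per-IMF LSTM predictor. Neither is the method of this paper.

```python
import math


def persistence_forecast(component):
    """Naive one-step-ahead forecast: predict each value as the previous one.

    Returns forecasts for component[1:], so the result has one element fewer.
    """
    return component[:-1]


def combine(component_values):
    """Superimpose components (or their forecasts) by point-wise summation."""
    return [sum(values) for values in zip(*component_values)]


n = 50
# Hand-made additive components (assumption); real IMFs would come from CEEMDAN.
components = [
    [math.sin(2 * math.pi * t / 5) for t in range(n)],         # fast oscillation
    [2.0 * math.sin(2 * math.pi * t / 25) for t in range(n)],  # slow oscillation
    [0.1 * t for t in range(n)],                               # residue (trend)
]

series = combine(components)                        # the components sum to the signal
forecasts = [persistence_forecast(c) for c in components]
final_forecast = combine(forecasts)                 # integrated prediction for series[1:]
print(len(final_forecast))
```

Because the final prediction is a plain sum, a poorly predicted component contributes its error directly to the total, scaled only by that component's own amplitude.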

Result Analysis.
To evaluate the prediction effect of the proposed hybrid method CEEMDAN-IPSO-LSTM, it is compared with other neural network prediction methods, namely CEEMDAN-LSTM, IPSO-LSTM, and LSTM, and with other predictive models, namely ARIMA, Support Vector Regression (SVR), Decision Tree Regressor (DTR), and Multivariate Linear Regression (MLR). Similarly, the network traffic data samples at 180 time points are used for verification, and the prediction results of the eight models are shown in Figure 15. Figure 15 shows the prediction effects of the different models; the hybrid prediction model has the better fitting effect, which indicates that the prediction results of the CEEMDAN-IPSO-LSTM model are better than those of the other models. To compare the prediction performance of the eight models more clearly, their predictive performance evaluation index values are computed and shown in Table 3.
It can be seen from Table 3 that the prediction errors of the LSTM-based models are all less than those of the regression prediction models, which indicates that the LSTM network traffic prediction model has a better prediction effect than the regression-based network traffic prediction models. In other words, the LSTM is more suitable for long-term network traffic data prediction and for handling the real-time variability of network traffic data. In addition, the RMSE, MAE, and MAPE index values of the CEEMDAN-IPSO-LSTM prediction model are all smaller than those of the other neural network prediction models, indicating that the proposed hybrid model CEEMDAN-IPSO-LSTM is better than the other prediction models in network traffic prediction.
Besides, we compare the decomposition methods EMD, EEMD, and CEEMDAN. First, using network flow data at 500 time points, we decompose the original data into several IMFs and compare the decomposition results of the three methods. Then, we make predictions with LSTM combined with each of the three decomposition methods to determine which method works better.
In Figure 16, there are seven IMFs for EMD, eight for EEMD, and eight for CEEMDAN, including the residue. From the decompositions alone, we know only that different decomposition results lead to different prediction accuracy; it is hard to see which one produces the better prediction. So, we make predictions with LSTM combined with each of the three decomposition methods, and the results are shown in Figure 17. In Figure 17, the red lines fit the raw data more closely, which shows that the predicted result of CEEMDAN-LSTM is closer to the real value. To further verify the effect of the different decomposition methods, Table 4 gives the prediction errors of CEEMDAN-LSTM, EEMD-LSTM, and EMD-LSTM. In Table 4, the prediction error of CEEMDAN-LSTM is significantly less than that of the other two methods, which indicates that CEEMDAN decomposes the data more effectively so that the LSTM can predict better. That is to say, the results verify the superiority of CEEMDAN for data decomposition.
Also, based on the same network flow data at 500 time points, we compare CEEMDAN-IPSO-LSTM with three state-of-the-art prediction models, ST-LSTM, SA-ARIMA-BPNN, and INGARCH, to verify the effectiveness of the proposed network traffic prediction model. The predictions of the four methods for the last 100 time points are shown in Figure 18.
In Figure 18, the four prediction methods all do a good job of forecasting network traffic. Figure 18 shows that the purple and green lines match the raw data, represented by the blue line, more closely, which demonstrates that the proposed method and SA-ARIMA-BPNN produce predictions closer to reality. To compare the prediction accuracy of the four methods more clearly, their prediction errors are calculated in the same way and shown in Table 5.
Consistent with Figure 18, Table 5 shows that CEEMDAN-IPSO-LSTM has the lowest prediction error. Together, Figure 18 and Table 5 once again support the superiority of the CEEMDAN-IPSO-LSTM proposed in this paper.
In summary, the CEEMDAN-IPSO-LSTM has a better prediction effect and higher reliability for the future prediction of network traffic.

Conclusion and Future Work
Network traffic prediction can be applied to network resource optimization and network congestion avoidance, which is of great significance for network business planning, data management, fault detection, resource allocation, and other operations. In this paper, a hybrid deep interval prediction model has been proposed for network traffic forecasting to improve the prediction accuracy. First, the nonparametric LSTM neural network is used to establish the network traffic prediction model, and the Improved Particle Swarm Optimization algorithm is used to optimize the hyperparameters of the established LSTM prediction model. This yields the optimized LSTM network prediction models (IPSO-LSTM12, IPSO-LSTM23, and IPSO-LSTM32), which reduce the RMSE by 20% compared with the experience-based LSTM. Besides, the prediction performance of the single-layer LSTM is better than that of the double-layer LSTM in network traffic prediction. Then, CEEMDAN is introduced to decompose the network traffic time series into different modes to reduce the complexity of the network traffic sequence. To verify the effectiveness of the proposed models, the proposed CEEMDAN-IPSO-LSTM model is applied to network traffic prediction and compared with other neural network prediction methods and regression methods. The experimental results show that, compared with other prediction models and the traditional LSTM model, the CEEMDAN-IPSO-LSTM model reduces the prediction error and obtains a better fitting effect, which demonstrates that the proposed hybrid method improves network traffic prediction accuracy.
In future work, we plan to enhance the prediction model in two respects to further improve the prediction accuracy of network traffic. On the one hand, in the data preprocessing stage, we will try other data decomposition methods, such as Variational Mode Decomposition (VMD), wavelet packet decomposition, and combined methods, to improve the stability and regularity of network traffic data decomposition. On the other hand, we will focus more on the error correction strategy of the hybrid network traffic forecasting model, such as analyzing different error correction strategies or re-decomposing the IMF data, to enhance the prediction performance.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest.