A Network Traffic Prediction Model Based on Quantum-Behaved Particle Swarm Optimization Algorithm and Fuzzy Wavelet Neural Network

Because the fluctuation of network traffic is affected by many factors, accurate prediction of network traffic is a challenging time series prediction task. To address it, this paper proposes a novel network traffic prediction method based on the quantum-behaved particle swarm optimization (QPSO) algorithm and a fuzzy wavelet neural network (FWNN). First, QPSO is introduced. Then, the structure and operation algorithms of the FWNN are presented, and the parameters of the FWNN are optimized by the QPSO algorithm. Finally, the resulting QPSO-FWNN is applied to network traffic prediction in simulation and compared with other prediction models, namely, the BP neural network, RBF neural network, fuzzy neural network, and FWNN-GA neural network. Simulation results show that QPSO-FWNN achieves better precision and stability. It also has better generalization ability and broad application prospects.


Introduction
With the rapid development of computer network technology, network applications have reached every corner of human society and play an important role in various industries and situations. As network topologies become increasingly complicated, network emergencies and congestion are becoming more and more serious. Monitoring and accurately predicting network traffic can prevent network congestion and effectively improve the utilization rate of the network [1].
In general, network traffic data is a kind of time series data, and the problem of network traffic prediction is to forecast future network traffic rate variations as precisely as possible based on the measured history. Traditional prediction models, such as the Markov model [2], the ARMA (Autoregressive Moving Average) model [3], the ARIMA (Autoregressive Integrated Moving Average) model [4], and the FARIMA (Fractional Autoregressive Integrated Moving Average) model [5], have been proposed. As network traffic is affected by many factors, network traffic time series show pronounced multiscale, long-range dependent, and nonlinear characteristics. The methods mentioned above are inefficient for such series [6].
An artificial neural network (ANN) is an analysis paradigm that is roughly modeled after the massively parallel structure of the brain. ANNs can be thought of as "black box" devices that accept inputs and produce outputs, and in theory they perform better in dealing with nonlinear relationships between the input and the output [7]. Although ANNs have been successfully used for modeling complex nonlinear systems and predicting signals in a wide range of engineering applications, they have limited ability to characterize local features, such as discontinuities in curvature, jumps in value, or other edges [8]. These local features, which are localized in time and/or frequency, typically embody important process-critical information such as aberrant process modes or faults.
Fuzzy neural networks (FNNs) are hybrid systems that combine the advantages of fuzzy systems and artificial neural networks. The FNN possesses the low-level learning and computational power of neural networks and the high-level human knowledge representation and reasoning of fuzzy theory [9]. A fuzzy wavelet neural network (FWNN) is a new network structure that combines wavelet theory with fuzzy logic and neural networks. The synthesis of a fuzzy wavelet neural inference system includes the determination of the optimal definitions of the premise and the consequent part of fuzzy IF-THEN rules [10]. However, many fuzzy neural network models, including the FWNN, have common problems derived from their fundamental algorithm [11]. For example, the design process for FNN and FWNN combines tapped delays with the backpropagation (BP) algorithm to solve dynamic mapping problems [12]. Unfortunately, the BP training algorithm has some inherent defects [13,14], such as low learning speed, the existence of local minima, and difficulty in choosing the proper network size for a given problem. In addition, in systems that employ basic fuzzy inference theory, the degree of each rule becomes extremely small and often underflows when the dimension of the task is large. In such a situation, learning and inference cannot be carried out correctly.
As a variant of PSO, quantum-behaved particle swarm optimization (QPSO) is a novel optimization algorithm inspired by the fundamental theory of particle swarms and by features of quantum mechanics, such as the use of the Schrödinger equation and the potential field distribution [15]. As a global optimization algorithm, the QPSO can explore many local minima and thus increase the likelihood of finding the global minimum. This advantage of the QPSO can be applied to neural networks to optimize the topology and/or weight parameters [16].
In order to predict network traffic more accurately, a prediction model based on the QPSO algorithm and the fuzzy wavelet neural network is proposed in this paper. The FWNN is trained on the network traffic data by QPSO, and the weights are progressively updated until the convergence criterion is satisfied. The objective function minimized by the QPSO algorithm is the prediction error function.
The rest of this paper is arranged as follows. Section 2 gives a brief introduction to the classical PSO algorithm and the quantum-behaved particle swarm optimization (QPSO) algorithm. In Section 3, the fuzzy wavelet neural network is introduced and the fuzzy wavelet neural network based on QPSO (QPSO-FWNN) algorithm is presented in detail. In Section 4, simulation results are presented. Performance metrics of the prediction methods are analyzed and compared in Section 5. Finally, some conclusions are given in Section 6.

Quantum-Behaved Particle Swarm Optimization
2.1. Classical Particle Swarm Optimization. Particle swarm optimization (PSO) is an evolutionary computation technique proposed by Kennedy and Eberhart in 1995 [17]. Like genetic algorithms (GA), PSO is initialized with a population of random solutions. Unlike GA, however, PSO has no operators such as crossover and mutation. In the PSO algorithm, each potential solution, called a "particle," moves around in a multidimensional search space with a velocity that is constantly updated by the particle's own experience and the experience of the particle's neighbors or of the whole swarm [18]. Each particle keeps track of the coordinates in the search space that are associated with the best solution it has achieved so far; this value is called pbest. Another best value tracked by the global version of the particle swarm optimizer is the overall best value, and its location, obtained so far by any particle in the population [19]. This location is called gbest.
The process for implementing the global version of PSO is given by the following steps.
Step 1. Initialize a population (array) of particles with random positions and velocities in the $D$-dimensional problem space. For a $D$-dimensional problem with $N$ particles, the position vector $X_i$ and velocity vector $V_i$ are represented as
$$X_i = (x_{i1}, x_{i2}, \ldots, x_{iD}), \qquad V_i = (v_{i1}, v_{i2}, \ldots, v_{iD}), \tag{1}$$
where $i = 1, 2, \ldots, N$.
Step 2. For each particle, evaluate the desired optimization fitness function in $D$ variables.
Step 3. Compare each particle's fitness evaluation with the particle's pbest.If the current value is better than pbest, then set the pbest value equal to the current value and the pbest location equal to the current location in D-dimensional space.
Step 4. Compare the fitness evaluation with the population's overall previous best. If the current value is better than gbest, then reset gbest to the current particle's array index and value.
Step 5. Update the velocity and position of the particle according to (2) and (3), respectively:
$$v_{ij}^{t+1} = v_{ij}^{t} + c_1 r_1 \left(pbest_{ij}^{t} - x_{ij}^{t}\right) + c_2 r_2 \left(gbest_{j}^{t} - x_{ij}^{t}\right), \tag{2}$$
$$x_{ij}^{t+1} = x_{ij}^{t} + v_{ij}^{t+1}, \tag{3}$$
where $c_1$ and $c_2$ are two positive constants, known as the cognitive and social coefficients, which control the relative proportion of cognition and social interaction, respectively; the values of $c_1$ and $c_2$ were decreased with each iteration [20]. $r_1$ and $r_2$ are two random values in the range [0, 1]. $v_{ij}^{t}$, $x_{ij}^{t}$, and $pbest_{ij}^{t}$ are the velocity, position, and personal best of the $i$th particle in the $j$th dimension for the $t$th iteration, respectively. $gbest_{j}^{t}$ is the $j$th dimension of the best particle in the swarm for the $t$th iteration.
Step 6. Loop to Step 2 until a stop criterion is met, usually a sufficiently good fitness or a maximum number of iterations.
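The six steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `pso_minimize`, the search bounds, the velocity clamp `v_max`, and the constant values of `c1` and `c2` are all assumptions made for the example (the text states that the coefficients decrease with iteration, which is omitted here for brevity).

```python
import random


def pso_minimize(f, dim, n_particles=20, n_iter=100, lo=-5.0, hi=5.0,
                 c1=2.0, c2=2.0, v_max=1.0, seed=0):
    """Global-best PSO sketch following Steps 1-6 (all defaults are assumed)."""
    rng = random.Random(seed)
    # Step 1: random positions and velocities in the D-dimensional space.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)]
         for _ in range(n_particles)]
    # Step 2 (first pass): evaluate the fitness and initialise pbest and gbest.
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(n_iter):  # Step 6: loop until the iteration budget is spent
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 5: velocity update (2), then position update (3).
                v[i][j] += (c1 * r1 * (pbest[i][j] - x[i][j])
                            + c2 * r2 * (gbest[j] - x[i][j]))
                v[i][j] = max(-v_max, min(v_max, v[i][j]))  # assumed clamp
                x[i][j] = max(lo, min(hi, x[i][j] + v[i][j]))
            val = f(x[i])  # Step 2: evaluate the fitness
            if val < pbest_val[i]:  # Step 3: update pbest
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:  # Step 4: update gbest
                    gbest, gbest_val = x[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


if __name__ == "__main__":
    sphere = lambda p: sum(t * t for t in p)
    best, value, _ = pso_minimize(sphere, dim=2)
    print(best, value)
```

Because gbest is only replaced by a strictly better value, the recorded best fitness never increases from one iteration to the next.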

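2.2. Quantum-Behaved Particle Swarm Optimization. Motivated by concepts in quantum mechanics and particle swarm optimization, Sun et al. proposed a new version of PSO, quantum-behaved particle swarm optimization (QPSO) [21]. In the QPSO, the state of a particle is depicted by a wave function $\psi(x, t)$ instead of position and velocity. The probability density function of the particle's position is $|\psi(x, t)|^2$ at position $x_i$ [22].
Assume that, at iteration $t$, particle $i$ moves in $D$-dimensional space with a $\delta$ potential well centered at $p_{ij}^{t}$ on the $j$th dimension. The wave function at iteration $t + 1$ is given by
$$\psi\left(x_{ij}^{t+1}\right) = \frac{1}{\sqrt{L_{ij}^{t}}} \exp\left(-\frac{\left|x_{ij}^{t+1} - p_{ij}^{t}\right|}{L_{ij}^{t}}\right), \tag{4}$$
where $L_{ij}^{t}$ is the standard deviation of the double exponential distribution, varying with iteration number $t$. Hence the probability density function $Q$ is defined as
$$Q\left(x_{ij}^{t+1}\right) = \left|\psi\left(x_{ij}^{t+1}\right)\right|^2 = \frac{1}{L_{ij}^{t}} \exp\left(-\frac{2\left|x_{ij}^{t+1} - p_{ij}^{t}\right|}{L_{ij}^{t}}\right), \tag{5}$$
and the probability distribution function $F$ is given by
$$F\left(x_{ij}^{t+1}\right) = 1 - \exp\left(-\frac{2\left|x_{ij}^{t+1} - p_{ij}^{t}\right|}{L_{ij}^{t}}\right). \tag{6}$$
By using the Monte Carlo method, the $j$th component of position $x_i$ at iteration $t + 1$ can be obtained by
$$x_{ij}^{t+1} = p_{ij}^{t} \pm \frac{L_{ij}^{t}}{2} \ln\left(\frac{1}{u_{ij}^{t+1}}\right), \tag{7}$$
where $u_{ij}^{t+1}$ is a uniform random number in the interval [0, 1]. The value of $L_{ij}^{t}$ is calculated as
$$L_{ij}^{t} = 2\beta \left|\mathrm{mbest}_{j}^{t} - x_{ij}^{t}\right|, \tag{8}$$
where the parameter $\beta$ is known as the contraction-expansion (CE) coefficient, which can be tuned to control the convergence speed of the algorithm [23]. $\mathrm{mbest}^{t}$ is the mean best position and is defined as
$$\mathrm{mbest}_{j}^{t} = \frac{1}{N} \sum_{i=1}^{N} pbest_{ij}^{t}, \tag{9}$$
where $N$ is the size of the population. Hence the position of the particle is updated according to
$$x_{ij}^{t+1} = p_{ij}^{t} \pm \beta \left|\mathrm{mbest}_{j}^{t} - x_{ij}^{t}\right| \ln\left(\frac{1}{u_{ij}^{t+1}}\right). \tag{10}$$
From (4) and (10), the new position of the particle is calculated as
$$x_{ij}^{t+1} = \begin{cases} p_{ij}^{t} + \beta \left|\mathrm{mbest}_{j}^{t} - x_{ij}^{t}\right| \ln\left(1/u_{ij}^{t+1}\right), & k \in (0.5, 1], \\ p_{ij}^{t} - \beta \left|\mathrm{mbest}_{j}^{t} - x_{ij}^{t}\right| \ln\left(1/u_{ij}^{t+1}\right), & k \in (0, 0.5], \end{cases} \tag{11}$$
where $k$ is a random number in the range [0, 1]. $\beta$ is a factor that decreases linearly from 1.0 to 0.3 with the iteration number:
$$\beta = 0.3 + (1.0 - 0.3)\,\frac{t_{\max} - t}{t_{\max}}, \tag{12}$$
where $t_{\max}$ is the maximum number of iterations used in the algorithm.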
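As an illustration of the position update in (11) with the linearly decreasing CE coefficient of (12), the following Python sketch minimizes a generic objective. It is a hedged example rather than the authors' code. In particular, the local attractor is computed as a random convex combination of pbest and gbest, which is the usual choice in the QPSO literature but is not spelled out in the text above; the function name `qpso_minimize`, the search bounds, and the clipping of positions to those bounds are likewise assumptions.

```python
import math
import random


def qpso_minimize(f, dim, n_particles=20, n_iter=100, lo=-5.0, hi=5.0,
                  beta_start=1.0, beta_end=0.3, seed=0):
    """QPSO sketch: position update (11) with the CE schedule (12)."""
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for t in range(n_iter):
        # CE coefficient decreasing linearly from beta_start towards beta_end.
        beta = beta_end + (beta_start - beta_end) * (n_iter - t) / n_iter
        # Mean best position: component-wise average of all personal bests.
        mbest = [sum(p[j] for p in pbest) / n_particles for j in range(dim)]
        for i in range(n_particles):
            for j in range(dim):
                # Assumed local attractor between pbest and gbest.
                phi = rng.random()
                attractor = phi * pbest[i][j] + (1.0 - phi) * gbest[j]
                u = 1.0 - rng.random()  # uniform in (0, 1], keeps log finite
                step = beta * abs(mbest[j] - x[i][j]) * math.log(1.0 / u)
                # Sign chosen by a second uniform draw k, as in (11).
                k = rng.random()
                x_new = attractor + step if k > 0.5 else attractor - step
                x[i][j] = min(hi, max(lo, x_new))  # assumed bound handling
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


if __name__ == "__main__":
    sphere = lambda p: sum(t * t for t in p)
    best, value, _ = qpso_minimize(sphere, dim=2)
    print(best, value)
```

Note that, unlike the classical PSO, the sketch keeps no velocity vector: the only tunable quantity besides the population size and iteration budget is the CE coefficient.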

Fuzzy Wavelet Neural Network Based on QPSO
3.1. The Wavelet Base Function. In $L^2(\mathbb{R})$, a wavelet dictionary is constructed by dilating and translating a wavelet base function $\psi(x)$ of zero average [24]:
$$\psi_{a,b}(x) = \frac{1}{\sqrt{a}}\,\psi\left(\frac{x - b}{a}\right),$$
which is dilated with a scale parameter $a$ and translated by $b$.
3.2. Fuzzy Wavelet Neural Network. The basic architecture of the fuzzy wavelet neural network can be described as a set of Takagi-Sugeno models. Assume that there are $m$ rules in the rule base. The premise of the $i$th Takagi-Sugeno fuzzy if-then rule usually has the following form:
$$R^i:\ \text{IF } x_1 \text{ is } A_1^i \text{ AND } x_2 \text{ is } A_2^i \text{ AND } \cdots \text{ AND } x_n \text{ is } A_n^i,$$
where $x_1, x_2, \ldots, x_n$ are the inputs of the T-S rule and $A_j^i$ is the $i$th linguistic variable value of the $j$th input, which is a fuzzy set characterized by a wavelet function. The consequent (THEN) part of each rule is parameterized by constant coefficients $w$, usually referred to as consequent parameters, which are determined during the training process.
Figure 1 shows the architecture of the proposed FWNN model. The FWNN is a 4-layer feedforward network; descriptions of the layers are given here.
Layer 1 (input variables layer). This layer receives the input signals of the FWNN, and each node of this layer represents an input linguistic variable. The node output and the node input are related by
$$O_j^{(1)} = I_j^{(1)},$$
where $I_j^{(1)}$ and $O_j^{(1)}$ are, respectively, the input and output of the $j$th node in Layer 1.
Layer 3 (rule layer). In this layer, the number of rules is equal to the number of nodes. The output of each node is calculated by applying the AND (min) operation [12] to its inputs.
Layer 4 (output layer). This layer consists of the output nodes, whose outputs are computed using the connection weights $w_{i,j}$ (20).
To train the parameters of the FWNN, the backpropagation (BP) training algorithm is extensively used as a powerful training method that can be applied to the forward network architecture [25]. For this purpose, the mean square error (MSE) between the current output $y$ and the desired output $y_d$ of the network is selected as the performance index (22). All adjustable parameters of the FWNN can be calculated by the following formulas:
$$w^{(3)}(t + 1) = w^{(3)}(t) - \eta \cdot \frac{\partial E}{\partial w^{(3)}(t)},$$
$$a^{(2)}(t + 1) = a^{(2)}(t) - \eta \cdot \frac{\partial E}{\partial a^{(2)}(t)},$$
$$b^{(2)}(t + 1) = b^{(2)}(t) - \eta \cdot \frac{\partial E}{\partial b^{(2)}(t)},$$
where $E$ denotes the MSE performance index, $t$ represents the backward step number, and $\eta$ and $\alpha$ are the learning and momentum constants, ranging from 0.01 to 0.1 and from 0.1 to 0.9, respectively.

3.3. Fuzzy Wavelet Neural Network Trained by QPSO Algorithm. Computational intelligence methods have gained popularity in the training of neural networks because of their ability to find a global solution in a multidimensional search space. The QPSO algorithm is a global algorithm with a strong ability to find globally optimal results, and it has proven to have advantages over the classical PSO because it has fewer control parameters [26]. Therefore, by combining the QPSO with the fuzzy wavelet neural network, a new algorithm referred to as the QPSO-FWNN algorithm is formulated in this paper.
When the QPSO algorithm is used to train the FWNN model, a decision vector represents a particular group of network parameters, including the connection weights and the dilation and translation parameters. It is denoted as
$$X_i = \left(a_{1,1}, \ldots, a_{1,n}, \ldots, a_{m,n},\; b_{1,1}, \ldots, b_{1,n}, \ldots, b_{m,n},\; w_{1,1}, \ldots, w_{1,n}, \ldots, w_{m,n}\right), \tag{24}$$
where $a_{i,j}$ and $b_{i,j}$ are the dilation and translation parameters of the wavelet function in Layer 2 and $w_{i,j}$ are the connection weights in (20). Since each component of the position corresponds to a network parameter, the FWNN is structured according to the particle's position vector. By training the corresponding network on the training samples, we obtain an error value computed by (22). In short, the mean square error is adopted as the objective function to be minimized in the FWNN based on QPSO.
The specific procedure for the QPSO-FWNN algorithm can be summarized as follows.
Step 1. Define the structure of the FWNN according to the input and output sample.
Step 2. Treat the position vector of each particle as a group of network parameters by (24).
Step 3. Initialize the population by randomly generating the position vector $X_i$ of each particle and set $pbest_i = X_i$.
Step 5. Compute the objective function of each particle by (22).
Step 6 (update pbest). Each particle's current fitness value is compared with its previous best value pbest. If the current value is better than the pbest value, then set the pbest value to the current value.
Step 7 (update gbest). Determine the swarm best gbest as the minimum of all the particles' pbest values.
Step 8. Check the stopping criterion. If the maximum number of iterations is reached, stop the iteration; the best position found by the swarm is taken as the optimal solution. Otherwise, repeat the procedure from Step 4.
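To make the encoding in (24) concrete, the sketch below splits a flat particle position into the three parameter blocks. The helper name `unpack_particle` and the row-major layout are assumptions made for illustration; the defaults `m=5` and `n=3` are chosen so that the total length is 3 x 5 x 3 = 45, which is consistent with the 45-dimensional search space reported in the simulation settings.

```python
def unpack_particle(position, m=5, n=3):
    """Split a flat position vector into (a, b, w), each an m-by-n nested list.

    The ordering follows (24): all dilation parameters a_{i,j} first, then
    the translation parameters b_{i,j}, then the connection weights w_{i,j}.
    """
    size = m * n
    if len(position) != 3 * size:
        raise ValueError(f"expected {3 * size} components, got {len(position)}")

    def block(k):
        # Take the k-th block of length m*n and reshape it row by row.
        flat = list(position[k * size:(k + 1) * size])
        return [flat[r * n:(r + 1) * n] for r in range(m)]

    return block(0), block(1), block(2)


if __name__ == "__main__":
    a, b, w = unpack_particle(list(range(45)))
    print(a[0], b[0], w[0])
```

In a full training loop, the objective evaluated for each particle would first unpack its position in this way, configure the network with `a`, `b`, and `w`, and then return the error on the training samples.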

Simulation Results
The experimental data set consists of 10 hours of observations obtained by monitoring the traffic between clients in our campus network and servers. The minimal time interval in the network traffic time series is 10 seconds. Figure 2 shows the normalized network traffic time series.
In this paper, the design of a discrete filter predictor consists in finding the relation between the future data $x(t)$ and the past observations $x(t - 1), x(t - 2), \ldots, x(t - n)$, where $n$ is the number of considered input elements. The predictor relationship can be described by the following convolution sum [27]:
$$\hat{x}(t) = \sum_{i=1}^{n} h_i(t)\, x(t - i),$$
where $h_i(t)$ ($i = 1, 2, \ldots, n$) is the filter coefficient vector, $x(t - i)$ is an $i$-step backward sample, and $\hat{x}(t)$ is the desired output.
In order to test the performance of the prediction model, the first 1000 samples are used as the training data, and the following 2100 samples are used as the prediction data. The time series window was set to 3, which means that the fourth measurement is predicted from the past three measurements. In the established prediction model based on the QPSO-FWNN algorithm, the number of membership functions is five, the number of input variables layer nodes is three, and the number of output layer nodes is one. The population size of QPSO algorithm is 120 particles and -dimensional search space of particle is 45. max is 30. max is 1 and  min is 0.5. The CE coefficient decreases linearly from 1.0 to 0.3 during the search process according to (12). In the trained model, $a$ is the dilation parameter of the wavelet function, $b$ is the translation parameter of the wavelet function, and $w$ is the connection weight of the output layer. Figure 3 shows the membership functions of the input variables $x(t)$, $x(t - 1)$, and $x(t - 2)$ in the FWNN units. Figure 4 shows the QPSO-FWNN convergence curves. Figure 5 shows the prediction results, which indicate that the QPSO-FWNN model is an effective, high-accuracy prediction model of network traffic.
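The windowing scheme described above (three past measurements used to predict the fourth, with the first 1000 samples reserved for training) can be sketched as follows. The helper names `make_windows` and `split_series` are hypothetical, and the sine series in the usage example is only a synthetic stand-in for the normalized traffic data, which is not reproduced here.

```python
import math


def make_windows(series, window=3):
    """Build (inputs, targets): each target is the sample right after its window."""
    n = len(series) - window
    inputs = [list(series[i:i + window]) for i in range(n)]
    targets = [series[i + window] for i in range(n)]
    return inputs, targets


def split_series(series, n_train=1000):
    """Use the first n_train samples for training and the rest for prediction."""
    return series[:n_train], series[n_train:]


if __name__ == "__main__":
    # Synthetic stand-in series with 1000 + 2100 samples (an assumption).
    series = [0.5 + 0.5 * math.sin(0.01 * k) for k in range(3100)]
    train, test = split_series(series)
    x_train, y_train = make_windows(train)
    x_test, y_test = make_windows(test)
    print(len(x_train), len(x_test))
```

Each input row holds three consecutive samples in time order, so a series of length $T$ yields $T - 3$ input-target pairs.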

Performance Metrics
In order to evaluate the prediction model more comprehensively, the following performance metrics are used to estimate the prediction accuracy.
(1) MSE (mean square error) is a scale-dependent metric which quantifies the difference between the predicted values and the actual values of the quantity being predicted by computing the average sum of squared errors [28]:
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2,$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $N$ is the total number of predictions.
(2) NMSE (Normalized Mean Square Error) is defined as
$$\mathrm{NMSE} = \frac{1}{\sigma^2 N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2,$$
where $\sigma^2$ denotes the variance of the actual values during the prediction interval and is given as follows:
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2,$$
where $\bar{y}$ is the mean value and is given by
$$\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i.$$
It can be seen that, if NMSE = 0, the prediction performance is perfect, and if NMSE = 1, the prediction is equivalent to a trivial predictor which statistically predicts the mean of the actual values. If NMSE > 1, the prediction performance is worse than that of the trivial predictor [29].
(3) MAPE (Mean Absolute Percentage Error) expresses the prediction error as a percentage of the actual value. MAPE is defined as
$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%.$$
(4) The correlation coefficient is the covariance of the two variables divided by the product of their individual standard deviations. It is a normalized measurement of how the two variables are linearly related. The coefficient of correlation ($R$) is given as follows:
$$R = \frac{\mathrm{COV}(Y, \hat{Y})}{\sigma_Y\, \sigma_{\hat{Y}}},$$
where $\sigma_Y$ and $\sigma_{\hat{Y}}$ indicate the standard deviations of the actual and the predicted values, given by
$$\sigma_Y = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2}, \qquad \sigma_{\hat{Y}} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - \bar{\hat{y}}\right)^2}. \tag{33}$$
$\mathrm{COV}(Y, \hat{Y})$ is the covariance between $Y$ and $\hat{Y}$. It is obtained as follows:
$$\mathrm{COV}(Y, \hat{Y}) = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right).$$
Values of the correlation coefficient range from −1 to 1. If $R = 1$, there is a perfect positive correlation between the actual and the predicted values, whereas $R = -1$ indicates a perfect negative correlation. If $R = 0$, there is a complete lack of correlation between the datasets.
(5) The coefficient of efficiency (CE) is defined as
$$\mathrm{CE} = 1 - \frac{\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2},$$
where the domain of the efficiency coefficient is (−∞, 1].
If CE = 1, there is a perfect fit between the observed and the predicted data. When the prediction corresponds to estimating the mean of the actual values, CE = 0. If CE < 0, the average of the actual values is a better predictor than the analyzed prediction method. The closer CE is to 1, the more accurate the prediction is.
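The five metrics can be computed directly from paired lists of actual and predicted values, as in the sketch below. The function names are illustrative, and one normalization choice is an assumption: variances and covariances are divided by $N$ (population form), so that the mean predictor gives NMSE = 1 and CE = 0 exactly, as stated above.

```python
import math


def mse(y, y_hat):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y)


def nmse(y, y_hat):
    """MSE normalised by the (population) variance of the actual values."""
    mean = sum(y) / len(y)
    var = sum((a - mean) ** 2 for a in y) / len(y)
    return mse(y, y_hat) / var


def mape(y, y_hat):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(y, y_hat)) / len(y)


def correlation(y, y_hat):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(y)
    my, mp = sum(y) / n, sum(y_hat) / n
    cov = sum((a - my) * (p - mp) for a, p in zip(y, y_hat)) / n
    sy = math.sqrt(sum((a - my) ** 2 for a in y) / n)
    sp = math.sqrt(sum((p - mp) ** 2 for p in y_hat) / n)
    return cov / (sy * sp)


def efficiency(y, y_hat):
    """Coefficient of efficiency CE, with values in (-inf, 1]."""
    mean = sum(y) / len(y)
    sse = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    sst = sum((a - mean) ** 2 for a in y)
    return 1.0 - sse / sst


if __name__ == "__main__":
    actual = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    for metric in (mse, nmse, mape, correlation, efficiency):
        print(metric.__name__, metric(actual, predicted))
```

With these definitions, CE equals 1 − NMSE, so the two metrics rank prediction models identically; `correlation` is undefined when either series is constant, because the corresponding standard deviation is zero.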
In order to test the validity and accuracy of the QPSO-FWNN method, we carried out experiments comparing it with the other methods. From Table 1, one can look further into the prediction performance of the prediction models. By comparing the values of MSE, NMSE, MAPE, correlation coefficient, and CE, QPSO-FWNN demonstrates better prediction accuracy than the other methods. Therefore, the experimental results in this section show that the prediction method based on QPSO-FWNN is much more effective than BP, RBF, FNN, FWNN-GA, and ARIMA, and that it is a better method for predicting the time series of network traffic. In order to test the prediction stability of each model, each prediction method was run 10 times. Figures 11, 12, and 13 show that the prediction results of the QPSO-FWNN model are much more stable than those of BP, FNN, and FWNN-GA.

Conclusion
Predicting the direction of movements of network traffic is important because it enables us to detect potential network congestion spots. Since network traffic is affected by many factors, network traffic data exhibit volatility and self-similarity, and network traffic prediction becomes a challenging task. In this paper, the QPSO-FWNN method has been presented to predict network traffic. The QPSO-FWNN combines the QPSO, which has powerful global exploration capability, with the FWNN, which can extract the mapping relations between the input and output data. The parameters of the FWNN are obtained by quantum-behaved particle swarm optimization (QPSO), and the time series of network traffic data is modeled by QPSO-FWNN. Finally, experiments showed that the QPSO-FWNN model has faster and better performance.

$\mu_{A_j^i}$ is the $i$th membership function of $x_j$, with $i = 1, 2, \ldots, m$. $a_j^i$ and $b_j^i$ are the dilation and translation parameters of the wavelet function.

Figure 2: The normalization of network traffic time series.

Figure 6: Prediction results with BP neural network.

Figure 11: MAPE and NMSE histogram of four prediction models.

Figure 12: MSE and CE histogram of four prediction models.
After 50 iterations, the cost function of the QPSO-FWNN neural network was 2.1887. The connection weight between the input variables layer and the output layer is given by

Table 1: Performance comparison of the six prediction methods.
The prediction model based on the FNN neural network has 4 layers, and the number of membership functions is five. The number of iterations is 1000. The architecture of the prediction model based on the fuzzy wavelet neural network and genetic algorithm (FWNN-GA) is the same as that of the QPSO-FWNN method. The population size of the GA is 100. The crossover type is one-point crossover, and the crossover rate is 0.6. The mutation rate is 0.01 and the number of iterations is 500 [30]. A prediction model based on the ARIMA model is also built. The model parameters are estimated using Maximum Likelihood Estimation, and the best model is chosen as ARIMA (5, 1, 3) [31]. Figures 6, 7, 8, 9, and 10 show the prediction results with the BP, RBF, FNN, and FWNN-GA neural networks and the ARIMA model, respectively. The performance comparison of the prediction methods is shown in Table 1.