Historical Feature Pattern Extraction Based Network Attack Situation Sensing Algorithm

A situation sequence contains a series of complicated, multivariate random trends that are sudden and uncertain, and whose underlying principles are difficult for traditional algorithms to recognize and describe. Solving this problem requires estimating the parameters of a very long situation sequence, which is essential but very difficult, so this paper proposes a situation prediction method based on historical feature pattern extraction (HFPE). First, the HFPE algorithm searches the recorded historical situation sequence for similar indications and weighs the link intensity between each occurred indication and its subsequent effect. It then calculates the probability that a given effect reappears under the current indication and makes a prediction after weighting. In addition, HFPE provides an evolution algorithm that measures the prediction deviation in terms of pattern and accuracy and continuously improves the adaptability of HFPE through gradual fine-tuning. The method preserves the rules in the sequence as far as possible, needs no data preprocessing, and can continuously track and adapt to variations in the situation sequence.


Introduction
With attacks becoming more prevalent, traditional static passive defense and whole-system consolidation can hardly keep up with the changing pace; they also require large investment and degrade network performance. Dynamic, proactive, and targeted defense measures have therefore been proposed, most of which rely on attack situation forecasting, that is, network attack situation sensing (NASS) [1,2]. NASS aims to forecast the future evolution trend of the network attack situation from historical features and current attack indications, to guide dynamic defense, and to allow administrators to take corresponding measures in advance and effectively, so that they can respond quickly to complex and ever-changing attack threats [3,4].
Attack situation forecasting has rarely been studied in its own right; previous research mostly applies existing methods such as the autoregressive moving average model (ARMA), the grey model (GM), and the radial basis function neural network (RBFNN) [5-8]. ARMA identifies the dependence relationships and autocorrelation of situation sequences and establishes a mathematical prediction model [9]. It requires that the situation sequence, or one of its differences of a certain order, satisfy the stationarity assumption, which is too strict and limits its range of application. As one of the GMs, GM(1,1) first weakens the randomness of the situation sequence by accumulation, then fits the generated sequence with an exponential curve, and finally restores the original scale by inverse accumulation after prediction. It can capture monotonic and slowly changing trends but hardly reflects characteristics such as random walks and periodic fluctuation [10,11]. Grey Verhulst is suitable for describing situation sequences that develop in an "S" or anti-"S" form [12]. Dividing the changing curve into several stages is a reasonable approach, but the difficulty lies in predicting the onset and duration of each stage [13]. RBFNN uses its nonlinear characteristics to describe the rules contained in situation sequences [14]. However, the evolution rules of the attack situation are unbounded and changeable, and a practical small-scale neural network cannot capture them well [15,16].
A situation sequence contains massive, complex, and inconstant evolution trends, which are beyond the expression and prediction capability of traditional methods that rely only on some formulas, functions, or training [17,18]. Most traditional methods suffer from conflicts among training samples, rely heavily on data preprocessing and manual intervention, do not support incremental training, and need to rebuild the model once the situation sequence changes [19-21]. Therefore, a situation prediction method based on historical feature pattern extraction (HFPE) is presented. The method measures the similarity between historical features in terms of pattern and accuracy and uses multiple-order difference operations to discriminate trends. It searches the recorded historical situation sequence for similar indications, measures the link intensity of each occurred indication with its subsequent effect, and infers the recurrence probability of certain effects from the current indication. An evolution algorithm is introduced to measure the prediction deviation and continuously improve the adaptability of the prediction algorithm through gradual fine-tuning.
This paper proceeds as follows: Section 2 discusses the algorithm principle of HFPE. Section 3 describes the establishment and analysis of the algorithm. Section 4 presents the experimental results, and Section 5 concludes the paper.

Basic Definition.
In mathematical form, the continuous time-varying curve, = ( ), is commonly used to describe the evolving process of the attack situation. A computer obtains this curve by sampling, that is, by sampling situation values with time interval , which yields discrete time sequences composed of ( , ), where represents the situation value at moment . To facilitate the research, the following basic definitions are made: let ( , ) be the segmental subimage with neighboring segments from moment , let be the segmental gradient, let ( , ) be the gradient sequence, let ( , +1 , . . . , + −1 ), ( , ) be the characteristic spectrum of ( , ), let be zero vector; then (1) For the th component product of ( , ), + , the angle of inclination, + , can be defined as The stretch rate from ( , ) to ( , ) can be calculated by ( , , ), which is defined as [ , , ] is utilized to adjust the stretch rate, where is the prediction steps.
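Because the symbols of these definitions did not survive in this text, the following is only a minimal sketch of the quantities they describe, under stated assumptions: the curve is sampled at a fixed interval, the segmental gradient is taken as the slope between neighboring samples, and the angle of inclination is taken as the arctangent of that slope. All function and parameter names (`segment_gradients`, `inclination_angles`, `sub_pattern`, `dt`) are hypothetical and do not come from the paper.

```python
import math

def segment_gradients(values, dt=1.0):
    """Slope of each segment between neighbouring samples (assumed definition)."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]

def inclination_angles(gradients):
    """Angle of inclination of each segment, in radians (assumed to be atan of the slope)."""
    return [math.atan(g) for g in gradients]

def sub_pattern(gradients, start, n):
    """Gradient sequence of the n neighbouring segments that begin at index `start`."""
    if start < 0 or start + n > len(gradients):
        raise ValueError("window exceeds the recorded sequence")
    return gradients[start:start + n]

# Hypothetical sampled situation values, one per time step.
samples = [0.0, 1.0, 3.0, 3.0, 2.0]
grads = segment_gradients(samples)
window = sub_pattern(grads, 1, 2)
```

Working on gradients rather than raw values means two windows with the same shape but different base levels yield the same gradient sequence, which is what a shape-based comparison needs.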

Prediction Principle.
From the perspective of probability theory and statistics, similar situation curve shapes are more likely to derive from a similar origin, mechanism, and impact, and therefore to lead to a similar subsequent effect. Statistically, when a precedence relation between sequences appears frequently in time, it usually means that a logical causal relationship exists to some degree.
It is supposed that ( , ) and ( , ) are known historical feature subpatterns, from the same pattern, < , and the further trend after > + is unknown and needs to be predicted. If ( , ) is similar to ( , ), then it can be deduced that the origin, mechanism, and impact in [ , + ) are similar to those in [ , + ), and the history after + may be repeated after + with some differences. According to this principle, the slope of the line segment behind can be forecasted bŷ+ is utilized to control the predicting steps. When = 0, 1, 2, . . . , − 1, the trend prediction curve can be obtained recursively by and̂+ + .
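The principle above can be illustrated with a small sketch, offered under assumptions rather than as the paper's formula (which is missing from this text). It assumes the current window is the last `n` segments of the record, that an earlier window starting at index `i` has already been judged similar, and that the stretch rate acts as a simple scale factor on the replayed slopes. The names `predict_by_repetition`, `i`, `n`, `steps`, and `stretch` are hypothetical.

```python
def predict_by_repetition(values, i, n, steps, stretch=1.0):
    """Forecast `steps` values by replaying the slopes that followed a similar window.

    values  : recorded situation values
    i       : start index of the matched historical window (in segments)
    n       : number of segments per window
    steps   : number of prediction steps
    stretch : assumed scale factor between the historical and current windows
    """
    slopes = [b - a for a, b in zip(values, values[1:])]
    j = len(slopes) - n  # the current window is taken as the last n segments
    if not 0 <= i < j:
        raise ValueError("historical window must start before the current one")
    if i + n + steps > len(slopes):
        raise ValueError("not enough history after the matched window")
    follow_up = slopes[i + n:i + n + steps]
    forecast, level = [], values[-1]
    for s in follow_up:
        level += stretch * s  # extend the curve one segment at a time
        forecast.append(level)
    return forecast

# The rise [1, 1] at the start was followed by a fall [-1, -1];
# the record ends with the same rise, so the fall is replayed.
history = [0, 1, 2, 1, 0, 1, 2]
forecast = predict_by_repetition(history, i=0, n=2, steps=2)
```

The forecast is built recursively from the last measured value, so each predicted point depends on the previous one, mirroring the recursive construction of the trend prediction curve described in the text.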

Measurement System
Fitting Degree.
First, the cosine similarity between the slope vectors is calculated. Second, multiple-order difference operators are introduced to obtain the trend differences of qualitative and quantitative change. Finally, the narrowed fitting degree is obtained as the difference between the similarity degree and the trend difference.
The trend differences of qualitative change and quantitative change are denoted by 1 ( ) and 2 ( ), respectively; the former stands for the pattern difference and the latter stands for the accuracy difference. The above two parameters can be derived by Thus the composite trend, ⊥ ( ), can be defined by Let ∇ represent backward difference operator, and define and then the order differential recursive equation can be obtained by in which is a positive integer, and (9) meets Let ∇ ( , , ) denote the trend difference between the feature patterns ( , ) and ( , ); then The fitting degree function, ( , , ), can be defined by where the large value of ( , , ) represents a fine fitting, and for −1 < ( , , ) ≤ 1 and 0 < ( , , ) ≤ 2, it can be derived that The occurrence probability of ( , , ) > 0 may be 50% statistically, which is too high. Therefore, it is necessary to subtract the penalty term, ∇ ( , , ), and filter ( , , ) by the threshold (0 < < 1) to narrow the fitting degree.
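The structure of the fitting degree (cosine similarity minus a trend-difference penalty, then a threshold filter) can be sketched as follows. The cosine similarity and the backward difference are standard; the penalty itself is an assumption, since the paper's equations are missing from this text. Here it is taken as the share of differing signs in the differenced slopes (pattern) plus a normalized magnitude gap (accuracy), which keeps it within [0, 2]. All names and the default threshold are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two slope vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def backward_diff(seq, order=1):
    """Apply the backward difference operator `order` times."""
    for _ in range(order):
        seq = [b - a for a, b in zip(seq, seq[1:])]
    return seq

def sign(x):
    return (x > 0) - (x < 0)

def trend_penalty(u, v, order=1):
    """Assumed penalty in [0, 2]: sign disagreement plus normalized magnitude gap."""
    du, dv = backward_diff(u, order), backward_diff(v, order)
    if not du:
        return 0.0
    pattern = sum(sign(a) != sign(b) for a, b in zip(du, dv)) / len(du)
    scale = sum(abs(a) + abs(b) for a, b in zip(du, dv))
    accuracy = sum(abs(a - b) for a, b in zip(du, dv)) / scale if scale else 0.0
    return pattern + accuracy

def fitting_degree(u, v, threshold=0.5, order=1):
    """Similarity minus penalty; None when the result does not pass the threshold."""
    score = cosine_similarity(u, v) - trend_penalty(u, v, order)
    return score if score > threshold else None

same = fitting_degree([1, 2, 3], [1, 2, 3])
opposite = fitting_degree([1, 2, 3], [3, 2, 1])
```

Note that `[1, 2, 3]` and `[3, 2, 1]` have a positive cosine similarity (about 0.71) even though their trends are opposite; subtracting the penalty is what rejects such pairs, which is the narrowing effect the text describes.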

Universality Degree.
Divide the attack situation subsequence into two parts, the occurred indication and the subsequent effect. The domination intensity of the former over the latter (also called the link intensity between the two parts) may be high or low: some links are broadly representative, while others are merely rare special cases. If all values are treated equally, the prediction accuracy will be seriously affected, so it is important to emphasize inevitable links of high intensity and weaken accidental links of low intensity.
Let [ , , ] be the universality value of ( , + ) in the historical feature pattern (0, ), where max can be derived by The value of max is updated whenever [ , , ] changes and can be accessed directly without recalculation. The universality value can be mapped to a universality degree in (0, − ] by the function ( , , ), which is shown as follows: A larger universality degree reflects better representativeness of ( , ) and its extension, and more exact patterns predicted by ( , + ). Otherwise, ( , + ) is just a special case, and the prediction effect is worse.

Contrast Degree.
The prediction results of the situation are usually influenced by several link intensities of different weights. The way they act often changes: sometimes they work as a community decision, and sometimes a single individual dominates. Therefore, it is necessary to track and adjust the balance between emphasizing the statistical effect and showing individual advantage. It is supposed that̃1,̃2, . . . ,̃are not normalized weights, which can be adjusted tõ1,̃2, . . . ,̃by sensitization index ( > 0). Then the standardized weight can be derived by and comparison degree / can be obtained by =̃.
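One plausible reading of this adjustment, given that the exact formula is missing here, is that each unnormalized weight is raised to the power of the sensitization index and the results are divided by their sum. The sketch below assumes that reading; the names `sensitized_weights`, `raw`, and `alpha` are hypothetical.

```python
def sensitized_weights(raw, alpha):
    """Raise non-negative weights to the power alpha (> 0) and normalize them to sum to 1."""
    if alpha <= 0:
        raise ValueError("sensitization index must be positive")
    if any(w < 0 for w in raw):
        raise ValueError("weights must be non-negative")
    powered = [w ** alpha for w in raw]
    total = sum(powered)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return [p / total for p in powered]

flat = sensitized_weights([1, 3], alpha=1)   # plain normalization
sharp = sensitized_weights([1, 3], alpha=2)  # the larger weight dominates more
```

Under this assumption the ratio between two standardized weights equals the ratio of the raw weights raised to the index, so an index above 1 sharpens individual dominance while an index below 1 flattens the weights toward a collective decision.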

Algorithm Establishment and Analysis
As shown in the figure, the preparation part repeatedly advances the sliding window ( , ), rejects windows with a poor fitting degree ( , ), and sensitizes the product of universality degree ( , , ) and fitting degree ( , , ), which is assigned to . The prediction part first checks whether the historical feature pattern set contains any record. Note in particular that the value of ( , ) in the sliding window, or its fitting degree with ( , ) in the occurred indication, cannot be too small, because the smaller this value, the poorer its contribution to the prediction valuê+ .

Evolution Algorithm.
An evolution algorithm is introduced to measure the prediction deviation in terms of pattern and accuracy; the prediction algorithm can then be fine-tuned to raise its adaptability.
The accuracy of adjusting to × can be derived by which is based on current weight set and (18)
As shown in Figure 2, the evolution algorithm carries on the variables and results of the prediction algorithm and works to promote adaptability after acquiring measured value. Δ is an adjustment variable for and meets − −1 ≤ Δ ≤ −1 . If rises or the distance between | | and ln drops, then the adjustment amplitude becomes lower; otherwise it becomes higher. If | | < ln , then decrease the threshold to soften the terms; otherwise increase the threshold. Δ is calculated  , then the prediction according to ( + , ) is accurate, and the value raising can be large. To determine , select the best one among value lowering by 5%, current value, and value rising by 5%, and restrict it by a reasonable range to prevent passivating or sharpening.
Figure 3 gives an example of predictinĝ (13,2) according to the historical feature pattern  Table 1. When = 5, the slope becomes larger at = 7 and smaller at = 12, which are reflected through ∇ 1 7 > 0 and ∇ 1 12 < 0 derived by (9), and the relative penalty value is recorded by ∇ (5, 10, 3). And through normalization process, the elements of set are   and so forth, and step trend can be predicted. It can be seen that this scheme has the ability to identify multiple long-range correlation contained in the same situation sequence.
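The selection rule stated above (try the value lowered by 5%, the current value, and the value raised by 5%, keep the best, and restrict it to a reasonable range) can be sketched as below. The symbol being tuned is missing from this text; the reference to passivating and sharpening suggests it is the sensitization index, but that is an assumption. The callback `error_for`, which returns the prediction error obtained with a candidate value, and the bounds `lower` and `upper` are hypothetical.

```python
def tune_parameter(current, error_for, lower=0.1, upper=10.0, step=0.05):
    """Pick the best of {current - 5%, current, current + 5%}, clamped to [lower, upper].

    error_for : hypothetical callback mapping a candidate value to its prediction error
    """
    candidates = [current * (1 - step), current, current * (1 + step)]
    clamped = [min(max(c, lower), upper) for c in candidates]
    return min(clamped, key=error_for)

# Toy error functions: the first prefers values near 2.0, the second prefers 1.0.
raised = tune_parameter(1.0, lambda a: abs(a - 2.0))
kept = tune_parameter(1.0, lambda a: abs(a - 1.0))
capped = tune_parameter(10.0, lambda a: -a, upper=10.0)
```

Because each call moves the value by at most 5%, repeated calls after each new measurement give the gradual fine-tuning the section describes, and the clamp keeps the weights from becoming either too flat or too peaked.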

This part is analyzed according to the evolution algorithm. Assuming that 13 = −0.86 and 14 = 0.00, so | | = 2, which is smaller than ln 13; thus, the value of needs to be lower, and once | | > 2, then raise the value of . It can be known that the changing value of universality degree [0, 3, 2] is 1.00 × 1.00, and raising this degree can strengthen the role of vector (0, 5). The changing value of [5,3,2] Table 2 shows that (5, 3, 2) becomes smaller, and becomes larger with continued evolution, which results in rapid rise of 0 / 5 , and approach between prediction value and measured value.
With the passage of time, and keep unchanged, grows linearly, and the algorithm can delete stale data, retain recent data, and correct the fitting threshold and universality degree. The above process can be completed not only by autonomous evolution but also by manually modifying parameters.

Experiment Results Analysis
The traditional indexes used to measure prediction accuracy include mean absolute error (MAE), standard deviation error (SDE), and mean absolute percentage error (MAPE) [21] derived by This section selects MAPE to obtain the relative error between prediction pattern and measured pattern, which is denoted by . The standard deviation of relative error components is denoted by std .
Figure 4 is a critical subsequence selected from actual network attack situation records, which includes various features such as ascent trend, saturation trend, decline trend, periodic fluctuation, and stochastic disturbance. According to the experimental prediction results, the relative errors of HFPE, ARMA, GM(1,1), and RBFNN are 3.28%, 5.89%, 7.18%, and 16.11%, respectively. As shown in Figure 5, for ARMA, GM(1,1), and RBFNN the cyclical situation fluctuations had to be identified manually and guarded against in the experiment: a difference transformation with a distance of 12 is applied and reversed after prediction to prevent poor prediction results; otherwise, the relative errors of GM(1,1) and RBFNN may reach 59.67% and 73.99%, respectively. However, this workaround is a special case and cannot be generalized, because the data preprocessing for these algorithms follows no universal rule. In contrast, HFPE adapts to complicated and changeable trends without data preprocessing or manual identification.
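The equations for the three error indexes named at the start of this section are missing from this text, so the sketch below uses their common textbook definitions as an assumption: MAE as the mean of absolute errors, SDE as the population standard deviation of the errors, and MAPE as the mean absolute error relative to the measured value, in percent. The function names are hypothetical.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def sde(actual, predicted):
    """Standard deviation of the errors (population form, an assumption)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mean = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))

def mape(actual, predicted):
    """Mean absolute percentage error; measured values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

measured = [100.0, 200.0]
forecast = [110.0, 180.0]
scores = (mae(measured, forecast), sde(measured, forecast), mape(measured, forecast))
```

MAPE is scale-free, which makes it a natural choice for comparing algorithms across subsequences with different value ranges, but it is undefined when a measured value is zero.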

Experiment 2.
In this experiment, subsequences with similar parts are chosen at random, the prediction is repeated 20 times, and the average value is calculated.
According to the experimental prediction results, the relative errors of HFPE, ARMA, GM(1,1), and RBFNN are 8.09%, 20.89%, 44.89%, and 34.75%, respectively. If the selected situation sequence does not follow any regular pattern, the relative errors are 3.96%, 21.72%, 37.47%, and 53.54%, respectively. Figure 6 shows one group of data, in which = 42 is the boundary between the historical feature pattern and the prediction feature pattern.
If all groups of historical feature patterns are put into one new long sequence and the above prediction is repeated, the performance of ARMA and GM(1,1) drops rapidly, while that of HFPE changes little, because a longer sequence containing more correlations benefits prediction.

Experiment 3.
To compare the four algorithms, random data are used to simulate situation sequences. First, random data with bits are extracted from the entropy pool of a Windows 7 system. Second, a subsequence of 16 bits is gathered at random, of which the first 8 bits are the occurred indication and the last 8 bits are the subsequent effect. Third, the occurred indication is spliced behind the random sequence to form a historical feature pattern, and the subsequent effect is treated as the prediction feature pattern. In total, 100 groups of experiments are run to test each algorithm's capacity to resist random interference and to identify long-distance correlations. The average results are listed in Table 3.
The table data show that HFPE performs best among the four algorithms, and this conclusion is reproduced well when the scale of the experiment is large. ARMA and RBFNN cannot deal with long random sequences, whereas HFPE handles them smoothly.
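The construction of one test group in this experiment can be sketched as follows. Several details are assumptions: the length of the random data is not given in this text, so `pool_len` is a hypothetical parameter; the sequence elements are taken as byte values because the unit in the description is ambiguous; and `os.urandom` (the operating system's random source) stands in for the Windows 7 entropy pool.

```python
import os
import random

def make_group(pool_len=256, half=8, rng=random):
    """Build one (history, effect) pair from operating-system random data.

    pool_len : assumed length of the random sequence
    half     : length of the occurred indication and of the subsequent effect
    """
    pool = list(os.urandom(pool_len))  # random values in 0..255
    start = rng.randrange(pool_len - 2 * half + 1)
    indication = pool[start:start + half]
    effect = pool[start + half:start + 2 * half]
    history = pool + indication  # indication spliced behind the random sequence
    return history, effect

history, effect = make_group(pool_len=64)
```

A predictor succeeds on such a group only if it finds the earlier copy of the indication somewhere inside the random data and replays what followed it, which is why the setup tests both resistance to random interference and long-distance correlation.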

Conclusion
This paper proposes a prediction method based on historical feature patterns, namely HFPE. The main principles of the algorithm are as follows. The fitting degree measures the similarity among subsequences in terms of pattern and accuracy. The universality degree tests how representative a subsequence and its extension are. The contrast of the weight system is adjusted by the sensitization index, which gives prominence to statistical effect in passivation