A Security Situation Prediction Algorithm Based on HMM in Mobile Network

1 School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China 2College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China 3Computer Science Department, University of California at Los Angeles, USA 4Fujian Key Laboratory of Automotive Electronics and Electric Drive (Fujian University of Technology), Fuzhou 350118, China 5Department of Computer Science and Information Engineering, Providence University, Taichung 43301, Taiwan


Introduction
With the rapid development and popularization of Internet technology, mobile network becomes an indispensable communication mean [1].Nevertheless, one of largest issues is security threat in mobile network [2][3][4].People has developed many protection techniques to cope with these threats, such as mobile firewall, intrusion detection, and virus killer.Though these techniques are active and useful for already occurred threats but cannot monitor the complete situation of the mobile network.In this case, prediction technique of mobile network security situation has emerged as a trend of network security monitoring, which can rapidly acquire, understand, and display the security elements and use them to predict the future situation [5].By analyzing the changes of mobile network security situation, network manager can predict the security situation and protect the network from illegal attacks.
In existing security situation prediction techniques, the warning is classified on the basis of the damage degree caused by network attacks and the risk level of the security vulnerability [6][7][8].In such a case, the threats in mobile network can only be evaluated by the feature of attacks.Other factors such as environment are not investigated.Snort [9] classified the priority of warning into three levels.The warning of priority 1 illustrates a threat with the highest level, usually including security vulnerability with high risk, such as buffer overflow.The risk level of security vulnerability is analyzed by professional institutes, such as CVE (Common Vulnerability Exposure) [10] and Bugtraq [11], and classified into high, medium, and low.The classification result can also be obtained from a more complete quantified evaluation mechanism of vulnerability risk, namely, CVSS (Common Vulnerability Scoring System) [12].
The threat level of mobile network attack is closely related to the running condition and the vulnerability of target computer system.A typical example is worm virus Code Red II targeting at Linux computer systems.It   vulnerability, since there is no similar vulnerability in Linux systems, and therefore, it is not a threat.Many researchers proposed to evaluate the threat level by cross correlation of the mobile network warning and vulnerability scanning tool.Kruegel et al. [13] realized correlation analysis of Snort warning and Nessus [14] (a vulnerability scanning tool) and used it to confirm the warning.The target vulnerability of the attack is compared to the scanned vulnerability result of the target computer in terms of the port, service, vulnerability ID, etc.The matching result is used to confirm or filter the warning.If the target vulnerability of the attack exists in the target computer, the probability of a successful attack is larger, as well higher the threat level.Otherwise, the threat level is low, and the warning may be a false alarm.The proposed method has fully considered the network running state for the attack, by better recognizing false alarm and real threats.Nevertheless, the drawback is that it cannot evaluate the threats not targeting at vulnerability, since the information of the target vulnerability should be stored.Besides, the instantaneity of the vulnerability scanning report should be also considered.
The topology of a mobile network is depicted in Figure 1."Mission" is the target service, port of the attack, which is determined by the manager.For example, HTTP service of 80 port on a Web server is a key Mission.The attack targeting at the Mission has the highest threat level.Porras et al. [15] calculated the probability of a successful attack using Bayesian network.Most of the SIM (Security Information Management) software provides a method to determine the warning level by making a security strategy.It is convenient for the manager to recognize the interesting attacks by custom rules and reducing the threat level of some of attacks.

Related Work
With the rapid development of mobile network, security situation prediction becomes an important network security technique, which plays a significant role in defending 802.11 wireless network threats.Due to its intrinsic features, the risk detection system can operate normally in wireless mobile network.There are increasing security threats of communication devices in mobile network, and therefore, it is urgent to evaluate and predict the security situation of mobile network.The network security report published by Chinese government in 2017 shows several security issues in mobile network.The Shadow Brokers leaked many 0-day vulnerabilities.There are serious security vulnerabilities in Bluetooth protocol, and a security research company in Internet-of-Thing (IoT) Armis has identified eight 0-day vulnerabilities in Bluetooth protocol.The company creates a group of network attack vectors for demonstration, where attackers can completely host the Bluetooth-supported network devices.Malicious attack software is propagated through these devices.A manin-the-middle (MITM) connection is created, which makes lots of home cameras be intruded by Trojan, due to the fact that many intelligent cameras have weak password or vulnerabilities, which turns the main reason of camera intrusion.Security issues have arisen the concerns of government security departments in various countries, and a new generation of network security assessment prediction model emerges under this situation.The situation assessment of network communication security turned to be a hotspot in researches of network security.
In recent years, network intrusion detection systems have been replaced by security situation prediction techniques.Markov model has the nonaftereffect property and only related to the probability of its previous state.It is suitable to predict the random process which largely varies, as the future state is predicted by using the transfer probability matrix between states.The calculation of the transfer probability matrix uses statistical method and the prediction is to use the frequency to approximate the probability function.In the security prediction techniques for mobile network, Shake et al. proposed a network vulnerability evaluation method based on the electric current theory [16].Bass created a network situation framework using data fusion of multiple sensors.The network security situation is evaluated by deriving the identity of intruder, speed, threat level, and intrusion object [17].There are various network attacks, which pretend to be normal data packet and perform attacks on the target computer, despite more difficult to detect the attacks.So, most of network attacks cannot be detected only be matching the features of data packet format.Meanwhile, the current detection techniques should unpack the packet, causing detection delay.
The intrusion techniques have been rapidly developed to cope with the increasing attacks in mobile network.To improve the detection accuracy, the commercial network intrusion detection system [18,19] abstracted the characteristics of network attack using multiple algorithm and created a misuse detection model of network attacks.The network intrusion detection system based on misuse detection has high detection efficiency and low probability of false alarm.Since the detection system requires a real-time updated attack characteristic library, the characteristics of most network attacks are stored.If a new attack emerges, the detection system cannot detect it successfully since there is no information about the attack in the library.Therefore, the misuse-based detection model has higher false negative rate.
There are many situation awareness models for mobile network.Endsley [20] is the most concerned model that defines situation awareness as perception, by including the current state and future trend of the element in a space during a time period.Endsley model divides situation awareness into three parts: (1) Perception: it acquires and collects important information in network, which is the fundamental step of situation awareness model, (2) Comprehension: the collected data is integrated and analyzed and then used to evaluate the current situation, and (3) Prediction: the future trend is predicted on the basis of the results in (1) and (2).The data is collected from the management device, monitoring device and security device in the network.The collected original data is further integrated and processed to evaluate the current network security situation.On this basis, the future situation can be also predicted.
Due to its good performance in statistics, HMM (Hidden Markov Model) [21][22][23][24][25] technique is rapidly developed and applied in fields as voice recognition, classification, security situation prediction, intrusion detection, etc.In the field of security situation prediction, Hisham [26] proposed the first HMM model with finite state to predict the multistep attack in cloud computing system.The probability of potential attack can be calculated through an adaptive risk model, whereas such potential attack can be found through the autonomous cloud intrusion detection systems.In this case, the countermeasure can be adopted in advance.This method successfully predicts the alarm on the dataset DARPA2000 LLDDOS1.0.Luktarhan [27] proposed a HMM-based false alarm filtering algorithm that can detect the multistep attack and achieve higher detection accuracy rate.Experiments of this algorithm on DARPA2000 show good performance.As mentioned, the HMM technique usually uses Viterbi algorithm although it is applied to predict the hidden state on the basis of the observed state.Additionally, the precondition of multistep attack prediction should know all steps of multistep attack.The prediction is the upcoming step, yet not the future network security situation.
Multiscale entropy is a nonlinear characteristic analysis method.In 2002, Costa et al. [28,29] proposed the theory of multiscale entropy on the basis of sample entropy and applied it in HRV research.Results show that the multiscale entropy can give a more reasonable explanation on the differences of CHF and AF between disease and health than the sample entropy.MSE (Mean Square Error) and RMSE (Root Mean Square Error) are metrics to evaluate the complexity and irregularity of time series with different scale factors.It is used in classification of physiological time series [30], eccentricity fault detection of motor spindle [31], dynamical change of crude oil price [32], and network traffic analysis and anomaly detection [33].In this work, the multiscale entropy with different scale factors is used to select the scale information to process the alarm time series.HMM is used to train the security state parameters, whereas the prediction is realized by weighting the value of autocorrelation coefficient between security situations.
The remaining of this paper is organized as follows.Section 2 introduces the multiscale entropy based Markov model; the details of the prediction algorithm are illustrated in Section 3. The experimental results and analysis are presented in Section 4, and finally, summary and future work are depicted in Section 5.

Multiscale Entropy Analysis.
In mobile network, the complexity of time sequence with limit length is usually measured by nonlinear characteristic analysis.The entropy values for different scale factors are used to select suitable scale and process next the time sequence, and the sequence is further trained by the HMM.The acquired data is one-dimensional alarm time sequence.The multiscale entropy not only can be used to characterize the complexity of the one-dimensional sequence, but also can display the detailed characteristics of one-dimensional alarm sequence from different scales.Therefore, it can be used to analyze the complexity of onedimensional alarm sequence.For the alarm time sequence, N is the length of the sequence, and the calculation process of the multiscale entropy is described in this section.
Firstly, the coarse-grained sequence is generated.The original sequence is divided into nonoverlapping window with length s.The sequence values in the window are averaged to obtain the coarse grain sequence {y  } on the s scale.Each element of sequence    can be obtained via the following formula: The length of each coarse-grained sequence is the value divided by the original length s.Here, the time sequence of s = 1 is the original sequence.Secondly, the sample entropy is calculated.For coarsegrained sequence {y(1), y(2), . . ., y(M)}, M is the length of the Wireless Communications and Mobile Computing sequence.Detailed steps to calculate the sample entropy is described next.
Step 1.The given model dimension is m, and the m dimensional vector is composed by the original sequence.
Step 3. Given a threshold value r, for each  statistic value, we have the ratio of the number (, ) <  and the number of the total number N − m + 1 is denoted as B m i ().
where 1 ≤ j ≤ N − m, j ̸ = i, seeking its average value for all i.
Step 5.In theory, the sequence sample entropy is as As N takes a finite value, sample entropy for sequence of length n can be estimated by Step 6.With the sample entropy formula, each scale factor sequence can be calculated.A function with scale factor s as argument and with sample entropy as dependent variable can be created.Thus, it can be used to analyze the complex nature of the alarm time sequence.
It is obvious that the multiscale entropy is related to the scale factor s, the embedding dimension m, and the similarity coefficient r.In this paper, parameters m = 2 and r = 0.2 are set.
With the above method, multiscale entropy of onedimensional alarm time sequence can be calculated.This result is used in next-step data processing and brought in HMM for training.Finally, the corresponding scale state transition probability matrix can be obtained.

HMM Definition.
HMM is a statistical learning model in mobile network, which can describe the process of generating observed sequence.Developed on the basis of Markov chain, HMM is a double random process, including the transfer from state to state and the transfer from state to output.In this work, the transfer probability matrix between states is calculated by HMM.There is a Markov assumption of state transfer, of which the probability of state at time t shifting to state at time t + 1 is only related to the state at time t and not related to any other states.Nevertheless, the assumption is not reasonable in practice since this probability depends on the current state and the historical states.To address this issue, this work has improved the Markov assumption of the state transfer probability in traditional HMM model.A new model is realized by weighting and used to predict the future security situation of the network.
The HMM includes two random processes, transition between states and transition between state to output.Markov chain prediction for state transition at hidden layer utilizes the idea of weighted Markov chain prediction in [34].Namely, the weight probability distribution by using multistep situation transition probability is used to predict the system's next status.The correlation coefficient in each order reflects the strength of relationship between time sequences with different step.Analysis between transition probability and time sequence makes full use of known conditions for prediction.On this basis, the proposed prediction model is described as follows.
Step 1. Calculate correlation coefficients in various steps: where k represents k-order (with step length of k) correlation coefficient, x t represents the security situation at time t, x is the mean value of the security situation, and n is sequence length of security situation value.
Step 2. Normalize the self-correlation coefficient of each order: k is regarded as the normalized weight of the Markov chain with various steps (M is the largest order in prediction).
Step 3. Establish security situation grading standards (to determine hidden state space).The security situation is divided into four states [35]: (i) Good (G) indicates the system is in good condition with no attack.
(ii) Probed (P) is in the state of being attacked.Under this state, the availability of a host is slightly decreased, and the probability of being attacked increases.
(iii) Attacked (A) indicates that the host has been attacked and the probability of being invaded increases.(iv) Compromised (C) shows that the host has been invaded and is in a dangerous state.The confidentiality, integrity and availability of the host are destroyed.
The four states are, respectively, denoted by G, P, A, and C, so S = {G, P, A, C}.The transition between the host states constitutes a complete Markov chains, as shown in Figure 2.
Step 4. According to the security situation grade and alarm information, we sample security situation with different step lengths via HMM.Samples are trained to generate the transition probability matrix of different steps, determining as well the probability laws of security situation transition.
Step 5. On the basis of current state of the security situation, we can predict the probability of security status in the next time by using the security state transition matrix.
Step 6. Use the weight of prediction probability with different steps for a state as probability of secure state appeared in the next moment.
Step 7.With the obtained value of security state, the network situation can be analyzed.
Step 8. Repeat above Steps 4 to 7 to calculate the situation result under the scale factor and perform comparison to verify the proposed algorithm.Step 1. DARPA2000 dataset is processed by Snort, yielding original alarm time sequence.

Algorithm Implementation
Step 2. Time sequence is processed by multiscale entropy method.It generates the entropy value of different scale data.Two data sets with the maximum and the minimum entropy values are selected as the input of the next step.
Step 3. Selected data sets are used to generate the multiorder state sequence matrix, which are regarded as the parameter to train the original data.
Step 4. Multiorder alarm time sequence matrix is trained using HMM, yielding the multiorder state transition probability matrix.
Step 5.The weighting algorithm is used to calculate the autocorrelation coefficient of each order of the time sequence.
Step 6.The weighted prediction algorithm is used to calculate the network security state at next time for each scale factor.
Step 7. The prediction result is used as the input the prediction iteration to predict the next security state.
Step 8.The difference between the prediction result and the actual value is evaluated by the MSE.Besides, the prediction results for different scale factors are also compared.

Data Preprocessing.
In the mobile network, the DARPA2000 dataset is collected using Snort.The alarm type is selected as the original time sequence.Based on the calculation results of multiscale entropy, nine data sets with the scale factors from 2 to 10 are used in the experiments.For each scale factor, we divide the original data on the basis of scale, yielding the multiorder data matrixes.These matrixes If (the ratio of srcIP > ) (9) Return <signature, srcIP, any> (10) Endfor (11) Foreach dstIP in D (12) If (the ratio of dstIP > ) (13) Return <signature,any, srcIP> (14) Endfor ( 15) endif (16) Endfor } Algorithm 1: The extraction of alarm stream.will be further trained by HMM, and the corresponding state transition probability matrixes are generated next.

Implementation of Prediction Algorithm.
The distribution of various attributes of alarm in mobile network is not uniform.By analysis, there are lots of alarm streams generated by a few attributes.The alarm stream is represented by triple <Signature, src IP, dst IP>, where "Signature" is the rule to generate the alarm and "src IP" and "dst IP" are, respectively, original IP address and the destination IP address.First, the one-day alarms of intrusion detection system in mobile network are collected, as the ratio threshold  of "Signature" to generate the alarm and the ratio threshold  of IP to generate alarm are determined.If the ratio threshold of "Signature" exceeds the value of , the corresponding src IP and dst IP should be recorded.Moreover, if the ratio threshold of IP to generate alarm exceeds the value of , the triple will be further analyzed as shown in Algorithm 1.
The returned triple in algorithm will be recorded each hour.The alarm sequence of a certain triple can be denoted by a random process {x(t) : t ∈ N}, where x(t) is the number of alarms generated by the triple within t hours.The period of the sequence will be further analyzed with Algorithm 2.

Results and Discussion
5.1.Dataset.DARPA2000 is an offline intrusion detection dataset, available and maintained by MIT's Lincoln Laboratory [36].It has become a standard dataset to verify effectiveness of intrusion alert correlation algorithm, scene construction algorithm, and trend prediction algorithm.
DARPA2000 dataset includes all monitored data package in demilitarized zone (DMZ) and the internal (INSIDE) by using TCPDUMP.The dataset contains instances in attack scenarios of LLDOS 1.0 and LLDOS2.0.2.In LLDOS1.0 scenario, the attacker captures and controls 3 hosts in Eyrie airbase via vulnerability of Solaris Sadmind, uploads Mstream tool, and performs DDOS attack on a government website.Different with LLDOS 1.0, LLDOS2.0.2 utilizes a more concealed method to seek host with vulnerability and setup Mstream.
In this research, we make use of DARPA2000 dataset to verify the effectiveness of the proposed algorithm.It contains two multistep DDOS attack scenarios.There are lots of data, redundancies, and false alarms in both scenarios.To verify the effectiveness of the method, we just take attack alarm data in five steps to complete the experiments: the attack scene time is about three hours to LLDDOS1.0 and about 1.5 hours for LLDDOS2.0.

Calculation of Multiscale Entropy.
With corresponding formula, the original alarm time sequence can be calculated, and the scale factor is selected from 2 to 20.The histogram of entropy values is depicted in Figure 3.
In this figure, samples of entropy values change with the scale factor, where the minimum and the maximum values are obtained when the scale factors are 2 and 10, respectively.The value of entropy reflects complexity of time sequence and frequent changes of alarm data.Thus, the security situation can be easily reflected by entropy value.We assume that the entropy value has effect on prediction results, whose assumption will be further proven in experiments that follow next.

Weighted Prediction Performance.
After calculating the formulas ( 5) and ( 6), the autocorrelation coefficients of the training data are obtained and normalized, as shown in Table 1.
The normalized autocorrelation coefficients are calculated as the weighted values, and the results are compared with the calculated results.The maximum probability of state is used as the prediction result at the next time.
We added the multiscale entropy method on the basis of method introduced in [35].The overall changes of network security situation cannot be clearly observed due to dramatic changes, and therefore, we introduce Bezier curve to describe the overall trend of network security situation.Advantages using Bessel curve include smoothing the change curve of network security situation.The actual security situation prediction graph for LLDDOS2.0 is shown in Figure 4.
As depicted in this figure, the network security situation value is not high at the beginning.During this period, the network is attacked by Ipsweep scanning and the host is probed by Sadmind Ping.After 3:00 hours, the trend of value decreases during a period, which shows that the attacker has completely controlled the host.The rising trend reflects the attackers' launch DDOS attacks; the decline shown next reflects that the attack is gradually weakened.As a result, the trend curve is more vivid to describe the trend of the network security situation.It is helpful for administrators to manage the network security situation.After the preprocessing of the data, the multiorder state transition probability matrix is obtained.As listed in Tables 2 and 3 the state transition probability matrix when the scale factors are 2 and 10.
As the original data is preprocessed to generate the matrix of multiorder state transition probability with different scales,  the scale factor as 10 and the order as 4 are chosen for evaluation.The prediction results are listed in Table 4.
In Table 4, C is shown to be the secure state at the next time with the largest probability, representing that the host is intruded under a dangerous state.The confidentiality, integrity, and availability are damaged.In this case, corresponding measures should be adopted and result in the fact that Table 4 can be used as the input of the next iteration for future prediction.
The probability of each state at the next moment of time sequence is obtained through this proposed method.The value of the probability is used to determine the status of the security situation.Figure 5 shows the actual value and the predictive value of the different scale factors in the time period from 2:51 to 4:25 hours.The horizontal axis represents time and the vertical axis represents the security situation.The MSE is introduced to illustrate the different scaling factors of prediction and the difference between actual and predictive value.The multiscale entropy and the MSE values in one-step prediction are listed in Table 5.
The different scale factor corresponding to RMSE and MSE in one-step prediction is shown in Table 5.As the scale factor is 2, it gets smaller entropy and larger RMSE, and with the scale factor 10, the results are opposite ones.Therefore,  6 and 7 and Figures 6 and 7.
The prediction curves show the predictive values are close to actual values, what demonstrates the effectiveness of the proposed weighted HMM algorithm.At the same time, it is possible to note that the change on prediction steps will not affect the effectiveness of multiscale factor predicted results.

Conclusions
Current HMM-based prediction algorithms predict the unknown state via a known state.It has limitations and the prediction accuracy is not high enough.To address this issue,   this work presents the application of the weighted HMM model to predict the security situation of mobile networks, where the multiscale entropy method is used to select the appropriate scale factor of the data.It will be regarded as training data for HMM model to obtain the state transfer probability matrix, whilst the correlation coefficients are considered as weights to predict changes of network security situation.In the proposed model, we use the alarm time sequence as original data, and with the multiscale entropy method, an appropriate scale factor is selected as training data to train the parameters in HMM model.Finally, the predictive results are evaluated and compared to real security situation.Experimental results have shown that the proposed method is accurate and effective.Weighted method can not only use current network state, but also consider the historical security situation.The dependent relationship between the security situations is fully used.It could predict network security at the next state.The network security administrator could take measures to maintain network security and avoid more attacks in advance.
Based on the prediction issues in mobile network, several security situation prediction algorithms are analyzed as follows: (1) there is causal relationship between attack behaviors, (2) the probability to occur different attacks is different, (3) the evidence of future attacks includes the attack itself, (4) the purpose of attack can be inferred, and (5) the evidence of future attacks has relationship with the change of network security situation.These features provide a thought to predict the security situation by extracting evidence.Besides, other aspects can be also researched, such as real-time data extraction and data fusion of multisource heterogeneous sensors.

Figure 1 :
Figure 1: Topology structure of mobile network.

Figure 2 :
Figure 2: The state transition graph of Markov model.

Figure 3 :
Figure 3: Multiscale entropy histogram sequence of the original data of the alarm time.

Figure 4 :
Figure 4: The actual value of network security situation.
has extremely low threat level since it is designed on the basis of Windows IIS 2 Wireless Communications and Mobile Computing Therefore, the Markov model is suitable for security prediction of large data sample in mobile network communication.A typical Markov process includes Bernoulli process, Wiener process, and Poisson process.Based on continuous or discrete state and time parameter, Markov process can be classified into three categories: (1) discrete time discrete state Markov process, called Markov chain, (2) continuous time discrete state Markov process, called continuous time Markov chain, and (3) continuous time continuous state Markov process.
The multiorder state transition probability matrix is used to calculate the security states with different scale at next time.The next security state is iteratively predicted using the obtained result, and differences between prediction value and the actual value are evaluated by MSE and RMSE.Besides, the prediction results for difference scale factors are also compared.Comparison shows the effect of multiscale entropy on weighted HMM prediction, and steps are detailed next.
work.Changes of future security situation in mobile network are calculated by the weighting algorithm, of which entropy values of different scale data are obtained using the multiscale entropy.Two sets of data, respectively, with the maximum and the minimum entropy value which are processed to generate the multiorder state sequence matrixes are trained using HMM.

Table 1 :
The autocorrelation coefficients training data and the normalized results.

Table 2 :
The state transition probability matrix when scale factor =2.

Table 3 :
The state transition probability matrix when scale factor =10.

Table 4 :
Results of security situation prediction at next time.

Table 5 :
The multiscale factor and the MSE /RMSE values in one-step prediction.

Table 6 :
The multiscale factor and the MSE /RMSE values in two-step prediction.

Table 7 :
The multiscale factor and the MSE /RMSE values in three-step prediction.Weighted prediction method can be used to predict security state at the next moment.It can also predict the results of iterations to obtain the future network security conditions within period.The prediction results and root mean square error values with different scale factors and different steps are given in Tables