A Novel Algorithm for Intrusion Detection Based on RASL Model Checking

The interval temporal logic (ITL) model checking (MC) technique enhances the power of intrusion detection systems (IDSs) to detect concurrent attacks, owing to the strong expressive power of ITL. However, an ITL formula cannot easily describe the time constraints between different actions in the same attack. To address this problem, we formalize a novel real-time interval temporal logic, real-time attack signature logic (RASL). Based on this new logic, we put forward a RASL model checking algorithm. Furthermore, we use RASL formulas to describe attack signatures and employ discrete timed automata to model the audit log. As a result, the RASL model checking algorithm can be used to automatically verify whether the automata satisfy the formulas, that is, whether the audit log coincides with the attack signatures. The simulation experiments show that the new approach effectively enhances the detection power of the MC-based intrusion detection methods for a number of telnet attacks, p-trace attacks, and sixteen other types of attacks. These experiments also indicate that the new algorithm can find several types of real-time attacks, whereas the existing MC-based intrusion detection approaches cannot.


Introduction
Intrusion detection (ID) is an important network security technique. According to its underlying principle, ID can be divided into anomaly intrusion detection and misuse intrusion detection. The former can find unknown types of attacks. However, the false positive rate of anomaly intrusion detection is often very high. In contrast, a misuse intrusion detection system has a comparatively low false positive rate with regard to known types of attacks. This is due to the principle of misuse intrusion detection: IDS developers predefine the known types of attacks, use an appropriate language to describe these types, and establish libraries of attack patterns (called misuse signatures). The system then monitors the audit log. Once a data stream in the log is found to match a certain attack type, an attack has been found.
However, this class of detection methods, which is based on pattern matching (PM), suffers from two inherent problems.
First, affected by intruders' subjective wishes or other random factors, the logical relationships among the atomic actions of attacks that follow the same pattern but are launched by different intruders may present different features [1,2], where an atomic action means a minimum operation step in an attack. It is hard to depict such widely varying attacks precisely with a relatively small-scale attack pattern library. Second, a large-scale coordinated attack requires an intrusion detection algorithm to handle a large volume of network data in a short period of time. To address these issues, a series of intrusion detection methods based on model checking have been developed.
A relatively comprehensive algorithm based on linear temporal logic (LTL) model checking has been presented in [1]. Its basic principle can be formulated as follows: (1) use an LTL formula to describe an attack pattern and an automaton to record what happened in the audit log, and (2) use a model checking algorithm to check whether the automaton satisfies the formula (i.e., whether the records in the log match the attack pattern). Since current model checking algorithms are able to check up to 10^120 states, they are particularly suitable for large-scale attack detection [3], and the operators in LTL formulas can flexibly describe various logical relationships between atomic attack actions.
Compared with the PM-based approaches, the MC-based ones can effectively portray the ever-changing attack patterns [1,3]. Furthermore, the MC-based approaches have an important advantage for intrusion detection over the PM-based ones. Pattern matching is usually applicable to detecting inconsistencies between data, while automata, temporal logic formulas, and model checking techniques are applicable to detecting inconsistencies between behaviors. Thus, the MC-based methods can do more than the PM-based ones, since intrusion attacks involve complex behaviors in addition to comparatively simple data.
However, the algorithm in [1] can automatically detect neither concurrent attacks nor real-time (i.e., time constraint relation) attacks, because LTL formulas cannot describe multiprocess activities or time constraint relationships between attack actions or attack action sequences. As the first attempt to address these issues, a method based on ITL model checking was presented in [2]; it can describe and detect concurrent attacks, since ITL has more expressive power than LTL. However, ITL-model-checking-based methods still cannot describe and detect real-time attacks. For example, a large number of attacks in real network intrusions have the following characteristic: no more than n seconds after action (sequence/process) A occurs, action (sequence/process) B occurs. Here, the condition "no more than" can be replaced by "more than", "less than", "no less than", or "equal to". The existing MC-based algorithms cannot find these attacks.
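As an illustration of the timing pattern just described ("no more than n seconds after A occurs, B occurs", together with its four variants), the following minimal Python sketch scans a timestamped event list directly. It is not the RASL model checking algorithm; the log format (a time-ordered list of (timestamp, action) pairs) and the names RELATIONS and timed_pair_matches are assumptions made only for this example.

```python
import operator

# Hypothetical mapping from the five timing conditions to comparisons.
RELATIONS = {
    "no more than": operator.le,
    "more than": operator.gt,
    "less than": operator.lt,
    "no less than": operator.ge,
    "equal to": operator.eq,
}


def timed_pair_matches(log, a, b, n, relation="no more than"):
    """Return True if some occurrence of action b appears later in the log
    than some occurrence of action a, with a delay d (in seconds) such that
    `d <relation> n` holds.

    `log` is assumed to be a list of (timestamp, action) pairs sorted by time.
    """
    compare = RELATIONS[relation]
    for i, (time_a, action_a) in enumerate(log):
        if action_a != a:
            continue
        # Only entries recorded after this occurrence of a are candidates.
        for time_b, action_b in log[i + 1:]:
            if action_b == b and compare(time_b - time_a, n):
                return True
    return False
```

For instance, with the log [(0, "A"), (3, "B")], the call timed_pair_matches(log, "A", "B", 5) holds because B follows A after 3 seconds, whereas the same call with relation="more than" does not.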
Therefore, motivated by the need to address both concurrent attacks and real-time attacks, we present in this paper a new interval temporal logic for conveniently describing real-time attack signatures, and we put forward a new MC-based approach to automatically detect the various changing modes of real-time attacks.
We conducted some simulation experiments and a benchmark test (see Section 7). The detection of several groups of attacks, such as telnet attacks and p-trace attacks, is simulated in MATLAB. The experimental results verify that (1) the new algorithm finds more attacks than the existing MC-based algorithms and (2) the new algorithm finds real-time attacks. This is the main contribution of this paper.
The remainder of this paper is organized as follows. Section 2 reviews related work and compares it with the new approach. Section 3 defines a new logic, RASL, and gives its formal syntax and semantics. Section 4 uses RASL formulas to establish models for attack patterns. Several examples of models are given in Section 5. Section 6 formalizes a RASL model checking algorithm based on a new data structure called timed normal form graph (TNFG) and presents a misuse intrusion detection algorithm. Section 7 presents several groups of experiments and compares the new algorithm with the existing ones with regard to their description and detection capabilities for intrusion attacks. Section 8 draws the conclusions of this paper.

Detect Various Attack Types Using Model Checking Linear Temporal Logics. A tool called ORCHIDS was developed [3], which put the LTL-model-checking-based method for intrusion detection [1] into practice. In one experiment, ORCHIDS found some p-trace attacks [4], which usually exploit flaws in process calls to inject malicious code. It is difficult for traditional intrusion detection systems to find this type of attack because they only match individual events [4]. ORCHIDS was improved in [5]. In a real environment, it successfully detected a series of wireless network attacks [5], including deauthentication flooding, rogue access points, and Chop-Chop. This is the first IDS to successfully detect Chop-Chop attacks [5]. Furthermore, to avoid the repeated verifications needed by the algorithm in [1], an improved algorithm was put forward in [6], which is able to compute the number of guesses in password attacks.
Compared with the methods mentioned above, the new algorithm can be used to detect complex concurrent attacks and real-time attacks (see Section 7).
There are some studies that use interval temporal logics to describe attack patterns so that more intrusion behaviors can be expressed [15-17]. However, these papers do not mention how to detect these attacks automatically. The method presented in [2] can do so automatically, but it can only find concurrent attacks rather than real-time attacks.
In contrast, as a real-time interval logic, RASL has more expressive power (see Figure 1(b)), which can be used to describe the time relationships among attack activities, and our model checking algorithm can find real-time intrusion attacks in a fully automatic manner (see Section 7).
Figure 2: Different relationships between behaviors in an attack.

RASL
Definitions 1 and 2 give the formal description of the syntax of RASL, whereas the other definitions present its semantics.
Compared with ITL [11,18,19], the additional operator denoted as ";  " in RASL is appended for the description of time constraints between intervals.

Construct Signatures with RASL Formulas
We can use RASL formulas to construct signatures, that is, specifications of attack patterns. Compared with linear temporal logic, RASL is additionally equipped with interval semantics. So, a phase of an attack, that is, a sequence of atomic actions, can be described with an interval in a RASL formula, while the various steps in the phase can be described with various points in the interval [2]. Temporal relationships between steps in an attack can be described with temporal operators. Logical relationships between various phases can be described with the operator ";" [2]. A concurrent attack can be described with a formula containing the operator "‖". Compared with ITL, RASL can express more. In particular, repeated attacks can be described with the operator "*" or "Θ", and a time constraint between phases or steps in an attack can be described with the operator ";  ". Table 1 presents how to construct formal models for intrusion attacks with RASL formulas, and Figure 2 illustrates the sequential, concurrent, and time relationships between behaviors in an attack.
Definition 8 (see [1]). A record in a log library is modeled by a finite state automaton A.

Theorem 9. A record of a log can be modeled by a timed automaton A′.
Proof. According to Definition 8, a record of a log can be modeled by a finite state automaton A. To every transition of A, we add the time constraint "true". We extend every state of A to a pair consisting of the state and a value that denotes absolute time. In this way, the finite state automaton A is turned into the timed automaton A′. The theorem holds.
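The construction used in the proof can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the classes FiniteAutomaton and TimedAutomaton, the function to_timed, and the timestamps dictionary (one absolute time per state, for example the time of the corresponding log entry) are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiniteAutomaton:
    states: tuple        # state names
    initial: str         # name of the initial state
    transitions: tuple   # (source, action, target) triples


@dataclass(frozen=True)
class TimedAutomaton:
    states: tuple        # (state, absolute_time) pairs
    initial: tuple       # the initial (state, absolute_time) pair
    transitions: tuple   # (source, action, guard, target) tuples


def to_timed(fsa, timestamps):
    """Lift a finite automaton to a timed one, following the proof idea:
    every transition receives the time constraint "true", and every state
    is paired with an absolute time taken from `timestamps` (assumed input).
    """
    timed_state = {s: (s, timestamps[s]) for s in fsa.states}
    transitions = tuple(
        (timed_state[src], action, "true", timed_state[dst])
        for src, action, dst in fsa.transitions
    )
    return TimedAutomaton(
        states=tuple(timed_state[s] for s in fsa.states),
        initial=timed_state[fsa.initial],
        transitions=transitions,
    )
```

Because every guard is the trivial constraint "true", the timed automaton accepts exactly the runs of the original automaton; the added time component only records when each state was reached.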

A Case Study
As a case study, we discuss several examples to show the expressive capability of the above proposed models.
Example 10. Password cracking inconsecutive attack: failure. The RASL formula is where connect means that an intruder is trying to connect. The intruder could launch another concurrent process before the end of the current connection process. Thus, the subinterval that describes the current execution of the concurrent connection process is over, and it can be described with operators before . The subinterval that describes the result is over when this connection process fails, and it can be described with the operator  after . The intruder repeatedly tries to connect, which can be described with "*". The inconsecutive phenomenon between connections can be described with "; ".
Example 11. Password cracking inconsecutive attack: success after connection failed  − 1 times. At first, a single connection failure can be described as   := ( ∧   ); . Then, a successful trial can be described as   :=  ∧   s.
The formula that describes the attack can be defined as   As shown in Figure 3, the definition of    is illustrated, where  = 4, that is,    denotes three connection failures. As shown in (a), (b), and (c) of Figure 3, there are three cases for the length of interval ( 1 ,  2 ,  3 )   0 in the RASL formula. In each of the  − 1 connection failures, there exists a one-to-one map between attack actions and their results. That is to say, the number of   s which describe attack actions is equal to the number of (fail)s that describe their results. This number is three, so only (c) of Figure 3 is correct. To this end, we can append atomic proposition  to the formula, and let  follow   Θ . Furthermore, we can append atomic proposition  to the formula when subinterval  * is over. The number of   s is equal to the number of (fail)s if ◻( ↔ ) holds, as shown in (c) of Figure 3.
Subinterval  * is executed repeatedly  − 1 times to guarantee  − 1 cycles of   , as shown in Figure 4. We need two states in the current subinterval  to make sure that the first state of the next subinterval  is the next state of the final state of the current subinterval. So, we replace  * with ( ∧ ) * .
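The behavior that Example 11 specifies, a success preceded by n − 1 failed connections with a one-to-one pairing between each connection attempt and its result, can be illustrated by a direct scan of an event list. The Python sketch below is only an illustration of the pattern and does not implement the RASL formula or its model checking; the event labels "connect", "fail", and "success" and the function name cracked_after_failures are assumptions.

```python
def cracked_after_failures(events, n):
    """Return True if the first successful connection in `events` is
    preceded by exactly n - 1 failed connections.

    Each "fail" or "success" is counted only when it answers a pending
    "connect", which mirrors the one-to-one pairing between attack actions
    and their results. Other events may be interleaved (the inconsecutive
    case) and are ignored.
    """
    failures = 0
    pending = False  # True between a "connect" and its result
    for event in events:
        if event == "connect":
            pending = True
        elif event == "fail" and pending:
            failures += 1
            pending = False
        elif event == "success" and pending:
            return failures == n - 1
    return False
```

With n = 4, as in Figure 3, the function accepts a log that contains three connect/fail pairs followed by a connect/success pair, regardless of unrelated events in between.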
Example 12. The phases of a telnet attack are observed as follows.
Phase 1: the telnet service is started, and it is described as atomic formula .
Phase 2: the intruder closes the firewall. There are three steps in this phase. At first, the intruder accesses C:\windows in order to find program . It is described as RASL atomic formula  1 . Then, the intruder executes command -[PID] and monitors all processes in order to find the PID of the firewall process. It is described as RASL atomic formula  2 . At last, the intruder executes command -[PID] to close the firewall. It is described as RASL atomic formula  3 . The intruder performs the three steps of this phase in sequence with a gap between each step. Each of the two delays is less than  seconds, and it is described as ;  1 ∈[0,) .
Phase 3: in order to log in to the system again in the future, the intruder makes a backdoor. There are two steps in this phase. The first step is to access the directory in which file instsrv.exe exists, and step 2 is to execute command V.- : \ \ 32 \  ln V. in order to set up  service which is a backdoor. The former can be denoted as a RASL atomic formula  1 , and the latter can be denoted as a RASL atomic formula  2 . The intruder performs the two steps of this phase in sequence with a gap between them. The delay is less than m seconds, and it is described as
In summary, the timed formula for the telnet attack is formalized as follows:
In Formula (3), ";" is used to express a piecewise action, "‖" is used to express a concurrent action, and ";  " is used to express a time constraint relationship.
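The timing requirement inside a phase of Example 12, where the steps occur in sequence and every delay between consecutive steps lies in a half-open interval starting at 0, can be illustrated by the following Python sketch. It checks the pattern by a direct search over a timestamped log rather than by RASL model checking; the log format, the step labels used in the usage note, and the function name chain_matches are assumptions introduced for this example.

```python
def chain_matches(log, steps, max_gap):
    """Return True if the actions in `steps` occur in order in `log` and
    every delay between consecutive matched steps lies in [0, max_gap).

    `log` is assumed to be a list of (timestamp, action) pairs sorted by time.
    """
    def search(start, k, prev_time):
        # All steps matched: the timed chain is present in the log.
        if k == len(steps):
            return True
        for i in range(start, len(log)):
            time, action = log[i]
            if action != steps[k]:
                continue
            # The first step has no predecessor, so it carries no delay bound.
            if prev_time is None or 0 <= time - prev_time < max_gap:
                if search(i + 1, k + 1, time):
                    return True
        return False

    return search(0, 0, None)
```

For the three steps of Phase 2, a log such as [(0, "p1"), (4, "p2"), (8, "p3")] satisfies the chain for max_gap = 5 but not for max_gap = 4, because each delay is exactly 4 seconds. The search backtracks over alternative occurrences of a step, so an early occurrence that violates the bound does not hide a later valid one.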

RASL Model Checking Algorithm and Intrusion Detection Algorithm
We can give a subset of RASL called ASL, which is obtained by deleting all of the time constraints in RASL. Reference [18] gives a data structure called normal form graph (NFG) as well as a procedure called PRO(P) to construct the NFG model denoted as   for an ASL formula . Thus, an ASL model checking algorithm was obtained in [18]. Based on this work, we can obtain a RASL model checking algorithm and its intrusion detection algorithm. First, Definition 13 presents a data structure called TNFG, which is a timed version of NFG.
(1) Build the TNFG of P,  = (CL(P), EL(P), X), by algorithm TNFG(P);
(2) Obtain the timed automaton,  = (Σ, ,  0 , , , , ), by algorithm CONSTRUCT(  ).
Figure 6(a). The simulation results indicate that the model checking technique by itself cannot make an IDS stronger, but it can when it employs a stronger temporal logic, such as RASL, to describe attacks. We simulate and detect some p-trace attacks using MATLAB. We randomly produce 30 kinds of p-trace attacks and repeat the simulation 100 times for each of them. On average, fewer than 10 kinds of attacks are reported by the LTL-based simulator, whereas almost 100 percent of the kinds of attacks are found by the RASL-based simulator, as shown in Figure 6(b). The results indicate that the RASL-based algorithm enhances the detection power for p-trace attacks, compared with the LTL-based algorithm. Clearly, this is due to the stronger expressive power of RASL.
Suppose that the standard time unit is a second; Figure 7 illustrates a comparison between the ITL-model-checking-based approach in [2] and our RASL-model-checking-based algorithm. We randomly produce some attacks, including real-time attacks and non-real-time attacks. Compared with the ITL-based simulator, the RASL-based simulator raises the average number of detected attacks by as much as 400% when the average time distance (or time constraint) between two atomic actions in the same real-time attack is only five seconds. The average number is still raised by 15% even in the worst case, that is, when the time distance is more than three thousand seconds. These results indicate that the RASL-based algorithm further raises the power of detection for p-trace attacks, compared with the ITL-based algorithm, again due to the stronger expressive power of RASL.
In order to compare the detection ability of the ITL-model-checking-based approach [2] and the RASL-model-checking-based one for more types of attacks, we conducted a benchmark test on KDD CUP 99 [20]. We used a behavior version of a sample subset of this standard benchmark set [20] to evaluate our research in intrusion detection. Attacks fall into four main categories [20], that is, DOS, R2L, U2R, and Probe, comprising twenty-two types of attacks in total, as shown in Figures 8, 9, 10, and 11. In each of these four figures, the y-axis shows the ratio between the number of attacks found by the ITL-based simulator and the number of attacks found by the RASL-based simulator, whereas the x-axis shows the different types of attacks.
As shown in the figures, all of the ratios range between 0 and 1. For some types of attacks, such as perl and ftp_write, the ITL-based simulator finds as many attacks as the new simulator does. For other types of attacks, such as back, Neptune, and smurf, the ITL-based simulator finds almost nothing, whereas the RASL-based one finds more. This is again due to the strong expressive power of RASL.

Conclusions
This paper defined a new real-time interval temporal logic, RASL. Based on it, we presented a RASL model checking algorithm and its intrusion detection algorithm. This enables us to employ MC-based approaches for detecting real-time attacks. P-trace attacks in particular are hard for existing IDSs to detect [4], with the exception of the LTL-based algorithm [1,3], the ITL-based algorithm [2], and the new RASL algorithm. The new algorithm detected some real-time p-trace attacks in our simulation experiments. To the best of our knowledge, this is the only method that reports this type of attack. This is the benefit of using the new approach.

Figure 1: (a) Some temporal logics and their classification. (b) Some logics and their expressive power.

Figure 3: Three cases for different lengths of projection.

Figure 5: Algorithm for intrusion detection based on RASL model checking (the main idea of this paper).

Figure 6: Comparisons of the average number of attacks found by different MC-based approaches (x: the number of kinds of simulation attacks and y: the number of kinds of attacks found by the different simulators). (a) For telnet attacks. (b) For p-trace attacks.

Figure 7: A comparison of detection ability for p-trace attacks using different MC-based approaches (x: the average time distance between atomic attack actions and y: the average number of attacks found by the ITL simulator/average number of attacks found by the RASL simulator).

Table 3: A comparison of different MC-based approaches for detecting telnet attacks.

Table 4: Another comparison of different MC-based approaches for detecting telnet attacks.

Table 5: Comparison of different MC-based approaches for detecting password attacks.
Attack actions \ detection results (whether attacks are found) | ITL [2] | RASL | LTL [1]
Consecutive/inconsecutive attack: success after connection failed  − 1 times ( is a given large constant) | Yes | Yes | No
Every time distance between  attacks is less than  seconds ( is a given small constant) | No | Yes | No