Application-Layer DDoS Attack Detection Using Explicit Duration Recurrent Network-Based Application-Layer Protocol Communication Models

Existing application-layer distributed denial of service (AL-DDoS) attack detection methods mainly target specific attacks and cannot effectively detect other types of AL-DDoS attacks. This study presents an application-layer protocol communication model for AL-DDoS attack detection, based on the explicit duration recurrent network (EDRN). The proposed method includes model training and AL-DDoS attack detection. In the AL-DDoS attack detection phase, the output of each observation sequence is updated in real time. The observation sequences are based on application-layer protocol keywords and time intervals between adjacent protocol keywords. Protocol keywords are extracted based on their identification using regular expressions. Experiments are conducted using datasets collected from a real campus network and the CICDDoS2019 dataset. The results of the experiments show that EDRN is superior to several popular recurrent neural networks in accuracy, F1, recall, and loss values. The proposed model achieves an accuracy of 0.996, F1 of 0.992, recall of 0.993, and loss of 0.041 in detecting HTTP DDoS attacks on the CICDDoS2019 dataset. The results further show that our model can effectively detect multiple types of AL-DDoS attacks. In a comparison test, the proposed method outperforms several state-of-the-art approaches.


Introduction
As network infrastructure and network-layer defense technologies advance, attackers increasingly target Internet-based applications, resulting in the continuous emergence of application-layer attacks [1,2]. These attacks are carried out using legitimate user requests and protocols at the application layer. Therefore, the data flow of application-layer attacks at the network and transport layers is not significantly different from that generated by normal users.
Distributed denial of service (DDoS) attacks are among the most dangerous attacks [3][4][5][6], especially application-layer DDoS (AL-DDoS) attacks, such as HTTP DDoS attacks [7,8] and SMTP flood attacks [9]. HTTP DDoS attacks are usually implemented by a large number of bots sending a flood of page requests to a web server at the same time, thus consuming server resources, such as database cycles, CPU cycles, or memory. In August 2022, Google reported what was then the largest HTTP DDoS attack on record, which targeted a Google Cloud Armor customer and peaked at 46 million requests per second [10]. The complexity of AL-DDoS attacks is also expected to grow over time.
Existing AL-DDoS attack detection methods mainly target specific attacks and cannot effectively identify other types of AL-DDoS attacks. Therefore, to comprehensively detect AL-DDoS attacks, multiple detection methods need to be deployed in a network. However, the principles and parameter settings of each detection method are fundamentally different, which complicates network management. Moreover, deploying multiple AL-DDoS attack detection methods simultaneously also degrades network performance. Hence, it is necessary to design a detection method that can effectively detect various AL-DDoS attacks.
In this study, we reexamine the issue from the perspective of application-layer protocol communication. The key idea is to model the communication of the application-layer protocol through an explicit duration recurrent network (EDRN) to detect AL-DDoS attacks, taking observed application-layer protocol keywords and time intervals between adjacent protocol keywords as inputs. Application-layer protocol keywords refer to custom request commands and server response status codes, which can reflect the behavior of users when using the protocol.
Recurrent neural networks (RNNs) exploit cycles in network nodes to capture the dynamics of sequences, and they have been widely used in sequential data mining with outstanding performance results [11]. However, traditional RNNs have hidden states whose durations approximately follow a geometric or exponential distribution [12,13]. As a result, it is difficult to use traditional RNNs to model the variable durations of hidden states.
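As a brief illustration of this limitation (a standard property of Markov chains, not a derivation taken from this study), consider a hidden state with a fixed self-transition probability $a$. The symbols $a$, $d$, and $D$ are introduced here only for the example. The probability that the process stays in that state for exactly $d$ consecutive steps, and the resulting mean duration, are

```latex
P(D = d) = a^{\,d-1}(1 - a), \qquad d = 1, 2, \ldots, \qquad \mathbb{E}[D] = \frac{1}{1 - a}.
```

This geometric distribution always has its mode at $d = 1$ and decays monotonically, so it cannot represent durations that cluster around a larger value. Explicit-duration models such as the HSMM instead attach a separate duration distribution to each state.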
During the communication process of the application-layer protocol, the behavior of users and time intervals between adjacent protocol keywords are determined by many factors, such as the request method, network transmission delay, and response processing time of servers. Thus, the duration of hidden states under a sequence of application-layer protocol communication may follow a relatively complex distribution, and not necessarily a geometric or exponential distribution. The EDRN is based on an extended hidden semi-Markov model (HSMM) and can describe hidden states of any duration distribution [14]. This study adopts EDRN to model application-layer protocol communication for AL-DDoS attack detection. To evaluate the model, experiments are conducted on the CICDDoS2019 dataset [15] and datasets collected in a real campus network. The experimental results show that the EDRN is superior to several popular RNNs, and the EDRN-based model can effectively detect multiple types of AL-DDoS attacks.
The main contributions of this study can be summarized as follows:
(i) We proposed an EDRN-based application-layer protocol communication model. The model uses EDRN to describe the communication process of the application-layer protocol and takes application-layer protocol keywords and time intervals between adjacent protocol keywords as inputs for the first time.
(ii) Based on the application-layer protocol communication model, we proposed an attack detection method that detects AL-DDoS attacks in real time by monitoring the application-layer protocol keywords that are used in the process of protocol communication.
(iii) We compared several RNNs based on the CICDDoS2019 dataset and a real campus network dataset, and the experimental results showed that the EDRN has the best performance. We also compared our proposed AL-DDoS attack detection method with several existing methods, and the test results confirmed the effectiveness and superiority of our method.
The remainder of this paper is organized as follows: Section 2 reviews recent studies on AL-DDoS detection. In Section 3, we describe the model for application-layer protocol communication. Section 4 presents the proposed AL-DDoS attack detection method. The experimental results are presented in Section 5 and discussed in Section 6. Section 7 concludes the paper.

Related Works
The detection of AL-DDoS attacks has attracted the attention of researchers [16][17][18][19][20][21]. Existing methods mainly target specific AL-DDoS attacks. For example, Xie and Yu [22] used HSMM, independent component analysis, and principal component analysis to mine web server logs to detect HTTP DDoS attacks. Wang et al. [23] used the Hellinger distance and sketch data structure to detect HTTP DDoS attacks. Zhou et al. [24] calculated the entropy of flash crowds and attacks for HTTP DDoS attack detection. Singh et al. [25] used four behavioral features and a support vector machine (SVM) to detect HTTP DDoS attacks. Praseed and Thilagam [26] used probabilistic timed automata (PTA) models to describe the behavior of legitimate users for HTTP DDoS attack detection. Lin et al. [27] used the rhythm matrix statistical model to capture the characteristics of user access trajectories to detect HTTP DDoS attacks. Zhao et al. [28] used URL access entropy to identify HTTP DDoS attacks. Praseed and Thilagam [29] used signatures based on HTTP request patterns to detect HTTP DDoS attacks. Raja Sree and Mary Saira Bhanu [30] used fuzzy bat clustering to analyze web server logs for HTTP DDoS attack detection in the cloud.
In terms of SMTP flood attack detection, Tudosi et al. [31] analyzed the traffic of SMTP flood attacks and used Snort (an open-source intrusion prevention system) to detect SMTP flood attacks. Schneider et al. [32] used the statistical characteristics of attack flows to detect SMTP flood attacks. Aziz and Okamura [33] adopted deep learning algorithms to detect SMTP flood attacks on software-defined networking (SDN) platforms. Gurusamy and Msk [34] detected SMTP flood attacks by monitoring the traffic statistics of all ports in the SDN.
In addition, Kasim [35] used a convolutional neural network (CNN) and long short-term memory (LSTM) to detect DNS flood attacks. Trejo et al. [36] used a visual platform and the K-nearest neighbor (KNN) classification algorithm to detect DNS flood attacks. Datta et al. [37] detected DNS flood attacks by monitoring the number of DNS queries per second in IoT networks. Bushart and Rossow [38] used an anomaly-based low-pass filter to detect DNS flood attacks.
Existing methods mainly target specific AL-DDoS attacks and do not consider the characteristics of application-layer protocol communication. In this study, we adopt EDRN to describe the communication process of the application-layer protocol, which can capture the suddenness, randomness, and volume of protocol communication, and then present an EDRN-based application-layer protocol communication model for AL-DDoS attack detection. This model can effectively detect multiple types of AL-DDoS attacks.

Application-Layer Protocol Communication Models
From the perspective of application-layer protocols, user behavior over a period of time is reflected in the interaction between a series of application-layer protocol keywords. Application-layer protocol keywords refer to custom request commands and server response status codes, which can reflect the behavior of users when using the application-layer protocol. For example, HTTP protocol keywords include the request commands "POST," "GET," and "HEAD," and the server response status codes "100," "200," "304," and "404," while SMTP protocol keywords comprise "MAIL FROM," "HELO," "RCPT TO," "VRFY," "QUIT," "RSET," "DATA," "EXPN," "HELP," and "NOOP" and server response codes, such as "250" and "334."

Application-Layer Protocol Communication Process.
When regular users use an application-layer protocol, the statistical characteristics of the protocol keywords and the time intervals between adjacent protocol keywords are quite different from those of AL-DDoS attacks. For example, when regular users use the HTTP protocol, the speed at which they click pages, the time they take to read them, and the way they browse pages are fairly stable. However, in the application-layer protocol keyword sequences generated by HTTP DDoS attacks, the protocol keyword "GET" appears very frequently, while other protocol keywords appear less frequently, and the time intervals between adjacent protocol keywords are small. Therefore, the application-layer protocol keywords and the time intervals between adjacent protocol keywords can be used as observations to describe the communication process of the application-layer protocol and enable the detection of AL-DDoS attacks. Figure 1 shows the communication process between users and a web server represented by a sequence of HTTP protocol keywords, wherein the HTTP protocol keyword sequence representing users' behavior is as follows: "GET," "POST," "200," "HEAD," "304," . . ., "200," and "GET."

Application-Layer Protocol Keyword Extraction.
We first identify the application-layer protocol based on regular expressions, and then extract the protocol keywords. In this way, the number of protocol keywords to be matched each subsequent time can be reduced, thereby improving the speed of the protocol keyword extraction process. When identifying a TCP-based application-layer protocol, the first few data packets of each TCP connection are cached, then the application-layer data of the data packets are reassembled, and finally the protocol regular expression [39] is matched against the reassembled application-layer data. When identifying a UDP-based application-layer protocol, we use regular expressions to match the payload of each data packet. The identification process of TCP-based application-layer protocols is shown in Figure 2. This method can identify application-layer protocols in real time.
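The two-step procedure above (identify the protocol first, then match only that protocol's keywords) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the patterns in `PROTOCOL_PATTERNS` and `KEYWORD_PATTERNS` are simplified, hypothetical stand-ins for the protocol regular expressions of [39], the function names are invented for the example, and the input is assumed to be application-layer data that has already been reassembled.

```python
import re
from typing import List, Optional

# Hypothetical, simplified identification patterns (illustration only).
PROTOCOL_PATTERNS = {
    "HTTP": re.compile(rb"^(?:GET|POST|HEAD) \S+ HTTP/1\.[01]\r\n|^HTTP/1\.[01] \d{3} "),
    "SMTP": re.compile(rb"^(?:HELO|EHLO) \S+\r\n|^220[ -]"),
}

# Per-protocol keyword patterns: group 1 is a request command,
# group 2 is a three-digit server response code.
KEYWORD_PATTERNS = {
    "HTTP": re.compile(
        rb"^(?:(GET|POST|HEAD) \S+ HTTP/1\.[01]|HTTP/1\.[01] (\d{3}) )",
        re.MULTILINE,
    ),
    "SMTP": re.compile(
        rb"^(?:(MAIL FROM|RCPT TO|HELO|DATA|QUIT|NOOP|VRFY)\b|(\d{3})[ -])",
        re.MULTILINE,
    ),
}


def identify_protocol(payload: bytes) -> Optional[str]:
    """Return the first protocol whose pattern matches the start of the payload."""
    for name, pattern in PROTOCOL_PATTERNS.items():
        if pattern.search(payload):
            return name
    return None


def extract_keywords(protocol: str, payload: bytes) -> List[str]:
    """Extract keywords using only the patterns of the identified protocol."""
    keywords = []
    for match in KEYWORD_PATTERNS[protocol].finditer(payload):
        token = match.group(1) or match.group(2)
        keywords.append(token.decode("ascii"))
    return keywords
```

Because identification happens once per connection, later payloads on the same connection only need to be matched against one keyword pattern, which is the speed-up the text describes.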

Protocol Communication Modeling.
At the gateway of a network, we can obtain application-layer protocol keywords and their arrival times using the protocol keyword extraction method described in Section 3.2. Assume that the application-layer protocol has $W$ keywords, which can be digitized as $1, 2, \ldots, W$. When users are using the application-layer protocol, the communication process can be described as a series of observations $I_1^T = (I_1, I_2, \ldots, I_T)$, where $I_t$ is the observation at the $t$th time that a protocol keyword arrives at the gateway. The value of $I_t$ is based on the protocol keyword and the time interval between adjacent protocol keywords arriving at the gateway; that is, $I_t = (i_t^{(1)}, i_t^{(2)})$, where $i_t^{(1)}$ is the digitized label of the $t$th protocol keyword arriving at the gateway, and $i_t^{(2)}$ is expressed by equation (1):
$$i_t^{(2)} = R_t - R_{t-1}. \quad (1)$$
In equation (1), $R_t$ denotes the time at which the $t$th protocol keyword arrives at the gateway, and $R_{t-1}$ denotes the time at which the $(t-1)$th protocol keyword arrives at the gateway. In this study, the unit of time measurement is seconds; $I_1$ is the observation generated by the first protocol keyword arriving at the gateway, where $i_1^{(1)}$ is the digitized label of the first protocol keyword arriving at the gateway and $i_1^{(2)}$ is the corresponding time-interval component.
When using the application-layer protocol, users' behaviors may change. For example, users may use the HTTP protocol for varied purposes, including browsing web pages, watching movies online, and shopping online. Therefore, the protocol keywords and time intervals between adjacent protocol keywords arriving at the gateway will change over time. As a result, the durations of hidden states in the observation sequences of an application-layer protocol communication process may follow a relatively complex distribution.
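The construction of observations from keyword arrivals can be sketched as below. This is an illustrative sketch only: the function name `build_observations` and the `labels` mapping are hypothetical, the time interval is taken as the plain difference of arrival times in seconds, and the interval of the first observation is set to 0 as an assumption, since no earlier keyword exists to measure against.

```python
from typing import Dict, List, Tuple

Observation = Tuple[int, float]  # (digitized keyword label, interval in seconds)


def build_observations(
    events: List[Tuple[float, str]],
    labels: Dict[str, int],
) -> List[Observation]:
    """Turn time-ordered (arrival_time, keyword) events into observations.

    `labels` maps each protocol keyword to an integer in 1..W.
    """
    observations: List[Observation] = []
    previous_time = None
    for arrival_time, keyword in events:
        # Assumption: the first observation has no predecessor, so use 0.
        interval = 0.0 if previous_time is None else arrival_time - previous_time
        observations.append((labels[keyword], interval))
        previous_time = arrival_time
    return observations


# Example usage with a hypothetical three-keyword label table.
http_labels = {"GET": 1, "POST": 2, "200": 3}
sample = build_observations(
    [(10.0, "GET"), (10.5, "200"), (12.0, "POST")], http_labels
)
```

Keeping the label and the interval together in one tuple mirrors the two-component observation described in the text, so a sequence can be fed to a model without a separate timing channel.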
We used the EDRN to model the communication process of the application-layer protocol. The EDRN-based application-layer protocol communication model is shown in Figure 1, where $x_t$ denotes the predicted probabilities of the next possible states at the $(t+1)$th time and $y_t$ denotes the probabilities of all possible states at the $t$th time. The unfolded unit structure of EDRN is presented in Figure 3, where tanh denotes the hyperbolic tangent function and $\sigma$ denotes the sigmoid function. $Z_{\text{input}}$, $Z_{\text{forget}}$, $Z_{\text{output}}$, and $Z_{\tanh}$ denote the input, forget, output, and tanh gates, respectively.
Assume that the communication process of the application-layer protocol has $K$ macrostates and that each macrostate has $L$ substates. $Z_{\text{forget}}$, $Z_{\text{input}}$, $Z_{\tanh}$, and $Z_{\text{output}}$ are calculated using equations (2)-(5); the output gate, for example, is given by
$$Z_{\text{output}} = \sigma\left(y_t(:L)\,A^{(4)} + I_t B^{(4)} + b^{(4)}\right).$$
Similar to LSTM, each unit finally returns $x_t$ and $y_t$ to the next unit. The $x_t$ and $y_t$ are calculated using equations (6) and (7), respectively, where "$*$" represents the element-wise product.

AL-DDoS Attack Detection
The AL-DDoS detection method proposed in this study involves two phases. In the first phase, we train the EDRN-based protocol communication model. In the second phase, every application-layer protocol communication process is monitored in real time. Once a protocol keyword arrives at the network gateway, the corresponding observation sequence $I_1^t$ is updated, where $t$ denotes the number of observations. Then, we calculate the output $\eta$ using equation (8). In equation (8), $y_t(v, \tau)$ denotes the probability of $p_t$ under $I_1^t$, and $p_t = (v, \tau)$ denotes that $\tau$ ($1 \le \tau \le L$) protocol keywords will appear in state $v$ ($1 \le v \le K$). The $y_t(v, \tau)$ is defined in equations (9) and (10). In equation (10), $a_{(k,l),(v,\tau)}$ denotes the interstate transition probability from $p_t = (k, l)$ to $p_{t+1} = (v, \tau)$ and is defined by equation (11).
In equation (10), $\xi_{v,\tau}$ is the probability of $p_t = (v, \tau)$ and is defined by equation (12). Also in equation (10), $\chi^{(1)}_{v,\tau}(I_t)$ and $\chi^{(2)}_{v,\tau}(I_t)$ are defined by equations (13) and (14), where $\Pr[I_t \mid I_1^{t-1}]$ is the scaling factor.
If $\eta$ is larger than a predefined threshold, the network is considered normal. Otherwise, we consider that there is an AL-DDoS attack related to this protocol in the network. The detection architecture of our method is shown in Figure 4. Our method can detect AL-DDoS attacks in real time.
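The online decision rule can be sketched as a small streaming component. The sketch below is hypothetical and deliberately leaves out the EDRN itself: `score_fn` stands in for the trained model's output $\eta$, `flow_id` is an assumed key for one protocol communication process, and `toy_score` is a placeholder scoring function used only to make the example runnable. The interval of the first observation is again assumed to be 0.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple

Observation = Tuple[int, float]  # (keyword label, interval in seconds)


class KeywordStreamDetector:
    """Update an observation sequence per flow and apply a threshold to eta."""

    def __init__(self, score_fn: Callable[[List[Observation]], float], threshold: float):
        self.score_fn = score_fn      # stand-in for the trained model's output
        self.threshold = threshold    # predefined detection threshold
        self.sequences: Dict[Hashable, List[Observation]] = defaultdict(list)
        self.last_arrival: Dict[Hashable, float] = {}

    def on_keyword(self, flow_id: Hashable, label: int, arrival_time: float) -> bool:
        """Record one keyword arrival; return True if an attack is suspected."""
        previous = self.last_arrival.get(flow_id)
        interval = 0.0 if previous is None else arrival_time - previous
        self.last_arrival[flow_id] = arrival_time
        self.sequences[flow_id].append((label, interval))
        eta = self.score_fn(self.sequences[flow_id])
        # eta above the threshold means normal; otherwise flag an attack.
        return eta <= self.threshold


def toy_score(sequence: List[Observation]) -> float:
    """Placeholder score (NOT the EDRN): mean interval after the first keyword."""
    if len(sequence) < 2:
        return float("inf")  # too little evidence, treat as normal
    gaps = [gap for _, gap in sequence[1:]]
    return sum(gaps) / len(gaps)
```

The detector re-evaluates the score on every keyword arrival, which matches the real-time update described above; in practice the placeholder would be replaced by the model's output for the updated sequence.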

Evaluation
In this section, we test our proposed AL-DDoS attack detection method using multiple datasets to evaluate the detection performance against HTTP DDoS and SMTP flood attacks.

HTTP Datasets.
At the gateway of the campus network, we collected the data generated by a large number of normal users when using the HTTP protocol. In addition, we adopted the method described in [16] and DDoS generators to generate three different types of HTTP DDoS attacks, namely, single-page, random-page, and top-five-page HTTP DDoS attacks. A single-page HTTP DDoS attack targets a specific page of a website, usually one that is frequently visited by users, while a random-page HTTP DDoS attack targets a random page from all potentially visited pages of a website. A top-five-page HTTP DDoS attack targets the five most visited pages of a resource site. Subsequently, we extracted observation sequences from the collected data for training and testing. The time length of each observation sequence was 60 seconds. The HTTP datasets are summarized in Table 1.
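Cutting a keyword stream into 60-second observation sequences can be sketched as follows. The text states only the sequence length, so the details here are assumptions made for illustration: windows are fixed, non-overlapping, aligned to the first event, and empty windows are skipped. The function name `split_into_windows` is hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str]  # (arrival time in seconds, protocol keyword)


def split_into_windows(events: List[Event], window_seconds: float = 60.0) -> List[List[Event]]:
    """Group time-ordered events into consecutive fixed-length windows."""
    if not events:
        return []
    start = events[0][0]
    windows: Dict[int, List[Event]] = {}
    for arrival_time, keyword in events:
        # Index of the window this event falls into, counted from the first event.
        index = int((arrival_time - start) // window_seconds)
        windows.setdefault(index, []).append((arrival_time, keyword))
    # Return windows in time order; windows without events are not emitted.
    return [windows[index] for index in sorted(windows)]
```

Each returned window can then be converted into one observation sequence for training or testing.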

SMTP Dataset.
Similar to HTTP data collection, we collected data generated by a large number of normal users when using the SMTP protocol. We adopted the method described in [18] to generate SMTP flood attacks. After that, we extracted observation sequences. The time length of each observation sequence was also 60 seconds. The SMTP dataset is summarized in Table 2.

CICDDoS2019 Dataset.
The CICDDoS2019 dataset is a public dataset developed by the Canadian Institute for Cybersecurity (CIC) in 2019 [15]. This dataset is widely used in the field of DDoS detection. The dataset contains 11 kinds of DDoS attacks, among which the AL-DDoS attack is the HTTP DDoS attack. The packets in the CICDDoS2019 dataset contain the application-layer payload. We use this dataset to test the performance of our method against HTTP DDoS attacks.

Estimation Criteria.
In the recurrent neural network training phase of our proposed AL-DDoS detection method, we use accuracy and loss as evaluation metrics, while in the AL-DDoS attack detection phase, we use accuracy, F1, recall, and loss as evaluation metrics. In the comparison experiment with other methods, we use accuracy, F1, and recall as evaluation metrics. The loss is calculated using equation (15), in which $\vartheta_t$ is the label value of the sample and $\lambda_t$ is the predicted value of the recurrent neural network.
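For reference, the metrics named above can be computed as in the following sketch. Accuracy, recall, and F1 follow their standard definitions, with the attack class assumed to be the positive class (label 1). The loss shown is binary cross-entropy, which is an assumption made for illustration; the exact form of equation (15) is not reproduced here, and the function names are hypothetical.

```python
import math
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, recall, and F1 with label 1 (attack) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


def binary_cross_entropy(y_true: List[int], y_prob: List[float], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy (assumed loss form, for illustration only)."""
    total = 0.0
    for label, prob in zip(y_true, y_prob):
        prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    return total / len(y_true)
```

Recall is the metric most sensitive to missed attacks, which is why it is reported alongside accuracy and F1 in the detection phase.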

AL-DDoS Attack Detection Results.
In this section, experiments are carried out on a computer with 64-bit Ubuntu OS (version: 20.04.1), TensorFlow (version: 1.14.0), Python (version: 3.6.2), and Keras (version: 2.2.5). To show that the EDRN can better model application-layer protocol communication, we compared it with other RNNs, including LSTM [12], GRU [40], PLSTM [13], IndRNN [41], and DSTP-RNN [42]. In the recurrent neural network training phase of our AL-DDoS detection method, the maximum number of epochs was set to 100.
The lower the training loss and the higher the training accuracy, the better the performance of the recurrent neural network. By these criteria, the EDRN performed best on the $D_1$, $D_2$, and $D_3$ datasets in the training phase. On the HTTP datasets, the average training accuracy and loss of the EDRN were 0.9978 and 0.0073, respectively.

Detection Results on HTTP
After training, we used the corresponding testing sets to evaluate the EDRN and other RNNs. The test results are listed in Table 3. As shown, the EDRN had the highest accuracy, F1, and recall, and the lowest loss on the $D_1$, $D_2$, and $D_3$ datasets. Hence, the EDRN had the best performance in the HTTP DDoS attack detection phase. On the HTTP datasets, the average test accuracy, F1, recall, and loss of the EDRN were 0.995, 0.991, 0.992, and 0.042, respectively.
The test results on the SMTP dataset are listed in Table 4. Compared with other RNNs, the EDRN had better performance in detecting SMTP flood attacks.

Detection Results on CICDDoS2019 Dataset.
We compared the EDRN with other RNNs on the CICDDoS2019 dataset. The test results, listed in Table 5, show that the EDRN achieved the best performance in detecting HTTP DDoS attacks on the CICDDoS2019 dataset.

Comparison with Existing Approaches.
In this section, we compare our proposed AL-DDoS attack detection method with several existing state-of-the-art approaches. Accuracy, F1, and recall are adopted as evaluation metrics. The comparison results are listed in Tables 6-8. Existing approaches use traditional statistical analysis or machine learning algorithms to detect HTTP DDoS attacks, while this study uses EDRN to construct an application-layer protocol communication model to detect HTTP DDoS attacks. The EDRN is a novel recurrent neural network that has better performance than traditional statistical analysis and machine learning algorithms in sequence data mining. Therefore, our method outperforms existing approaches in detecting HTTP DDoS attacks.
Existing approaches do not consider the characteristics of application-layer protocol communication when detecting SMTP flood or HTTP DDoS attacks. However, the communication process of the application-layer protocol can better reflect the users' behavior. This study uses EDRN to describe the communication process of the application-layer protocol, which can capture the suddenness, randomness, and volume of protocol communication. Therefore, our method has better performance than existing approaches in detecting SMTP flood attacks.

Discussion
The accuracy of the application-layer protocol identification method has a great influence on the performance of the proposed AL-DDoS attack detection method. We conducted an online test on the application-layer protocol identification method at the gateway of a real campus network, as shown in Figure 9. The duration of the test experiment was five hours, and accuracy and recall were selected as evaluation indicators. Table 9 presents the identification results of some common application-layer protocols. The test results show that for common application-layer protocols, the accuracy and recall of the method were both above 0.998. Therefore, the application-layer protocol identification method can meet the needs of AL-DDoS attack detection.
To improve the accuracy of the EDRN-based protocol communication model, we update the model parameters online. Specifically, we collect training observation sequences of normal traffic and AL-DDoS attacks online, and then retrain the model parameters at regular intervals, as shown in Figure 10.
However, it is difficult to define the protocol keywords of emerging application-layer protocols. Therefore, our model cannot effectively detect AL-DDoS attacks based on emerging application-layer protocols. In future work, we aim to automatically analyze emerging application-layer protocols and define their protocol keywords.

Data Availability
The CICDDoS2019 dataset used to support the findings of this study is a public dataset developed by the Canadian Institute for Cybersecurity (CIC) in 2019. The CICDDoS2019 data can be downloaded from https://www.unb.ca/cic/datasets/ddos-2019.html.

Conflicts of Interest
The authors declare that they have no conflicts of interest.