With the development of fifth-generation networks and artificial intelligence technologies, new threats and challenges have emerged for wireless communication systems, especially in cybersecurity. In this paper, we review attack detection methods that exploit the strengths of deep learning techniques. Specifically, we first summarize the fundamental problems of network security and attack detection and introduce several successful related applications built on deep learning structures. Based on a categorization of deep learning methods, we pay special attention to attack detection methods built on different kinds of architectures, such as autoencoders, generative adversarial networks, recurrent neural networks, and convolutional neural networks. Afterwards, we describe some benchmark datasets and compare the performance of representative approaches to show the current state of attack detection methods with deep learning structures. Finally, we summarize this paper and discuss some ways to improve the performance of attack detection using deep learning structures.
The continuous development and extensive use of the Internet benefit network users in many ways. Meanwhile, network security becomes much more important as networks are more widely used. Network security is closely related to computers, networks, programs, various data, and so forth, where the purpose of defense is to prevent unauthorized access and modification [
Over the past several years, researchers have used various kinds of machine learning methods to classify network attacks without prior knowledge of their detailed characteristics. However, traditional machine learning methods cannot provide distinctive feature descriptors for the problem of attack detection, owing to their limited model complexity. Recently, machine learning has made a great breakthrough by simulating the human brain with neural network structures; these are named deep learning methods for their general architecture of deep layers, which solves complicated problems. Among the successful applications, Google’s AlphaGo is one of the most outstanding examples for the game of Go, drawing on the strength of a typical deep learning structure, that is, convolutional neural networks.
Since deep learning is complex in its original structures and domain-oriented applications, this paper is written to explain them for those who aim to study network security using deep learning methods. There exists a considerable body of previous work focusing on attack detection using deep learning techniques. Among them, several literature reviews [
In fact, all the above review papers have their own emphases, such as security applications, attack types, datasets, or databases. Unlike these former reviews, we build our paper around deep learning models, paying special attention to attack detection methods built on different kinds of deep learning architectures. Furthermore, we offer a fair comparison and our own analysis of the performance of representative approaches on benchmark datasets. We believe our paper offers a more understandable resource for readers who are interested in how different deep learning architectures affect the area of attack detection.
In this paper, we attempt to lay the groundwork for future research through a thorough literature review of deep learning approaches in the field of attack detection. More specifically, we first summarize the fundamental problems, classify the previous methods, and review the methods useful for beginners. Then, we briefly introduce the great progress of deep learning techniques in cybersecurity. By replacing traditional machine learning methods with deep learning structures, researchers have proposed a number of novel algorithms that greatly improve performance in terms of higher detection rate and lower false alarm rate. Afterwards, we compare and analyze the performance of some representative deep learning approaches on benchmark datasets. Finally, we summarize the problems to be solved and future directions for deep learning methods to improve attack detection.
We organize the rest of our work as follows. Section
In order to provide an overview of effective attack detection based on deep learning techniques, it is essential to introduce background knowledge. We thus first give a brief introduction to the concepts of attack detection, which offers a basic understanding for new learners. Afterwards, we briefly present successful applications in cybersecurity.
Attacks can be regarded as attempts to bypass the security policies of a system, which give attackers easier access to obtain or modify information, or even to destroy the system. As wireless communication technologies develop, more frequent network attack activities pose serious threats to network security, especially the security of wireless communication systems, owing to the open nature of wireless channels. Since we are now in the epoch of machine learning and big data [
To deal with such attack threats to cybersecurity, researchers have proposed many solutions [
In recent years, machine learning has been developing with incredible speed. Among different machine learning methods, deep learning structures construct artificial neural networks to simulate the interconnected neurons of the human brain, which brings distinctive power to solve complicated problems. Researchers have thus adopted various deep learning methods to perform attack detection, resulting in significant achievements. However, there are still many unsolved problems owing to the limitations of deep learning methods. It is essential to summarize how former methods use deep learning to detect attacks, which could bring new ideas for future developments.
Since deep learning shows great potential in constructing security applications, it has been widely used in cybersecurity [
An intrusion detection system detects malicious activities by collecting and analyzing network behavior, security logs, and other information available on the network and among connected computers [
Traditional intrusion detection systems were first built on misuse-based intrusion detection technology, which mainly extracts characteristics or rules of intrusion behavior. After the appearance of abnormal behavior detection technology with traditional machine learning models, intrusion detection systems evolved to carry out probabilistic statistical modeling of normal behaviors, which allows them to analyze and raise alarms on behaviors with large deviation. However, such systems may produce unsatisfactory results, owing to their low capability in defining the problem space and the complexity of modeling malicious activities.
To further overcome the shortcomings of traditional machine learning methods, deep learning technology is applied to analyze network packets, which progressively changes the mainstream idea of intrusion detection from a blacklist to a whitelist model. A new NIDS deep learning model is proposed by Shone et al. [
Malware is designed to degrade the performance of, or exploit the vulnerabilities of, a computer, server, or computer network. In extreme situations, malware results in the destruction of the entire system. Malware must first be implanted into the target computer. Afterwards, it can execute code, scripts, active content, and other software automatically or following orders from whoever planted it. Such software or code can be categorized as computer viruses, worms, Trojans, spyware, advertising software, and other malicious code.
We divide malware detection methods into two categories, that is, signature-based and anomaly-based detection. Traditional antivirus software belongs to the first category, which detects malicious files based on file signatures. However, slightly modified malicious code can bypass such detection, leading to a large number of missed detections. Later, sandbox and virtual machine technologies appeared to detect the dynamic behaviors of viruses, which can be regarded as major progress from static detection to dynamic analysis, greatly improving the ability to detect unknown malicious code.
For example, in [
DGAs are commonly used by malware to create a large number of domain names for communication with a C2 server. The variety of domain names makes it difficult to use standard technologies such as blacklisting or sink-holing to block malicious domain names. DGAs are often used in various network attacks, such as spam, personal data theft, and DDoS attacks.
By applying deep learning technologies, DGA detection is capable of identifying generated domain names from the perspective of syntax analysis. Specifically, such novel algorithms could not only compare word frequency with normal domain names by
Considering the current deep learning methods for attack detection [
Categorization of the current deep learning methods for attack detection.
Adopting different kinds of deep learning algorithms brings different advantages for attack detection methods. Supervised learning based methods often achieve high accuracy, owing to the amount of information provided by manually labeled samples. Without sufficient knowledge from labeled data, unsupervised learning based methods generally perform worse. However, manual labeling is a time-consuming task, especially for complex attacks. There even exist cases that cannot be described by a simple label, owing to the inherent complexity of real-world network attacks. Therefore, the ability of unsupervised learning based methods to work without prior knowledge of attacks is an obvious advantage. Hybrid methods reduce the number of training samples needed while maintaining relatively high performance, which makes them suitable for varied attack situations. However, they are generally complex in structure and costly in computing time, which prevents their wide usage.
Let us first introduce the architecture of the autoencoder (AE), which can be regarded as a data compression algorithm with a neural network structure. It first compresses the input into a feature space representation and then reconstructs the output from that representation. Since AE is a typical representation learning algorithm, it is widely used for dimension reduction and outlier detection. Researchers in cybersecurity also adopt AE to represent abnormal behaviors in its compressed feature space, which brings the advantage of dynamic representation for unknown categories of attacks.
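As a minimal sketch of the compress-then-reconstruct idea and its use for outlier detection, the following Python fragment scores samples by their reconstruction error. The two-dimensional inputs, the hand-set projection `W` (standing in for trained encoder and decoder weights), and the threshold value are illustrative assumptions and are not taken from any of the reviewed methods.

```python
import math

# Hypothetical hand-set weights: a unit vector spanning the direction in which
# "normal" samples are assumed to lie. A real AE would learn these weights.
W = (1 / math.sqrt(2), 1 / math.sqrt(2))


def encode(x):
    """Compress a 2-D input into a 1-D code (projection onto W)."""
    return x[0] * W[0] + x[1] * W[1]


def decode(z):
    """Reconstruct a 2-D output from the 1-D code."""
    return (z * W[0], z * W[1])


def reconstruction_error(x):
    """Squared distance between the input and its reconstruction."""
    r = decode(encode(x))
    return (x[0] - r[0]) ** 2 + (x[1] - r[1]) ** 2


def is_anomalous(x, threshold=0.5):
    """Flag a sample whose reconstruction error exceeds the (assumed) threshold."""
    return reconstruction_error(x) > threshold


normal_sample = (2.0, 2.1)   # close to the learned direction, small error
odd_sample = (2.0, -2.0)     # orthogonal to it, large error
print(is_anomalous(normal_sample), is_anomalous(odd_sample))
```

Samples that the compressed code cannot represent well are reconstructed poorly, which is why reconstruction error can serve as an outlier score even for attack types that were never labeled.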
To extract informative feature descriptors from original network traffic data, Zhang et al. [
Following the idea to facilitate intrusion detection with AE models, Shone et al. [
Network structure of Shone et al. [
Since AE is capable of learning potential representation of unknown attacks, Yousefi-Azar et al. [
Since collected network raw data can be unbalanced in distribution, Farahnakian et al. [
In order to construct a flexible system for detecting intrusion attacks, Javaid et al. [
Following such idea, Papamartzivanos et al. [
Feature extraction is one of the major issues to address in attack detection. Since AE is a structure for information compression and feature generation, using it brings the advantages of automatic and dynamic feature construction, resulting in high accuracy for detecting the predefined attacks that exist in datasets. Facing varied and unknown attacks, which are the main characteristics of cybersecurity, researchers have emphasized self-learning strategies to make AE more powerful.
A deep belief network (DBN) can be divided into two parts, that is, several layers of restricted Boltzmann machines (RBM) serving as unsupervised learning networks and a backpropagation neural network (BPNN or BP) with one layer. Essentially, RBM is a stochastic generative neural network, which is an undirected graph model composed of layers of visible neurons and hidden neurons. Owing to the natural characteristics of RBM, it is effective to train a DBN layer by layer.
Early, Gao et al. [
Afterwards, Ding et al. [
Workflow of opcode malware detection approach proposed by Ding et al. [
Since behavioral characteristics of ad hoc networks have brought great challenges to network security, Tan et al. [
To explore the capabilities of DBN for detecting intrusion attacks, Alom et al. [
Many attempts have been made to use DBN for intrusion detection. However, there still exist many unsolved problems, such as redundant information and a tendency to become trapped in local optima. To solve these problems, Zhao et al. [
Regarding real-time attacks detection as the biggest challenge of intrusion detection, Alrawashdeh and Purdy [
Because the traditional intrusion detection approaches face difficulties dealing with high-speed network data and cannot detect the unknown attacks at present, Zhang et al. [
Owing to its ability to discover the inherent pattern of data and generate new samples, the generative adversarial network (GAN) is one of the most promising unsupervised learning methods proposed in recent years. The main inspiration of GAN comes from the idea of a zero-sum game. When it is applied to deep neural networks, it keeps playing a game between generator
Even though GAN is a new concept and hard to train, researchers have successfully built several attack detection applications by taking it as the basic structure. For instance, Erpek et al. [
Using machine learning technology to perform phishing detection, that is, detecting URLs of fake web addresses, is popular owing to its high effectiveness and real-time response. However, an adversary may bypass a URL classification algorithm by modifying URL components. To solve this problem, AlEroud and Karabatis [
Overview of steps for the GAN model proposed by AlEroud and Karabatis [
GAN is not yet often used in the attack detection field. In fact, GAN is developing fast in terms of structures, algorithms, and so forth. At present, GANs have shown promising results in many domains, which leads us to believe that this new technique for synthesizing attack attempts is quite significant for creating a defensive mechanism. Such a defensive mechanism can further accomplish a number of tasks, such as preventing zero-day phishing attempts, detecting opinion spam, and detecting intrusion attacks. Therefore, we think there exists a broad research space for connecting the GAN structure with the attack detection field.
A deep neural network (DNN) is recognized as a multilayer perceptron owing to its multiple hidden layers. This multilayer structure makes it possible to express complex functions with fewer parameters, which makes DNN capable of feature extraction and representation learning. Essentially, there are three categories of layers in a DNN. Generally speaking, we regard the first layer as the input layer, the last layer as the output layer, and the middle layers as hidden layers.
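The layer structure described above can be sketched as a forward pass in plain Python. This is only an illustration: the ReLU hidden activation, the softmax output, and the tiny hand-set weights are assumptions chosen for readability and do not correspond to any specific reviewed model.

```python
import math


def relu(values):
    """Hidden-layer activation: clip negative values to zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn output-layer scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer: weights is a list of rows, one per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_layers, output_layer):
    """Input layer -> any number of hidden layers -> output layer."""
    activation = inputs
    for weights, biases in hidden_layers:
        activation = relu(dense(activation, weights, biases))
    weights, biases = output_layer
    return softmax(dense(activation, weights, biases))


# Hypothetical weights for a 2-input, 2-hidden-unit, 2-class network.
hidden = [([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])]
output = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
probabilities = forward([2.0, 0.0], hidden, output)
print(probabilities)
```

Adding more `(weights, biases)` pairs to `hidden` deepens the network without changing the forward-pass code.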
To provide a solution to network security problem, Tang et al. [
To enhance ability of DNN, Li et al. [
Architecture design of HashTran-DNN model proposed by Li et al. [
Challenges arise from the fact that malicious attacks vary constantly and occur in very large volumes, which requires scalable solutions. To meet this challenge, a DNN structure with a scalable and hybrid design is proposed by Vinayakumar et al. [
For network administrator, it is an urgent task to prevent the invasion of malicious network hackers and keep the network system and computer in a safe and normal operation state. Peng et al. [
The convolutional neural network (CNN) involves convolution computation and a deep structure, and it is a representative and commonly used technique in the deep learning domain. Specifically, CNN is a variant of the multilayer perceptron designed to require minimal preprocessing. The basic structure of CNN is composed of input and output layers and multiple hidden layers, which include convolution, pooling, and fully connected layers. Compared with other classification algorithms, CNN needs relatively little preprocessing and does not depend on feature design based on prior knowledge, which are its main advantages.
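To make the convolution and pooling layers concrete, the following sketch applies a one-dimensional convolution, a ReLU, and max pooling to a toy sequence. The signal, the two-tap kernel, and the pooling size are illustrative assumptions; practical CNNs learn their kernels and usually operate on much larger inputs.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form, as used in CNNs)."""
    width = len(kernel)
    return [sum(signal[i + k] * kernel[k] for k in range(width))
            for i in range(len(signal) - width + 1)]


def relu(values):
    """Keep positive responses and zero out the rest."""
    return [max(0, v) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


signal = [0, 0, 1, 1, 0, 0, 1, 1]   # toy input sequence
kernel = [-1, 1]                     # hypothetical kernel responding to rising edges

feature_map = conv1d(signal, kernel)         # [0, 1, 0, -1, 0, 1, 0]
pooled = max_pool(relu(feature_map), 2)      # [1, 0, 1]
print(feature_map, pooled)
```

The pooled output is shorter than the input yet still records where the pattern occurred, which is the property that lets a fully connected layer classify from it.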
Convolutional neural network has been applied to network security field with much promising progress. For example, Kolosnjaji et al. [
To detect attack indicators in advance, Saxe and Berlin [
Malicious web shell detection is an important means to protect network security. Aiming at analysis of HTTP requests, Zhang et al. [
To achieve robust performance in attack detection with CNN structure, an end-to-end encrypted traffic classification method based on one-dimensional CNN is presented by Wang et al. [
Workflow of the traffic analysis approach proposed by Wang et al. [
To handle the diversity of attacks in wireless network traffic and improve the ability to detect malicious intrusion in wireless networks, an intrusion detection method based on an improved convolutional neural network is proposed by Yang and Wang [
Low rate denial of service (LDOS) attacks reduce the performance of network services, and it is difficult to distinguish the attack behavior from the normal traffic. Thus, a new detection method of LDOS attack based on multifeature fusion and convolutional neural network (CNN) is proposed by Tang et al. [
Since the output of DNN and CNN depends only on the current input, without considering information from previous or future time steps, they achieve significant performance on classification or recognition tasks that lack time-varying characteristics. For time-dependent data, the recurrent neural network (RNN) is proposed as a special category of neural network structures, designed with a “memory” function to maintain previous content. In fact, this design coincides with the idea that “human cognition is based on the past experience and memory.” RNN is thus good at dealing with time-series information. However, RNN still suffers from structural problems such as vanishing or exploding gradients, which cause it to fail to remember or model long-term dependence. Therefore, researchers developed LSTM and GRU with gates and memory cells, which keep long-term relationships from being forgotten by passing important components of the information flow through.
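The “memory” function can be illustrated with a scalar recurrent unit whose hidden state is fed back at every time step. This is a minimal sketch: the update rule h_t = tanh(w_x x_t + w_h h_{t-1} + b) is the common textbook form, and the weight values are assumptions for illustration only; LSTM and GRU add gates on top of this recurrence.

```python
import math


def run_rnn(inputs, w_x=1.0, w_h=0.5, b=0.0):
    """Return the hidden state after each time step of a scalar RNN.

    w_x weights the current input and w_h weights the previous hidden state;
    setting w_h to 0 removes the memory entirely.
    """
    hidden = 0.0
    states = []
    for x in inputs:
        hidden = math.tanh(w_x * x + w_h * hidden + b)
        states.append(hidden)
    return states


with_memory = run_rnn([1.0, 0.0, 0.0])
without_memory = run_rnn([1.0, 0.0, 0.0], w_h=0.0)
print(with_memory[-1], without_memory[-1])
```

With the recurrent weight in place, the final state still carries a trace of the first input, whereas without it the final state depends on the last input alone. The trace shrinks at every step, which hints at why plain RNNs struggle with long-term dependence.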
Early, Staudemeyer [
Later, Krishnan and Raajan [
Similarly, Yin et al. [
Since LSTM solves the long-term dependency problem and overcomes the vanishing gradient problem during training, Kim et al. [
To reduce the high false alarm rate produced by the former methods, a system-call analysis method is proposed by Kim et al. [
Structure design of Kim et al. [
GRU is a variant of LSTM. In GRU-based classifiers, a softmax function is conventionally used as the final output layer, and a cross-entropy function is used to calculate the loss. Based on the GRU structure, Agarap [
In this subsection, we focus on the hybrid category of attack detection methods, which are designed to integrate the advantages of different deep learning structures.
Early in 2015, Li et al. [
Later in 2017, Ludwig [
Following the idea of fusing classifiers to obtain better results, Li et al. [
Workflow of the hybrid model proposed by [
In order to detect network attacks effectively, Liu et al. [
Most recently in 2019, Zhang et al. [
Many public datasets are commonly used to demonstrate and compare the efficiency and effectiveness of different attack detection methods. Among them, we describe two well-known benchmark datasets, that is, KDDCup 99 and NSL-KDD, which are widely used in academic research to evaluate the ability to detect attacks.
Although it has some drawbacks, such as containing a great deal of redundant training and testing data, the KDDCup 99 dataset is famous in the field of cybersecurity. It includes both labeled training data and unlabeled test data, which correspond to seven and two weeks of data, respectively, originating from the DARPA′98 IDS evaluation program [
The dataset contains five categories of labels: normal, denial of service, probe, remote to local, and user to root, where the four attack categories are abbreviated as DoS, Probe, R2L, and U2R. Here, normal refers to normal traffic instances; DoS is an attack in which the attacker tries to make the target machine stop providing service or resource access; Probe represents surveillance and probing; R2L refers to unauthorized access from a remote machine to a local one; and U2R represents unauthorized access to local superuser privileges by a local unprivileged user. In Table
Category of 22 different attacks contained by KDDCup 99.
Class label | Attack name |
---|---|
DoS | back, land, neptune, pod, smurf, teardrop. |
Probe | ipsweep, nmap, portsweep, satan. |
R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster. |
U2R | buffer_overflow, loadmodule, perl, rootkit. |
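For readers who want to collapse raw attack names into the four classes, the table can be expressed as a small lookup. This is a sketch under stated assumptions: the function name `class_label`, the handling of a trailing period on raw labels, and the `"unknown"` fallback for names outside the table are illustrative choices rather than part of the dataset definition.

```python
# Attack names grouped by class, copied from the table above.
ATTACKS_BY_CLASS = {
    "DoS": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
    "Probe": ["ipsweep", "nmap", "portsweep", "satan"],
    "R2L": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
    "U2R": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
}

# Inverted index: attack name -> class label.
CLASS_OF_ATTACK = {name: label
                   for label, names in ATTACKS_BY_CLASS.items()
                   for name in names}


def class_label(raw_label):
    """Map a raw record label to normal, DoS, Probe, R2L, U2R, or unknown.

    Assumption: raw labels may carry surrounding whitespace or a trailing
    period, so both are stripped before the lookup.
    """
    name = raw_label.strip().rstrip(".").lower()
    if name == "normal":
        return "normal"
    return CLASS_OF_ATTACK.get(name, "unknown")


print(class_label("smurf."), class_label("rootkit"), class_label("normal"))
```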
In KDDCup 99 dataset, each record has 41 features in total including basic features, content features, and traffic features as shown in Table
Feature set for each instance in KDDCup 99 dataset.
No. | Features | Types |
---|---|---|
1 | Duration | Continuous |
2 | protocol_type | Symbolic |
3 | Service | Symbolic |
4 | Flag | Symbolic |
5 | src_bytes | Continuous |
6 | dst_bytes | Continuous |
7 | Land | Symbolic |
8 | wrong_fragment | Continuous |
9 | Urgent | Continuous |
10 | Hot | Continuous |
11 | num_failed_logins | Continuous |
12 | logged_in | Symbolic |
13 | num_compromised | Continuous |
14 | root_shell | Continuous |
15 | su_attempted | Continuous |
16 | num_root | Continuous |
17 | num_file_creations | Continuous |
18 | num_shells | Continuous |
19 | num_access_files | Continuous |
20 | num_outbound_cmds | Continuous |
21 | is_host_login | Symbolic |
22 | is_guest_login | Symbolic |
23 | Count | Continuous |
24 | srv_count | Continuous |
25 | serror_rate | Continuous |
26 | srv_serror_rate | Continuous |
27 | rerror_rate | Continuous |
28 | srv_rerror_rate | Continuous |
29 | same_srv_rate | Continuous |
30 | diff_srv_rate | Continuous |
31 | srv_diff_host_rate | Continuous |
32 | dst_host_count | Continuous |
33 | dst_host_srv_count | Continuous |
34 | dst_host_same_srv_rate | Continuous |
35 | dst_host_diff_srv_rate | Continuous |
36 | dst_host_same_src_port_rate | Continuous |
37 | dst_host_srv_diff_host_rate | Continuous |
38 | dst_host_serror_rate | Continuous |
39 | dst_host_srv_serror_rate | Continuous |
40 | dst_host_rerror_rate | Continuous |
41 | dst_host_srv_rerror_rate | Continuous |
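Because the feature set mixes symbolic and continuous types, records are usually converted to numeric vectors before being fed to a neural network. The sketch below shows one common way to do this, one-hot encoding for a symbolic feature and min-max scaling for a continuous one. The preprocessing choice, the helper names, the `PROTOCOLS` vocabulary, and the value range passed in are assumptions for illustration; the reviewed methods may preprocess differently.

```python
def one_hot(value, vocabulary):
    """Encode a symbolic value as a 0/1 vector over a fixed vocabulary."""
    return [1.0 if value == entry else 0.0 for entry in vocabulary]


def min_max_scale(value, low, high):
    """Scale a continuous value into [0, 1]; a constant feature maps to 0."""
    if high == low:
        return 0.0
    return (value - low) / (high - low)


# Assumed vocabulary for the symbolic feature protocol_type.
PROTOCOLS = ["icmp", "tcp", "udp"]


def encode_record(record, duration_range):
    """Encode two of the 41 features (duration, protocol_type) as a vector."""
    low, high = duration_range
    return ([min_max_scale(record["duration"], low, high)]
            + one_hot(record["protocol_type"], PROTOCOLS))


example = {"duration": 5.0, "protocol_type": "tcp"}   # hypothetical record
vector = encode_record(example, (0.0, 10.0))          # [0.5, 0.0, 1.0, 0.0]
print(vector)
```

The same two helpers extend to the remaining symbolic and continuous features; in practice the scaling range would be taken from the training data only.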
NSL-KDD is well known as a refinement of the KDDCup 99 dataset, created to reduce the shortcomings of the previous dataset. Specifically, it not only removes redundant data from the training and test data to achieve a more accurate detection rate but also officially fixes the number of records in both the training and test data. Moreover, the number of records selected from each difficulty level group is inversely proportional to the percentage of records in that group in the original KDD dataset. Hence, evaluations and comparisons among different learning technologies become more effective and obvious.
NSL-KDD and KDDCup 99 are similar in structure, and both are divided into the four attack types mentioned before. The NSL-KDD dataset is divided into two parts, KDDTrain+ and KDDTest+, and we show the specific numbers corresponding to each attack type in Table
Records distribution in training and test data [
Class | KDDTrain+ | KDDTest+ |
---|---|---|
DoS | 45927 | 74588 |
Probe | 11656 | 2421 |
R2L | 995 | 2754 |
U2R | 52 | 200 |
In this subsection, we describe seven measurements: accuracy (ACC), precision (PR), true positive rate (TPR), recall (RE), false positive rate (FPR), true negative rate (TNR), and F1-score. Firstly, we define several items, where true positive (TP) and false negative (FN) refer to attack data that are correctly classified as attack or wrongly classified as normal, respectively, and false positive (FP) and true negative (TN) refer to normal data that are classified as attack or normal, respectively. Afterwards, we define measurements as follows:
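A minimal sketch of these measurements in Python, assuming the standard confusion-matrix definitions with attack as the positive class: ACC = (TP + TN) / (TP + TN + FP + FN), PR = TP / (TP + FP), TPR = RE = TP / (TP + FN), FPR = FP / (FP + TN), TNR = TN / (TN + FP), and F1 = 2 · PR · RE / (PR + RE). The function name, the returned dictionary keys, and the convention of returning 0 for an empty denominator are illustrative assumptions.

```python
def detection_metrics(tp, fn, fp, tn):
    """Compute the seven measurements from confusion-matrix counts.

    tp: attacks classified as attack     fn: attacks classified as normal
    fp: normal data classified as attack tn: normal data classified as normal
    """
    def ratio(numerator, denominator):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # identical to the true positive rate
    return {
        "ACC": ratio(tp + tn, tp + fn + fp + tn),
        "PR": precision,
        "TPR": recall,
        "RE": recall,
        "FPR": ratio(fp, fp + tn),
        "TNR": ratio(tn, tn + fp),
        "F1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts: 100 attack records and 100 normal records.
metrics = detection_metrics(tp=90, fn=10, fp=5, tn=95)
print(metrics)
```

Note that FPR and TNR always sum to 1, so a method reporting a low false alarm rate is equivalently reporting a high true negative rate.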
In Table
Quantitative evaluation of listed attack detection methods using different deep learning structures, where ID, MD, and TI represent network intrusion detection, malware detection, and traffic identification, respectively.
DL | Method | Usage | Dataset | ACC (%) | PR (%) | FPR (%) | F1-score |
---|---|---|---|---|---|---|---|
Convolutional AE | Yu et al. [ | ID | CTU-UNB | — | 98.44 | — | 0.980 |
Sparse AE | Javaid et al. [ | ID | NSL-KDD | 98.30 | — | — | 0.990 |
AE | Papamartzivanos et al. [ | ID | KDDCup 99 | 77.99 | 80.00 | — | — |
SAE | Farahnakian and Heikkonen [ | ID | KDDCup 99 | 94.71 | 94.53 | 0.42 | — |
AE | Shone et al. [ | ID | NSL-KDD | 89.22 | 92.97 | 10.78 | 0.910 |
Sparse AE | Shone et al. [ | ID | KDDCup 99 | 97.85 | 99.99 | 2.15 | 0.980 |
AE | Aygun and Yavuz [ | ID | NSL-KDD | 93.62 | 91.39 | — | 0.938 |
Denoising AE | Aygun and Yavuz [ | ID | NSL-KDD | 94.35 | 94.26 | — | 0.940 |
Sparse AE | Gharic et al. [ | ID | NSL-KDD | 96.45 | 95.56 | — | 0.965 |
AE | Yousefi-Azar et al. [ | ID, MD | NSL-KDD | 83.34 | — | — | — |
DBN | Gao et al. [ | ID | KDDCup 99 | 93.49 | 92.33 | 0.76 | — |
DBN | Ding et al. [ | MD | Netflow | 96.10 | — | — | — |
DBN | Qu et al. [ | ID | NSL-KDD | 95.25 | — | — | — |
DBN | Tan et al. [ | ID | Netflow | 97.60 | — | 0.90 | — |
DBN | Alom et al. [ | ID | 40% NSL-KDD | 97.50 | — | — | — |
DBN | Zhao et al. [ | ID | KDDCup 99 | 99.14 | 93.25 | 0.62 | — |
DBN | Alrawashdeh and Purdy [ | ID | 10% KDDCup 99 | 97.90 | 97.81 | 2.10 | 0.975 |
DNN | Tang et al. [ | ID | NSL-KDD | 91.70 | 83.00 | — | — |
DNN | Vinayakumar et al. [ | ID, MD | KDDCup 99 | 93.00 | 99.00 | 0.95 | — |
DNN | Wang et al. [ | ID | KDDCup 99 | 95.45 | — | — | — |
CNN | Kolosnjaji et al. [ | MD | Netflow | — | 93.00 | — | 0.920 |
CNN | Saxe and Berlin [ | MD | Netflow | 92.00 | — | 0.10 | — |
CNN | Wang et al. [ | ID | ISCX | — | 97.30 | — | 0.960 |
CNN | Wang et al. [ | TI | Netflow | 99.41 | — | — | — |
CNN | Tang et al. [ | ID | NS2 simulation | 97.1 | — | — | — |
CNN | Yang and Wang [ | ID | KDDCup 99 | 95.36 | 95.55 | 0.76 | 0.930 |
LSTM | Staudemeyer [ | ID | 10% KDDCup 99 | 93.85 | — | 1.62 | — |
RNN | Krishnan and Raajan [ | ID | KDDCup 99 | 77.55 | 84.60 | — | 0.730 |
RNN | Yin et al. [ | ID | NSL-KDD | 83.28 | — | — | — |
LSTM | Kim et al. [ | ID | 10% KDDCup 99 | 96.93 | 98.80 | 10.00 | — |
LSTM | Le et al. [ | ID | KDDCup 99 | 97.54 | 98.95 | 9.98 | — |
LSTM | Kim et al. [ | ID | KDDCup 99 | 99.80 | — | 5.50 | — |
GRU | Agarap [ | ID | Netflow | 84.15 | — | — | — |
Ensemble | Ludwig [ | ID | NSL-KDD | 92.50 | 93.00 | 0.92 | — |
AE, DBN | Li et al. [ | ID | KDDCup 99 | 92.10 | — | 1.58 | — |
DCNN | Naseer et al. [ | ID | NSL-KDD | 85.00 | — | — | 0.980 |
PL-CNN | Liu et al. [ | ID | DARPA1998 | 99.36 | 90.56 | — | 0.910 |
PL-RNN | Liu et al. [ | ID | DARPA1998 | 99.98 | 99.98 | — | 0.990 |
From Table
Essentially, it is interesting to point out that RBMs and AEs are popular in intrusion detection because we can pretrain them with unlabeled data and fine-tune them with only a small amount of labeled data. Taking the ACC values achieved by the listed methods as the first evaluation index owing to its completeness, we can find the best performance achieved by attack detection methods on the KDDCup 99 dataset, that is, 99.8% achieved by Kim et al. [
We can observe that performance of AE-based methods is uneven, where most of the improved AE-based methods obviously perform better than traditional AE-based methods. This is due to the fact that the structure of AE might lose important information during compression process. Meanwhile, improved AE could better capture important and informative parts of input data with additional designs. Similarly, LSTM-based and GRU-based methods outperform RNN-based methods, due to their features in structure design of gates and memory cells. In fact, such intelligent designs bring advantage of capability of maintaining long-term information, thus better modeling long-time relationship.
Due to the large number of DBN- and RNN-based (e.g., LSTM and GRU) methods for attack detection proposed by researchers, we would like to regard DBN- and RNN-based methods as typical unsupervised and supervised algorithms, respectively, where we further compare them to show the advantages and disadvantages of both groups.
Essentially, RNN can remember information from the last several moments and apply it in the calculation for the current unit, which introduces temporal information that helps classification accuracy. However, RNN is a powerful structure only with sufficient training instances, and attack data, especially for unknown attacks, are hard to obtain. Meanwhile, DBN is capable of automatically discovering feature patterns from input data. Moreover, the unsupervised DBN is less likely to overfit than supervised methods owing to its pretraining procedure, in which DBN learns inherent descriptions of abnormal behaviors or attacks from unlabeled data. This generative ability makes DBN, a typical unsupervised learning method, a good fit for the real environment of network security. Last but not least, DBN is easy to train, fast to converge, and low in running time, because it has fewer hidden layers than deep structures such as CNN. Therefore, we think unsupervised learning methods could produce better classification results than supervised learning methods, especially when facing small, imbalanced, or redundant datasets.
Deep learning uses cascaded layers in a hierarchical structure to perform data processing, which yields significant results in the domains of unsupervised feature learning and pattern recognition. Inspired by the performance of deep learning methods, we believe deep learning is important for the field of network security, and we therefore review the current deep learning methods for attack detection. We analyze recent methods, classify them according to different deep learning techniques, and compare the performance of the most representative methods.
Over the past few years, research on applying deep learning methods to attack detection has made great progress. However, many problems still exist. Firstly, it is challenging to adapt deep learning methods into real-time classifiers for attack detection. Most previous works only reduce the feature dimension to lower the computation cost during the feature extraction phase. Secondly, most deep learning techniques are designed for image analysis and pattern recognition. Thus, how to classify network traffic reasonably with deep learning techniques will be an interesting issue. Thirdly, with more data involved in the experiments, the classification results will be better [
The data used to support the findings of this study were supplied by Dabao Wei under license and so cannot be made freely available. Requests for access to these data should be made to Yirui Wu (
The authors declare that they have no conflicts of interest.
This work was supported by National Key R&D Program of China under Grant 2018YFC0407901, the Fundamental Research Funds for the Central Universities under Grant B200202177, the Natural Science Foundation of China under Grant 61702160, and the Natural Science Foundation of Jiangsu Province under Grant BK20170892.