Recurrent Neural Network Model Based on a New Regularization Technique for Real-Time Intrusion Detection in SDN Environments

Software-dened networking (SDN) is a promising approach to networking that provides an abstraction layer for the physical network. is technology has the potential to decrease the networking costs and complexity within huge data centers. Although SDN oers exibility, it has design aws with regard to network security. To support the ongoing use of SDN, these aws must be xed using an integrated approach to improve overall network security. erefore, in this paper, we propose a recurrent neural network (RNN) model based on a new regularization technique (RNN-SDR). is technique supports intrusion detection within SDNs. e purpose of regularization is to generalize the machine learning model enough for it to be performed optimally. Experiments on the KDD Cup 1999, NSL-KDD, and UNSW-NB15 datasets achieved accuracies of 99.5%, 97.39%, and 99.9%, respectively. e proposed RNN-SDR employs a minimum number of features when compared with other models. In addition, the experiments also validated that the RNN-SDR model does not signicantly aect network performance in comparison with other options. Based on the analysis of the results of our experiments, we conclude that the RNN-SDR model is a promising approach for intrusion detection in SDN environments.

1. Introduction
The current Internet architecture has existed for almost three decades and is now becoming a progressively complicated system. The Internet lacks the capacity to accommodate continually changing requirements and the demanding nature of present-day applications. Software-defined networking (SDN) [1] was introduced as an architecture permitting unparalleled compliance and scalability in the implementation and configuration of network services. The segregation of the data and control planes affords better control over traffic flow and flexibility. Real-time information acquisition via the OpenFlow protocol [2] is made possible by the flow-based nature of SDNs. However, the SDN architecture also presents numerous security challenges concerning the control-application interface, the control plane, and the control-data interface [3]. Consequently, SDN security has become a major issue and has gained critical significance [4,5].
An intrusion detection system (IDS) is a significant network security tool. An anomaly-based IDS attempts to recognize deviations from a baseline model. Much research has been performed on detecting anomalies in an SDN environment. While these studies showed strong results, they are limited in their applicability. Techniques proposed to detect anomalies have included Bayesian networks, support vector machines (SVMs), and artificial neural networks (ANNs), but these proposals have suffered from excessive computational cost and a high false alarm rate (FAR) [6]. Lately, traditional machine learning methods have been replaced by a new approach, called deep learning (DL), that gives better accuracy than traditional machine learning. DL can extract deep features in order to obtain high-level features. In SDN environments (resource-constrained networks), DL has potential due to its adaptability.
Drawing on cutting-edge research in the field of anomaly detection, the recurrent neural network (RNN) is one of the most popular methods of performing classification and other analysis on sequences of data. In addition, it is a powerful technique that can show remarkable results in sequence learning and in improving the anomaly detection rate in an SDN environment. In this paper, we propose an RNN model based on a new regularization technique, RNN-SDR. RNN-SDR is tested within an SDN controller against the KDD Cup 1999, NSL-KDD, and UNSW-NB15 datasets. The major contributions of this paper are as follows: (1) We introduce the design and implementation of an IDS in an SDN environment using an RNN based on a new regularizer that decays the weights according to the standard deviation of the weight matrices, and we compare the results against its parents. To the best of our knowledge, this is the first model that has achieved a high accuracy for IDS in an SDN environment in terms of throughput and latency. It should be noted, however, that it is slightly slower than the Beacon controller.
The rest of this paper is organized as follows. First, the applicability of deep learning in the domain of intrusion detection is established, followed by a review of related work in this field. The system developed in this research is described, and the datasets used to evaluate the system are presented. The system's intrusion detection and network performance are presented, analyzed, and compared with the state of the art. Finally, the paper is concluded.

Deep Learning for Intrusion Detection
Deep learning is an advanced field of machine learning (ML) that allows the creation of models with discriminative powers that exceed other statistical ML methods. The basic algorithms of deep learning are deep neural networks (DNNs) that operate across several connected layers. The layers are linked so that each layer takes its inputs from the previous layer and transforms those inputs in a hidden way. These algorithms have the advantage of being able to extract discriminative features from data in a hierarchical fashion, in a way that best represents the data without resorting to handcrafting.
For intrusion detection, features such as protocol_type, duration, service, and flag are fed to the neural network and pass through several layers. Every layer in the neural network works as a transformation of features. Each feature becomes more discriminative after passing through a hidden layer. The features pass through several hidden layers and, in the last output layer, the outcome of the neural network is compared with labels attached to the original data to determine whether the network has detected attack types such as DoS, Probe, and U2R. Owing to their discriminative power, deep learning approaches have been used by many authors [7,8] for network intrusion detection, and the area is still open for quality research.

Related Work
Before the development of DNN variants, classical ML algorithms, such as random forest (RF), SVM, ANN, and k-nearest neighbors (KNN), were used by various researchers to develop IDSs [9][10][11][12]. However, these methods have inherent limitations. In particular, they rely on a large set of features from traditional networks that cannot be applied to SDNs.
Anomaly detection in SDN using flow-based IDS was studied in [13,14]. The self-organizing map (SOM), used by Braga et al. [13], is considered to be a lightweight approach for detection of distributed denial of service (DDoS) attacks in SDNs. The accuracy of this approach was found to be very high based on six traffic flow features. Four traffic anomaly detection algorithms, namely NETAD, maximum entropy, threshold random walk with credit-based (TRW-CB) rate limiting, and rate limiting, were used in [14] in an SDN environment for anomaly detection. Their simulations showed that these algorithms produced promising results with low overhead in small networks. Other algorithms to detect DDoS attacks, such as SVM, were used in [15,16] and produced better results. An ensemble of graph theory algorithms based on KNN was used by AlEroud and Alsmadi to detect anomalous flow in SDNs [17]. An algorithm based on the variation of the entropy of the destination IP addresses of the flow in an SDN was proposed by Mousavi et al. [18] to detect early DDoS attacks. The sensitivity was about 96% for 250 packets at the start. Similarly, in [19], a DL-based approach using a stacked autoencoder was used. Their algorithm's performance, evaluated on their own dataset, was quite satisfactory, with high accuracy and low FAR. In [20,21], the authors applied a DNN and a gated recurrent unit recurrent neural network (GRU-RNN) in an SDN environment. They achieved an average accuracy of 75.75% for the DNN and 89% for the GRU-RNN using six basic features of the NSL-KDD dataset.
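To make the entropy-based idea attributed to [18] more concrete, the following is a minimal sketch of computing the Shannon entropy of destination IP addresses over a window of packets. The function names, the example addresses, and the threshold value are hypothetical and chosen only for illustration; the window size and threshold used in [18] are not stated here. The intuition is that when traffic converges on a single victim, the destination distribution becomes concentrated and the entropy drops.

```python
import math
from collections import Counter


def destination_entropy(dst_ips):
    """Shannon entropy (in bits) of the destination addresses in one window."""
    counts = Counter(dst_ips)
    total = len(dst_ips)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_suspicious(dst_ips, threshold):
    """Flag a window whose entropy falls below an assumed threshold."""
    return destination_entropy(dst_ips) < threshold


# Hypothetical windows of eight packets each.
normal_window = ["10.0.0.%d" % i for i in range(1, 9)]  # eight distinct targets
attack_window = ["10.0.0.5"] * 7 + ["10.0.0.2"]         # traffic converges on one host

# The threshold of 1.0 bit is an illustrative assumption, not a value from [18].
normal_flag = is_suspicious(normal_window, threshold=1.0)
attack_flag = is_suspicious(attack_window, threshold=1.0)
```

In this toy case, the evenly spread window is not flagged, while the concentrated window is.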

Methodology and System Description
In this section, we review the RNN and the new regularizer. Then, we describe, in detail, the architecture of the SDN-based IDS. Finally, we discuss the KDD Cup 1999, NSL-KDD, and UNSW-NB15 datasets.

Recurrent Neural Networks.
The RNN architecture adds sequential information to the feedforward neural network. The RNN performs the same task for each element of the sequence. This is why it is called a recurrent network; the output is dependent upon the previous computation. The hidden state of the RNN is computed as given below:

$$H_t = \sigma\left(W x_t + V H_{t-1} + b_H\right), \quad (1)$$

where $H_t$ denotes the hidden state vector at time $t$; $\sigma$ is the activation function, also known as the nonlinearity function; $W$ is the input-to-hidden weight matrix; $V$ is the hidden-to-hidden weight matrix; $x_t$ is the input vector at time $t$; and $b_H$ is the bias term.

A New Regularization Technique.
The regularization technique used in this paper is based on taking the standard deviation of the weight matrix and multiplying it by λ to yield the regularization term. The motivation behind this is to create an adaptive form of weight decay. The formalization is given in equations (2) and (3), where k is the number of rows in the weight matrix, i indexes the ith row of the weight matrix, σ represents the standard deviation, λ is the regularization parameter that acts as a penalty to prevent weights from reaching high values during the training process, and n is the number of columns in each ith row of the weight matrix.
The models were trained using the Nesterov Adam optimizer with tanh activation functions. Each model was trained over 100 epochs with a batch size of 32. The labeled data were classified with a feedforward network.
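One plausible reading of this regularizer can be sketched as follows: compute the standard deviation of each of the k rows of the weight matrix, combine the per-row values, and scale the result by λ. The function names, the choice of a sum (rather than a mean) across rows, and the use of the population standard deviation (dividing by n) are assumptions made for illustration and are not confirmed by the text.

```python
import math


def row_std(row):
    """Population standard deviation of one row of weights (divides by n)."""
    n = len(row)
    mean = sum(row) / n
    return math.sqrt(sum((w - mean) ** 2 for w in row) / n)


def sdr_penalty(weight_matrix, lam):
    """Assumed form of the penalty: lam times the sum of per-row deviations."""
    return lam * sum(row_std(row) for row in weight_matrix)


# Toy weight matrix with k = 2 rows and n = 2 columns.
weights = [[1.0, 3.0], [2.0, 2.0]]
penalty = sdr_penalty(weights, lam=0.1)

# The penalty would be added to the data loss during training.
data_loss = 0.42  # hypothetical value
total_loss = data_loss + penalty
```

Under this reading, a row whose weights are all equal contributes nothing to the penalty, so the term penalizes the spread of the weights rather than their magnitude directly.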

System Architecture.
The proposed IDS algorithm is embedded in the SDN controller as an application since the SDN controller is not designed to analyze network traffic in depth like an IDS. This paper focuses on the use of the SDN paradigm as a network infrastructure for the IDS. The architecture of SDN is given in Figure 1. This architecture has three parts:
(i) Flow Collector. This module collects the packets from the flow and extracts all the information required by the trained IDS algorithm, such as the protocol used, the IP addresses, and the port. All these details are passed to the controller.
(ii) Anomaly Detector. This module collects the data from the flow collector and loads the proposed IDS algorithm to check the packet for anomalies. The proposed IDS based on the new regularizer is the heart of this module.
(iii) Anomaly Mitigator. This module is dependent upon the decision of the anomaly detector. The anomaly mitigator will either drop or forward the packet based on the results conveyed by the detector.
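The interaction of the three modules can be sketched as a simple pipeline. This is only an illustration of the control flow described above, not the controller application itself: the class, the function names, the packet dictionary layout, the 0.5 threshold, and the toy classifier standing in for the trained RNN are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class FlowRecord:
    """Fields the flow collector extracts for the detector."""
    protocol: str
    src_ip: str
    dst_ip: str
    dst_port: int


def collect_flow(packet: Dict[str, Any]) -> FlowRecord:
    """Flow collector: pull out the details the IDS needs."""
    return FlowRecord(
        protocol=str(packet["protocol"]),
        src_ip=str(packet["src_ip"]),
        dst_ip=str(packet["dst_ip"]),
        dst_port=int(packet["dst_port"]),
    )


def detect_anomaly(record: FlowRecord,
                   classifier: Callable[[FlowRecord], float],
                   threshold: float = 0.5) -> bool:
    """Anomaly detector: score the record with the supplied model."""
    return classifier(record) >= threshold


def mitigate(is_anomalous: bool) -> str:
    """Anomaly mitigator: drop or forward based on the detector's decision."""
    return "drop" if is_anomalous else "forward"


def handle_packet(packet: Dict[str, Any],
                  classifier: Callable[[FlowRecord], float]) -> str:
    record = collect_flow(packet)
    return mitigate(detect_anomaly(record, classifier))


def toy_classifier(record: FlowRecord) -> float:
    """Hypothetical stand-in for the trained model: flags telnet traffic."""
    return 0.9 if record.dst_port == 23 else 0.1


telnet_packet = {"protocol": "tcp", "src_ip": "10.0.0.1",
                 "dst_ip": "10.0.0.2", "dst_port": 23}
web_packet = {"protocol": "tcp", "src_ip": "10.0.0.1",
              "dst_ip": "10.0.0.2", "dst_port": 80}
```

Passing the classifier in as a parameter keeps the pipeline independent of the model, so a trained network could replace the toy function without changing the collector or the mitigator.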

Datasets.
In this paper, we targeted three benchmark datasets to test the proposed model. A brief description of these datasets is given below: (1) KDD Cup 1999 Dataset. This was the first dataset used in the third international KDD tools competition that was held with the collaboration of KDD-99. The data consist of "normal" data and anomalous data broken up into four different classes based on the attack types given in Table 1. In the dataset, features from the 36th column to the 40th column represent general purpose features, while the 41st feature to 47th feature are connection features [22]. The attacks, which are known by labels, are classified into the nine types given below:

Data Preprocessing.
The RNN only takes numerical data for training and testing. Therefore, the first step is to convert textual and nominal data into numerical data. For this purpose, the following steps were performed: ( where n represents the number of records and X represents a specific column in the dataset. Duplicate records were removed from the dataset to prevent the classifier from producing biased results.
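Two of the preprocessing operations mentioned above, converting nominal values to numbers and removing duplicate records, can be sketched as follows. The integer-code encoding scheme, the function names, and the sample records are assumptions made for illustration; the text does not specify which encoding was used.

```python
def encode_nominal(rows, nominal_columns):
    """Replace string values in the given column indices with integer codes."""
    mappings = {col: {} for col in nominal_columns}
    encoded = []
    for row in rows:
        new_row = list(row)
        for col in nominal_columns:
            codes = mappings[col]
            # Assign the next unused code the first time a value is seen.
            new_row[col] = codes.setdefault(row[col], len(codes))
        encoded.append(new_row)
    return encoded, mappings


def drop_duplicates(rows):
    """Remove repeated records while keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical records: duration, protocol_type, service, flag.
records = [
    [0, "tcp", "http", "SF"],
    [0, "udp", "domain_u", "SF"],
    [0, "tcp", "http", "SF"],   # duplicate of the first record
    [2, "tcp", "smtp", "REJ"],
]
encoded, mappings = encode_nominal(records, nominal_columns=[1, 2, 3])
clean = drop_duplicates(encoded)
```

Keeping the returned mappings allows the same codes to be applied to the test data, so that a given protocol or service is represented by the same number in both splits.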

Evaluation Metrics.
For all datasets, the AUC-ROC curve, true positive rate (TPR), true negative rate (TNR), false positive rate (FPR), and average validation accuracy were computed. Apart from these measures, the F-Score values were also calculated. All these measures can easily be calculated from the confusion matrix, which contains the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The mathematical representation of each measure is given below:
(i) Accuracy. It is the ratio of correctly classified packets to all classified packets:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
(ii) True Positive Rate. It is the percentage of actual intrusions that were correctly detected:
$$\mathrm{TPR} = \frac{TP}{TP + FN}$$
(iii) True Negative Rate. It is the percentage of packets that were nonintrusions and correctly classified as nonintrusions:
$$\mathrm{TNR} = \frac{TN}{TN + FP}$$
(iv) Precision. It is the ratio of truly anomalous packets to all packets that were marked as intrusions:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
(v) F-Score. It is the harmonic mean of true positive rate and precision:
$$\mathrm{F\text{-}Score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{TPR}}{\mathrm{Precision} + \mathrm{TPR}}$$
The AUC-ROC curve had a value of 1.0, as shown in Figures 6 and 7.
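As a worked illustration, the measures listed above can be computed directly from the four confusion-matrix counts using their standard definitions. The function name and the example counts are hypothetical, and the sketch assumes that every denominator is nonzero.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Standard measures derived from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn)          # sensitivity
    tnr = tn / (tn + fp)          # specificity
    fpr = fp / (fp + tn)          # equals 1 - TNR
    precision = tp / (tp + fp)
    f_score = 2 * precision * tpr / (precision + tpr)
    return {
        "accuracy": accuracy,
        "tpr": tpr,
        "tnr": tnr,
        "fpr": fpr,
        "precision": precision,
        "f_score": f_score,
    }


# Hypothetical counts: 90 attacks detected, 10 missed,
# 80 normal packets passed, 20 normal packets wrongly flagged.
metrics = confusion_metrics(tp=90, tn=80, fp=20, fn=10)
```

For these counts, the accuracy is 0.85 and the TPR is 0.9, while the precision is lower (90/110) because of the 20 false alarms; the F-Score of 180/210 reflects both.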

Detection Performance Measurement.
We compared the performance results of our proposed model with results from the literature. From Table 2, we can see that our proposed method outperforms other models in all the evaluation metrics for all classes and has the potential to be used for anomaly detection in an SDN environment. In Table 2, we compare precision (Pr), true positive rate (TPR), and F-Score for legitimate and anomaly classes. We computed all these measures for the three datasets we used. For legitimate classes, precision was 94.3%, 92.2%, and 93.5% for the KDD Cup 1999, NSL-KDD, and UNSW-NB15 datasets, respectively. Precision is the positive predictive value. In other words, it is the ratio of the instances correctly predicted as positive to all instances that the classifier predicted as positive. Therefore, we can say that our classifier is good enough to detect nonanomalous classes. Similar results were achieved for anomalous classes. The TPR measure represents the sensitivity of the classifier. The values indicate that the model is very accurate. Taken individually, TPR and precision are sometimes not considered good measures. The F-Score provides an alternate view, representing the harmonic mean of precision and sensitivity. The data in Table 2 indicate that our model achieved higher values for F-Score and, therefore, can be considered efficient. Hence, our model is more accurate and precise than other methods.
In addition, we compared our results with classical and neural network based methods used by different researchers. In terms of accuracy, our model outperforms all listed methods used for anomaly detection in an SDN environment (Table 3). For the three datasets, we achieved satisfactory accuracy. This accuracy comes at the cost of overhead and latency, as can be seen from Figures 8 and 9. The throughput is lower and the latency is higher, but the intrusion detection accuracy is also high.

Network Performance Analysis
In this section, we evaluate the effect of our RNN on network performance. The evaluation testbed is described in the first part, and then the network performance evaluation is presented.

Experimental Setup.
The Mininet emulator was used to test the learning model in the network as an intrusion detection system based on the new regularization technique.
Mininet is an open source Python-based network emulator that is used to create a virtual networking topology connecting virtual hosts via various devices such as switches, links, and controllers. It runs Linux network software and can support OpenFlow for custom routing and SDN.
As Mininet needs to be installed on a Linux server, we chose Oracle VM VirtualBox to carry out our simulations. The simulation was conducted on a system with 64-bit Ubuntu 18.04 LTS on a Core-i7 with 16 GB of RAM. The performance of the controller embedded with the proposed model was tested on various numbers of OpenFlow switches emulated by Cbench in Mininet. The performance of the proposed model was compared with the POX and Beacon OpenFlow controllers after training on the NSL-KDD and UNSW-NB15 datasets, in the throughput and latency running modes.

Analysis of Results.
We analyzed our results in terms of throughput and latency and then compared them with the existing POX and Beacon controllers. Figure 8 shows the throughput for POX, Beacon, and our proposed work. Based on our analysis, there is a slight difference in throughput between Beacon and the proposed work when the number of switches is 8, 128, or 256. When compared with the POX controller, the performance decreased by 2.702% for 8 switches. For 128 and 256 switches, the performance dropped 3.32% and 3.51%, respectively, when compared with the POX controller. We observed minimal difference in the performance of our proposed algorithm when compared with the Beacon controller. Average decreases in throughput of 0.63%, 1.22%, and 2.33% are seen, respectively, for 8, 128, and 256 switches. Hence, our proposed model gave satisfactory results in terms of throughput. Figure 9 illustrates latency on the NSL-KDD dataset. In this figure, it can be seen that performance decreases steadily as the number of switches increases. This decrease can be ascribed to the extra responsibility of analyzing and checking packets.
Similarly, the simulations were carried out for the model trained on the UNSW-NB15 dataset, and the results were analyzed in terms of throughput and latency. e results were compared with existing POX and Beacon controllers. Figure 10 shows the throughput for POX, Beacon, and our proposed work.
As can be seen, there is a small difference in throughput between the Beacon controller and the proposed work when the number of switches is 32, 64, or 256. When compared with the POX controller, the performance decreased by 1.4% for 32 switches. For 128 and 256 switches, the performance dropped 3.4% and 3.6%, respectively, when compared with the POX controller. When compared with the Beacon controller, there is minimal difference in the performance of our proposed algorithm. Average decreases in throughput of 1.6%, 1.3%, and 2.4% are seen, respectively, for 32, 128, and 256 switches. Hence, from these results, it is seen that our proposed model, trained on the UNSW-NB15 dataset, slightly lags in throughput but has a higher accuracy than the other models (Table 3).
For the latency presented in Figure 11, we observed the same findings as above. The performance decreases as the number of switches increases, just like for the other models. This decrease is due to the extra responsibility for checking and analyzing packets for intrusions. After analyzing the results, we can say positively that the proposed model has the potential for real-time anomaly detection in an SDN environment.

Method            Accuracy (%)
DNN [20]          75.75
SVM [23]          69.53
NB trees [23]     82.02
GRU-RNN [21]      89
Based on the analysis of the results above, we can see that there is a trade-off between network performance (throughput and latency) and security. When one of them improves, the other degrades. Hence, it is the responsibility of the network administrators to tune the network according to its requirements, either to make it more secure by adding the algorithm for analyzing packets or to make it faster.

Conclusion
In this paper, we present an anomaly-based IDS in an SDN environment using an RNN with a new regularization algorithm. We train the RNN-SDR on the KDD Cup 1999, NSL-KDD, and UNSW-NB15 datasets and show that our model outperforms other state-of-the-art algorithms with accuracies of 99.5%, 97.39%, and 99.9% for the KDD Cup 1999, NSL-KDD, and UNSW-NB15 datasets, respectively. Our scheme uses a minimum number of features compared with other state-of-the-art approaches. This makes the model more computationally efficient for real-time detection. In addition, the network performance evaluation shows that our proposed approach slightly affects the controller performance. This implies a trade-off between security and speed. Nevertheless, our model is practical for implementation in the context of an SDN.

Data Availability
All relevant data are included within the article.

Conflicts of Interest
The author declares no conflicts of interest.