DI-ADS: A Deep Intelligent Distributed Denial of Service Attack Detection Scheme for Fog-Based IoT Applications

Faculty of Engineering (Computer Science and Engineering), BPUT, Rourkela 769015, Odisha, India
Department of Computer Science and Engineering, Parala Maharaja Engineering College (Govt.), Berhampur 761003, Odisha, India
Amity School of Engineering and Technology, Amity University, Noida 201301, UP, India
Department of Computer Science & Engineering and University Center for Research and Development, Chandigarh University, Mohali 140413, Punjab, India


Introduction
The Internet of Things (IoT) is a rapidly emerging platform in today's world. It comprises different connected devices and applications such as smart vehicles, smart healthcare, smart grids, and surveillance. Its massive adoption has led to many attacks on connected devices during the exchange of resources. Because of its centralized nature, it is more exposed to hazards such as unpatched vulnerabilities, weak authentication, and weak APIs. Information on IoT devices is controlled by sensors and actuators over the cloud, which provides on-demand services to end users. However, serving IoT applications from the cloud has its own issues, namely high latency, data security, privacy, and interruption [1][2][3][4][5][6].
To mitigate these issues, another layer called fog has been developed, an emerging technology that sits between the cloud and end devices. It offers a decentralized computing model that serves end users with lower latency and better bandwidth utilization and brings data closer to the edge of the network. Even so, fog devices are highly prone to attacks that compromise data privacy. The most common attacks are DDoS, IP spoofing, man-in-the-middle, and port scan attacks [7]. To counter them, attack detection mechanisms are required in the fog architecture. The fog is built as an intermediate layer between IoT and cloud to give end users benefits such as low latency and mobility [2]. IoT devices require communication protocols to communicate with other connected devices. For instance, consider a surveillance system in which huge amounts of data are collected and stored in the cloud for future retrieval and IT services, although only summarized data is needed in the cloud to avoid wasting bandwidth and storage capacity. To achieve low latency and effective utilization of bandwidth and storage, non-cellular network protocols (CoAP, MQTT, LoRa, and LoRaWAN) are required for communication [5,8,9,10].
Data generated by IoT devices is communicated and stored near the devices by the fog using the above protocols for fast access to resources. In this case, attacks on fog nodes or on the fog layer become more likely, so attack detection must be defined in the fog layer to create a secure system. With the growth in Internet usage and in the amount of data transferred, more anomalies may be present, and the number of attacks is also increasing steadily. Many organizations work continually on network attack detection to provide secure services to end users. The use of cloud services and IoT over the fog layer also increases the risk of data violation. To design a system that is more secure than legacy methods, high-end DL algorithms can be used to identify attacks dynamically. As the Internet keeps growing, DL approaches are widely used to predict, detect, classify, and analyze network behavior. Hence, attack detection has become a recent trend and an active research area for cyber threats.
Fog layer security has become challenging because of its geo-distribution and location awareness. Initially, machine learning (ML) techniques were widely used to detect attacks, but they are unsuitable for huge volumes of data. To overcome this limitation, DL is used to detect attacks in the fog layer. DL is preferable to ML for large data because it processes data through multiple layers. DL has been used to classify many attacks with a high detection rate, producing both binary classification of normal and abnormal behavior and multilabel classification, and the result is sent to the cloud to update the behavior of a node [1,4,7,11,12]. Because IoT devices are resource constrained, complex DL algorithms cannot be run on them; DL is therefore implemented on the fog node or fog layer, where it achieves high accuracy. Hence, DL is preferred over ML algorithms for huge volumes of data.
In this work, DDoS attacks are detected for fog-based IoT applications using a deep learning model (DLM). The proposed deep intelligent DDoS attack detection scheme (DI-ADS) is based on fog and DL. With this framework, a DDoS attack is detected in less time and with lower latency than with cloud-based attack detection. To find the best model for attack detection, we tested two DLMs and four ML models at the computational module of the fog node using the DDoS-SDN dataset. The dataset involves three types of DDoS attacks, namely, TCP SYN, UDP flood, and ICMP attacks. Based on accuracy, the DNMLP model is the best model for attack detection and is therefore installed at the fog node. Communication between end IoT devices and the fog is routed through gateways, and fog nodes detect attacks by classifying IoT data.
The framework is implemented in six stages. In the first stage, the network is set up as three layers (IoT device layer, fog layer, and cloud layer) connected through interfaces. The second stage, network traffic classification setup, selects the best model. In the third stage, model deployment and network initialization are done by deploying the selected model on the fog node for attack detection. The fourth stage specifies how the fog node classifies the behavior of IoT devices and reports it to the cloud. In the fifth stage, the cloud updates the current behavior based on the information received from the fog. The sixth stage updates the local table at the fog for future communication and attack detection.
The major contribution of this work is stated as follows.
The rest of the paper is organized as follows. Section 2 presents the related works. Section 3 presents the system model. Section 4 presents the problem statement. Section 5 presents the proposed DI-ADS model. Section 6 presents the simulations and results. Section 7 presents the conclusion and future scope.

Related Works
In this section, research works related to this topic are discussed; they apply DL techniques to propose attack detection frameworks for fog-based IoT systems. Ahmed et al. [1] proposed a DL-based attack detection framework for several cyberattacks that achieves a high detection rate, with 99.96% detection accuracy (DA) in binary classification and 99.65% DA in multiclass classification. Lawal et al. [2] designed a two-module framework for anomaly detection using signature-based and anomaly-based methods. Module 1 is six times faster than module 2. Module 2 uses the XGBoost classifier for binary and multiclass classification to obtain an accuracy of 99% and 97% for average recall, precision, and F1-score, respectively. Puthal et al. [3] discussed a three-layer architecture, the possible threats at each layer and ways to counter them, and the advanced research issues that the present architecture raises. Khater et al. [8] presented a lightweight IDS using an MLP model on a vector space representation. The problems of cloud latency, mobility support, and location awareness are addressed using the ADFA-LD and ADFA-WD datasets, resulting in 94%, 95%, and 92% accuracy, recall, and F1-measure, respectively. On the ADFA-LD dataset, 74% accuracy, recall, and F1-measure are obtained using a Raspberry Pi. Bhushan and Deepali [14] proposed a framework for defense against DDoS attacks, generating TCP traffic using LOIC on a Kali Linux machine. A fog defender applies rules at the fog layer so that only legitimate requests are allowed to access the cloud. Priyadarshini and Barik [15] proposed a novel source-based DDoS defense mechanism that uses DLMs to mitigate DDoS attacks by blocking infected packets from being forwarded to the cloud, evaluated on the CTU-13 Botnet and ISCX 2012 IDS datasets. The training and testing data are taken in a 90 : 10 ratio with a 10-fold cross-validation scheme, resulting in 98.88% accuracy.
Chaudhary et al. [11] surveyed various systems and explored existing work in terms of security, privacy, limitations, challenges, and open research directions in the domain of computing. Douligeris and Mitrokotsa [7] presented a classification of DDoS defense systems, their pros and cons, and effective defense mechanisms and techniques for better understanding. Potluri et al. [12] discussed DDoS attacks and their detection and prevention mechanisms in cloud computing environments using approaches such as ML, DL, NN, blockchain, SDN, and genetic algorithms. Kalaivani and Chinnadurai [16] proposed an intrusion classification model using CNN and LSTM to predict attacks accurately. The dataset used is NSL-KDD, and the integrated CNN with LSTM FCID model obtains 96.5% accuracy on attack detection. The model is deployed in the fog layer, where it monitors network traffic and protects against malicious users while the cloud provides services to the IoT devices. It is used for multiclass attack classification such as DoS, U2R, R2L, and probe attacks. The ICNN-FCID model was evaluated with different activation functions, namely ReLU, Sigmoid, and Hyperbolic Tangent (Tanh), of which ReLU gave the highest accuracy. Churcher et al. [17] compared several ML algorithms, namely KNN, SVM, DT, NB, RF, ANN, and LR, for both binary and multilabel classification using accuracy, precision, recall, F1-score, and log loss. RF reaches an accuracy of 99% for the HTTP DDoS attack. Based on the simulation results for these parameters, RF performs best in binary classification, whereas KNN performs best in multilabel classification, with an accuracy of 99% compared to RF.
Kilincer et al. [18] used the CSE-CIC IDS-2018, UNSW-NB15, ISCX-2012, NSL-KDD, and CIDDS-001 datasets with three ML algorithms, SVM, KNN, and DT, for attack classification, applying min-max normalization to the datasets for a comparative study. The results show that the DT classifier is more successful than the other two classifiers. Many related research works can also be found in [19][20][21][22][23][24][25][26][27][28][29][30].
From the literature survey on various attack detection systems, the following research gaps are found:
(i) Most papers evaluated accuracy on a small dataset with few attributes, which does not provide realistic information on attack detection. Hence, we use the DDoS-SDN dataset [13], which has a greater number of attributes, for better classification of attacks.
(ii) The second flaw observed is that attacks are classified using only one classifier model, which makes it difficult to tell which ML method performs best on the selected dataset. Also, conventional ML algorithms are used for attack detection, which becomes difficult with larger datasets. In this work, we use DLMs for performance evaluation.
(iii) Many studies reported a limited number of attacks, depending on the dataset used. Here, we consider a dataset that covers different DDoS attack types.

System Model
In this section, the system model is discussed, comprising the network model and the attack model. The network model describes the network components, the network topology, and the communication between the network components. The attack model describes how the attackers attack the network. The notations used in this work are shown in Table 1.

Network Model.
The network model consists of a three-layer architecture with the cloud as the upper layer, fog as the middle layer, and IoT or smart devices as the lower layer [1-3, 17, 18], as shown in Figure 1. The upper layer, called the cloud layer, consists of a cloud node $C$ that provides centralized data storage holding the updated behaviors of the IoT devices. The cloud $C$ also updates the IoT devices at the lower layer at regular intervals with the behaviors (normal/DDoS attacker) of the IoT devices. The cloud layer is connected to the middle layer through a gateway (GT) and base stations (BSs) using wired/wireless communications.
The middle layer, called the fog layer, consists of fog nodes $FN_1, FN_2, \ldots, FN_n$ that provide a considerable amount of storage, computation, and local communication. These nodes serve the IoT devices in proximity and also record the behavior of the devices in a timely manner. A fog node mainly consists of a computation module ($CM_{FN}$) and a memory module (E). The $CM_{FN}$ of a fog node is trained with a DLM to predict the behaviors of the IoT devices that communicate with the fog node in proximity. The fog nodes are also connected to each other through wired/wireless communications for data exchange. The fog layer is connected to the upper layer, and also to the lower layer, through GT and BSs using wired/wireless communications.
Mathematical Problems in Engineering
The lower layer, called the IoT device layer, comprises IoT devices $iot_1, iot_2, \ldots, iot_n$ that send a large volume of end-user data or requests to the fog or cloud for fast computation and service. The IoT devices use BSs and GT to communicate with the fog layer or cloud layer. They have limited storage and computational capability.

DDoS Attack Model.
In this section, the attack model is discussed, in which an attacker $A_i$ attacks the network components in some form to gain control over the network or to disrupt the services provided by the network components. In this model, we assume that the IoT devices at the lower layer communicate with the fog nodes to obtain services from the cloud or from fog nodes in proximity.
As shown in Figure 2, a DDoS attacker $A$ takes control of the IoT devices in the network by hacking them. The IoT devices are then compromised. The attacker overwrites the normal function of the IoT devices with attack code, so that they behave as malicious nodes according to the attack. The compromised IoT devices attack the fog nodes in many ways, such as sending unnecessary requests or signals to jam the network, performing malicious activity, or taking control of the network [14,15,31]. In this work, we consider three DDoS attacks, namely TCP SYN, UDP flood, and ICMP attacks, using a new dataset [13] for training and testing.

Problem Statement
As per the above model, there are $n$ IoT devices $I = \{iot_1, iot_2, \ldots, iot_n\}$ that communicate with a fog node FN through communication behaviors $cin_1, cin_2, \ldots, cin_m$, where $m$ is the number of communication instances made by an IoT device with the fog node FN. Each communication instance is a set of attributes or features $Ft = \{ft_1, ft_2, \ldots, ft_p\}$, where $p$ is the number of attributes or features, such as source IP, destination IP, packets sent, packets received, and acknowledgement, together with a target label attribute that is attack (1) or normal (0). The problem is to predict the communication instance $cin_m$ as attack (1) or normal (0) with high accuracy by implementing a DLM. Before this, different DLMs must also be trained and tested on a standard dataset so that the most appropriate DLM can be selected for the fog nodes.

Proposed DI-ADS Scheme
In this section, the proposed DI-ADS scheme is described to solve the above problem. The DI-ADS scheme consists of six steps: (1) Network Setup, (2) Network Traffic Classification Setup, (3) DLM Deployment and Network Initialization, (4) Attack Detection, (5) Cloud Update, and (6) Fog Node Update. The process flow model of DI-ADS with its six steps is presented in Figure 3. The steps of the DI-ADS scheme are described as follows.

Network Setup.
In the first step of the DI-ADS process flow model shown in Figure 3, the network is set up and the components of the model are connected. The cloud node $C$ is set up first; it provides different services to users such as computing, storage, platform, and networking. In DI-ADS, the cloud node stores the IoT devices' behaviors and updates them promptly. The fog nodes are then placed in the network so that the IoT devices can obtain services in minimum time. The fog nodes also resolve issues and provide services to the IoT devices in proximity. The fog nodes are connected to each other in a wired/wireless manner. The IoT devices at the lower layer are connected to the fog nodes in proximity using BSs and GT. They send and receive data wirelessly using 4G/LTE/3G/WiMAX communications, as do the BSs and GT. The placement of fog nodes and cloud nodes in any region is out of the scope of this work. We focus only on how the layers are connected and how communication takes place between the network components.

Network Traffic Classification Setup.
After setting up the network and the connections of each layer, the fog nodes are set up for network traffic classification. The models are first trained using a standard dataset for DDoS attack detection. The dataset must be preprocessed before training; the preprocessing steps are handling missing values, feature scaling, one-hot encoding, and feature selection. The data preprocessing steps are shown in Figure 4 and discussed below.

Data Preprocessing.
The dataset used is preprocessed with the steps listed above [16,32]. In addition, features containing zero values can be removed for better feature selection. Figure 4 shows the data preprocessing, training, and testing performed using the DLM in the fog node. The dataset splitting process is discussed in full in the performance evaluation section.
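The preprocessing steps named in this subsection (handling missing values and one-hot encoding) can be illustrated with a small standard-library sketch. This is only an illustration under assumptions: mean imputation is one possible way to handle missing values and is not stated in the paper, the helper names `fill_missing` and `one_hot` are hypothetical, and the toy rows are invented.

```python
def fill_missing(rows, column):
    """Replace None in a numeric column with the mean of the observed values.

    Mean imputation is an assumption; the paper only says that missing
    values are handled.
    """
    observed = [r[column] for r in rows if r[column] is not None]
    mean = sum(observed) / len(observed) if observed else 0.0
    return [{**r, column: mean if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded


# Toy rows; the values are illustrative and not taken from the dataset.
rows = [
    {"Packet_count": 10.0, "protocol": "TCP", "label": 0},
    {"Packet_count": None, "protocol": "UDP", "label": 1},
    {"Packet_count": 30.0, "protocol": "ICMP", "label": 1},
]
clean = one_hot(fill_missing(rows, "Packet_count"), "protocol")
```

After this step every row has a numeric `Packet_count` and three binary protocol columns in place of the categorical one.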

DLM Used for Prediction of DDoS Behavior.
After splitting the dataset, $FN_i$ is trained on the training dataset using DLMs. In this section, we discuss the DLMs used for predicting the behavior of IoT devices [1]. Two DL models are used: DNMLP and LSTM.
(1) DNMLP: A DNMLP model consists of at least three layers, namely, an input layer, a hidden layer, and an output layer, and may have an arbitrary number of hidden layers. All neurons except those in the input layer use a nonlinear activation function. In DNMLP, data flows in the forward direction to be classified, and the neurons are trained with the backpropagation algorithm. In the first step of the DNMLP model, each input value $a_i$ is multiplied by its weight $w_i$ and the products are summed:
$$\sum_{i=1}^{n} a_i w_i = a_1 w_1 + a_2 w_2 + \cdots + a_n w_n. \quad (1)$$
In the second step, the bias $b$ of the hidden layer is added:
$$Z = \sum_{i=1}^{n} a_i w_i + b.$$
In the third step, the obtained value $Z$ is passed through the activation functions ReLU and Softmax, whose output is generally denoted by $\hat{y}$. ReLU is defined as
$$\hat{y} = \max(0, Z),$$
that is, if $Z < 0$ the function outputs zero, and if $Z \geq 0$ the output is simply the input. Softmax can then be defined as
$$\mathrm{Softmax}(Z_j) = \frac{e^{Z_j}}{\sum_{k} e^{Z_k}},$$
where the sum runs over the output neurons. In the fourth step, the loss $(y - \hat{y})^2$ is calculated, where $y$ is the target label; if the loss is high, it is minimized by changing $w_i$ and $b$ using an optimizer, and the cost function is calculated as $\sum_{i=1}^{n} (y - \hat{y})^2$. Repeating this backpropagation for a certain number of iterations, we arrive at the global minimum, which we treat as the completion of DNMLP training.
The DNMLP architecture is shown in Figure 5.
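The four DNMLP steps described above (weighted sum, bias, activation, squared loss) can be traced on a single neuron with a few lines of plain Python. This is a minimal sketch for illustration: the function names are hypothetical, the numbers are invented, and subtracting the maximum inside `softmax` is a standard numerical-stability trick rather than something the paper states.

```python
import math


def weighted_sum(inputs, weights, bias):
    """Steps 1 and 2: Z = sum(a_i * w_i) + b."""
    return sum(a * w for a, w in zip(inputs, weights)) + bias


def relu(z):
    """Step 3 (hidden layer): 0 for z < 0, z itself otherwise."""
    return max(0.0, z)


def softmax(zs):
    """Step 3 (output layer): exponentials normalised to sum to 1."""
    m = max(zs)  # subtracted for numerical stability; does not change the result
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def squared_loss(y_true, y_pred):
    """Step 4: (y - y_hat)^2 for one sample."""
    return (y_true - y_pred) ** 2


# Illustrative values, not taken from the paper.
a = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 0.1
z = weighted_sum(a, w, b)    # 0.5 - 2.0 + 0.75 + 0.1
hidden = relu(z)             # negative pre-activation is clipped to 0
probs = softmax([2.0, 1.0])  # two-class output probabilities
loss = squared_loss(1.0, 0.5)
```

An optimizer would then adjust `w` and `b` to reduce the loss; that update step is omitted here.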
(2) LSTM: LSTM is explicitly designed to overcome the long-term dependency problem of the Recurrent Neural Network (RNN) in DL [1,9,10]. It is used for classifying data and making predictions. Every LSTM unit is composed of four parts: a cell state, an input gate, a forget gate, and an output gate. It is used in language modelling, network anomaly detection, image captioning, etc. LSTM can retain information over long runs and is therefore widely used for classifying data. The forward flow of an LSTM cell is expressed in terms of $f_t$, $i_t$, $g_t$, $cs_t$, $o_t$, and $h_t$, which denote the forget gate, input gate, input node gate, cell state, output gate, and hidden activation, respectively.
The LSTM architecture is shown in Figure 6.
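For reference, the commonly used textbook formulation of one LSTM step is given below using the gate names of this section. It is the standard form and is offered as an assumption, not as the exact equations of the paper; the input $x_t$, the weight matrices $W$ and $U$, the biases $b$, the logistic function $\sigma$, and the elementwise product $\odot$ are notation introduced here.

```latex
\begin{aligned}
f_t  &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
i_t  &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
g_t  &= \tanh\left(W_g x_t + U_g h_{t-1} + b_g\right) \\
cs_t &= f_t \odot cs_{t-1} + i_t \odot g_t \\
o_t  &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
h_t  &= o_t \odot \tanh\left(cs_t\right)
\end{aligned}
```

The forget gate scales the previous cell state, the input gate scales the new candidate $g_t$, and the output gate decides how much of the updated cell state is exposed as $h_t$.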

Used Dataset.
The DLMs were evaluated on a new standard dataset to detect the different DDoS attacks and classify end-user behavior (normal/attacker). The DDoS-SDN dataset is taken from Mendeley Data and has 104,345 rows with 23 attributes [13], including Switch_id, Packet_count, byte_count, and others. The dataset is used to identify the traffic type as benign or malicious based on the TCP SYN attack, UDP flood attack, and ICMP attack. Traffic is labeled 0 for benign and 1 for malicious. The dataset is customized to 18 attributes, of which 17 are features and 1 is the target variable. The binary target label is 0 (normal user) or 1 (attacker). The dataset contains one categorical attribute, named protocol, which is one-hot encoded.

Deep Learning Model Deployment and Network Initialization.
After training the above models with the standard dataset, we need the DLM that provides the highest accuracy in predicting the behavior of an IoT device (attack or normal). We measured the accuracy of each trained and tested model and selected the model with the maximum accuracy. The selected model is implemented and deployed at the $CM_{FN}$ of the fog nodes in the fog layer. Once the model is deployed successfully in the fog layer, the network is initialized for real-time processing, and the IoT devices start communicating with the fog nodes to obtain services as required. Algorithm 1 shows the selection of the best DLM. Algorithm 2 shows the deployment of the DLM at the fog layer and the network initialization for starting the network.
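The selection rule described in this step, keeping the model with the maximum test accuracy, can be sketched as follows. This is an illustrative sketch and not a reproduction of Algorithm 1: the function name `select_best_model` is hypothetical, and the candidate names and accuracy values are placeholders rather than results from the paper.

```python
def select_best_model(accuracies):
    """Return the (name, accuracy) pair with the highest test accuracy."""
    if not accuracies:
        raise ValueError("no candidate models were evaluated")
    best_name = max(accuracies, key=accuracies.get)
    return best_name, accuracies[best_name]


# Placeholder scores for three hypothetical candidates (not paper results).
candidate_accuracy = {"model_a": 0.91, "model_b": 0.97, "model_c": 0.88}
best_name, best_accuracy = select_best_model(candidate_accuracy)
```

The returned model would then be the one deployed on the computation module of each fog node.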

Theorem 1.
The total service time for an IoT device $iot_i$ is represented as $TST_{iot_i}$.

Proof.
Consider an IoT device $iot_i$ whose nearest fog node in proximity is $FN_i$. First, $iot_i$ sends a request to $FN_i$. The time to send the request is calculated as follows:
$$t_{iot_i-FN_i} = t_{iot_i-BS} + t_{BS-GT} + t_{GT-FN_i},$$
where $t_{iot_i-BS}$ is the time to send the request from $iot_i$ to BS, $t_{BS-GT}$ is the time to send the request from BS to the GT, and $t_{GT-FN_i}$ is the time to send the request from GT to $FN_i$. Then, the time required by $FN_i$ for processing the request is denoted by $t^{processing}_{FN_i}$ as follows:
$$t^{processing}_{FN_i} = t^{queuing}_{FN_i} + t^{computation}_{FN_i},$$
where $t^{queuing}_{FN_i}$ is the waiting time of the request in the queue and $t^{computation}_{FN_i}$ is the time to process the request to find the result. Then the result is transferred to the IoT device $iot_i$ with a time of $t_{FN_i-iot_i}$:
$$t_{FN_i-iot_i} = t_{FN_i-GT} + t_{GT-BS} + t_{BS-iot_i},$$
where $t_{FN_i-GT}$ is the time to send the result from $FN_i$ to GT, $t_{GT-BS}$ is the time to send the result from GT to BS, and $t_{BS-iot_i}$ is the time to send the result from BS to $iot_i$. Therefore, the total service time (TST) is calculated as follows:
$$TST_{iot_i} = t_{iot_i-FN_i} + t^{processing}_{FN_i} + t_{FN_i-iot_i}.$$ □

Figure 5: DNMLP architecture (input layer, hidden layers, output layer).
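The proof of Theorem 1 describes the total service time as the request time plus the processing time plus the response time, each built from the per-hop times it defines. A minimal sketch of that sum is shown below; the class and function names are hypothetical and the time values are arbitrary illustrative numbers in milliseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkTimes:
    """One-way transfer times over the three hops iot_i <-> BS <-> GT <-> FN_i."""
    iot_bs: float
    bs_gt: float
    gt_fn: float

    def total(self) -> float:
        return self.iot_bs + self.bs_gt + self.gt_fn


def total_service_time(request: LinkTimes, queuing: float,
                       computation: float, response: LinkTimes) -> float:
    """TST = request time + (queuing + computation) + response time."""
    processing = queuing + computation
    return request.total() + processing + response.total()


# Illustrative values in milliseconds (not measurements from the paper).
tst = total_service_time(
    request=LinkTimes(iot_bs=2.0, bs_gt=1.0, gt_fn=1.0),
    queuing=3.0,
    computation=4.0,
    response=LinkTimes(iot_bs=2.0, bs_gt=1.0, gt_fn=1.0),
)
```

With these numbers the request and response each take 4 ms and processing takes 7 ms, so `tst` is 15 ms.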

Attack Detection.
The previous step described how a DLM is selected and installed in the fog nodes. Afterward, the IoT devices start communicating with the fog nodes in proximity to obtain services. After the communication, the DI-ADS scheme predicts the behavior of the nodes from the recorded behavior. Because the $CM_{FN_i}$ of a fog node is enabled with the DLM, it can predict the behavior of the IoT devices (DDoS attack or normal). After classification, the updated behavior of the node is sent to the cloud $C$ for storage and update. Algorithm 3 shows the classification of IoT device behavior by the fog node.

Theorem 2.
The behavior detection time (BDT) of an IoT device $iot_i$ is represented as $BDT_{iot_i}$.

Proof.
There are $n$ behaviors for $n$ IoT devices in the queue. So, the $BDT_{iot_i}$ of an IoT device $iot_i$ at a fog node $FN_i$ is calculated as
$$BDT_{iot_i} = t^{queuing}_{b_i} + t^{prediction},$$
where $t^{queuing}_{b_i}$ is the time a behavior of an IoT device waits in the queue of $FN_i$ to be processed and $t^{prediction}$ is the time required by $FN_i$ to detect the behavior of the IoT device. □

Cloud Update.
After the behavior of an IoT device is sent from a fog node $FN_i$ to the cloud $C$, the cloud node receives the behavior and updates it in the IoT device information table. This table is updated every time a response is received from any fog node $FN_i$. Algorithm 4 shows the steps to update the behavior of IoT devices in the local memory of the cloud.

Fog Node Update.
In this stage, the cloud $C$ sends the $Target_{b_i}$ of the IoT devices to the FNs through the communication channels $C$ to $GT_C$, $GT_C$ to BS, BS to $GT_i$, and $GT_i$ to $FN_i$ in order to update the local tables at the FNs nearer to the BS. In the future, any communication between neighbouring FNs is performed only after the behavior has been verified against the local table. If a node is found to be an attacker, further communication with that neighbouring node in the network is stopped. Algorithm 5 shows the network update at the FN.
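The cloud update and fog node update stages can be pictured as two lookup tables: the cloud keeps the latest label per node, pushes it to the fog nodes, and each fog node consults its local table before communicating. The sketch below is a simplified illustration under assumptions, not Algorithms 4 and 5 themselves: the class names are hypothetical, the network hops (GT and BS) are omitted, and treating an unknown node as normal is an assumption.

```python
NORMAL, ATTACKER = 0, 1


class FogNode:
    def __init__(self, name):
        self.name = name
        self.local_table = {}  # node id -> last known label

    def may_communicate(self, node_id):
        """Allow communication unless the local table marks the node as an attacker.

        Unknown nodes are treated as normal here; this is an assumption.
        """
        return self.local_table.get(node_id, NORMAL) != ATTACKER


class CloudNode:
    def __init__(self):
        self.device_table = {}  # node id -> latest label reported by a fog node

    def update(self, node_id, target):
        """Cloud update: store the label received from a fog node."""
        self.device_table[node_id] = target

    def broadcast(self, fog_nodes):
        """Fog node update: copy the current labels into every local table."""
        for fn in fog_nodes:
            fn.local_table.update(self.device_table)


# Illustrative run with hypothetical node identifiers.
cloud = CloudNode()
fn1, fn2 = FogNode("FN1"), FogNode("FN2")
cloud.update("node_7", ATTACKER)
cloud.update("node_3", NORMAL)
cloud.broadcast([fn1, fn2])
```

After the broadcast, both fog nodes refuse communication with `node_7` and still accept `node_3`.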

Theorem 3. The total time the cloud $C$ takes to update the FN about attacker behavior at any time $t$ is represented by TUD (time to update device layer about attackers).
Proof. Let, at time $t$, the set of predicted behaviors for $m$ IoT devices be represented as $Target_{b_1}, Target_{b_2}, \ldots, Target_{b_m}$. This set is sent as a message $M$ to the FNs in the whole network. For this, the cloud node $C$ sends $M$ to $FN_i$ in TUD time, which is calculated as follows:
$$TUD = t_{C-GT_C} + t_{GT_C-BS} + t_{BS-GT_i} + t_{GT_i-FN_i},$$
where $t_{C-GT_C}$ is the time to send the message $M$ from $C$ to $GT_C$, $t_{GT_C-BS}$ is the time to send the message $M$ from $GT_C$ to BS, $t_{BS-GT_i}$ is the time to send the message $M$ from BS to $GT_i$, and $t_{GT_i-FN_i}$ is the time to send the message $M$ from $GT_i$ to $FN_i$.

Performance Evaluation
The performance of the proposed framework is evaluated using Python 3. The machine used for the evaluation runs Windows 10 with a Core i7-11370 processor at 3.30 GHz and 16 GB of RAM. The DNMLP model used in the present framework is compared with the LSTM model and with conventional ML models, namely SVM, KNN, LR, and RF. The performance is evaluated using the following performance parameters:
(1) CA (classification accuracy): the proportion of predictions that are correct among all observed values:
$$CA = \frac{TP + TN}{TP + TN + FP + FN},$$
where TP is the true positive, TN is the true negative, FP is the false positive, and FN is the false negative.
(2) F1-score: the harmonic mean of precision and recall, which gives a better view of accuracy:
$$F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}.$$
(3) Precision: the proportion of instances classified as a particular positive class that actually belong to that class:
$$Precision = \frac{TP}{TP + FP}.$$
(4) Recall: the proportion of actual instances of a particular class that are correctly classified:
$$Recall = \frac{TP}{TP + FN}.$$
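The four performance parameters listed above follow directly from the confusion-matrix counts. A small sketch using the standard definitions is shown below; the function name is hypothetical, the counts are invented for illustration, and returning 0.0 when a denominator is zero is a convention chosen here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute CA, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts (not results from the paper).
metrics = classification_metrics(tp=8, tn=10, fp=2, fn=0)
```

Here the model finds every attack (recall 1.0) but raises two false alarms, so precision is 0.8 while accuracy is 0.9.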

Simulation Setup for DDoS Attack Detection Using DLMs.
First, the DNMLP and LSTM DL models and the RF, LR, KNN, and SVM ML models are implemented in Python 3 to find the model with the highest accuracy. The DDoS attack SDN dataset [13] described in Section 5.2.4 is used for prediction. For the DNMLP and LSTM deep learning models, the Keras package on TensorFlow is installed in Anaconda. The RF, LR, KNN, and SVM models are implemented using the scikit-learn (sklearn) package, which is also used for preprocessing and for computing the performance evaluation metrics. Accuracy and loss graphs are produced with the matplotlib package. Using the DDoS-SDN dataset [13], we trained and evaluated the DNMLP and LSTM DLMs and the ML models for binary classification (normal or attacker) to assess their capability for attack detection. The dataset contains a total of 23 features including the target label. To predict the attacks, we considered 14 features and discarded 8, because some of the features contain zeros and some have no impact on the target label. The features at positions (1, 2, 3, 4, 17, 20, 21, and 22) were removed from the dataset. Removing these features reduces the computational overhead and ensures that the model is trained only on necessary data. The dataset, whose features vary widely in magnitude, is scaled using the standardization technique and partitioned into training and test subsets in the ratio 80 : 20. The aim of the 80 : 20 partition is to train the model with adequate information and to validate the resulting model with appropriate information.
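The three operations described in this paragraph (dropping the discarded feature columns, standardizing, and splitting 80 : 20) can be sketched with the standard library alone. This is an illustration under assumptions: the helper names are hypothetical, the toy data is invented, column positions are treated as zero-based Python indices (the paper does not say whether its feature list is zero- or one-based), and the shuffle seed is arbitrary. The paper's own implementation uses scikit-learn.

```python
import random
import statistics


def drop_columns(rows, drop_idx):
    """Remove the columns at the given zero-based positions from every row."""
    drop = set(drop_idx)
    return [[v for i, v in enumerate(row) if i not in drop] for row in rows]


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-score)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # avoid division by zero
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def train_test_split(rows, labels, test_ratio=0.2, seed=0):
    """Shuffle indices and split them into training and test subsets."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(len(idx) * (1 - test_ratio)))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


# Toy data: 10 rows, 3 columns; the first column is discarded.
data = [[float(i), float(i), float(2 * i)] for i in range(10)]
labels = [i % 2 for i in range(10)]
kept = drop_columns(data, [0])
scaled = standardize(kept)
x_train, y_train, x_test, y_test = train_test_split(scaled, labels)
```

The sketch follows the order given in the text, scaling first and splitting afterwards; fitting the scaling statistics on the training subset only is a common alternative that avoids leaking test information.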
In the proposed framework, to obtain the best-trained DNMLP model, we used a batch size of 10 with 40 epochs and the Adam optimizer for binary classification. Keras on TensorFlow is used to construct the DNMLP network for the DDoS-SDN dataset, considering 14 input values and 1 output value. One categorical attribute, namely, protocol, is one-hot encoded. In this model, we used two hidden layers with a dimension of 16. The network is created with the Keras Sequential model, which passes the output of each layer as input to the next layer; layers are appended with the add function. To specify a fully connected layer, we passed a Dense layer from the Keras package to the add function, taking 16 input dimensions and producing 16 output dimensions with the ReLU activation function. In the last layer, the output dimension is 1, and the sigmoid activation function is used to obtain the binary classification.

Fragment of Algorithm 3 (classification of IoT device behavior by the fog node):
Input: DLM Selected, $FN_1, FN_2, \ldots, FN_n$, $b_i$
Output: Update cloud $C$
for $FN_1$ to $FN_n$ do
    for all IoT devices of $FN_i$ do ⊳ $i = 1, 2, 3, \ldots, n$
        $Target_i$ = DLM Selected($b_i$); ⊳ Target is the label assigned to the behavior of the IoT device.

Fragment of Algorithm 5 (network update at FN):
for all FN near BS do
    Update LocalTable($Target_{b_i}$);
end for
if communication starts between $FN_i$ and $FN_j$ then
    if $FN_j$ == DDoS attacker then ⊳ $FN_i$ checks its own local table.
        No communication;
    else
        Communication occurs;
    end if
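The DNMLP configuration described in this subsection (two hidden Dense layers of 16 units with ReLU, one sigmoid output unit, Adam, batch size 10, 40 epochs) could be written in Keras roughly as follows. This is a hedged sketch, not the authors' code: `DNMLP_CONFIG`, `build_dnmlp`, and `train_dnmlp` are hypothetical names, the binary cross-entropy loss is an assumption (the text does not name the Keras loss), and the number of input features is left as a parameter because the text mentions both 14 input values and 16 input dimensions.

```python
DNMLP_CONFIG = {
    "hidden_units": (16, 16),       # two hidden layers of dimension 16
    "batch_size": 10,
    "epochs": 40,
    "optimizer": "adam",
    "loss": "binary_crossentropy",  # assumption: not stated in the text
}


def build_dnmlp(n_features, config=DNMLP_CONFIG):
    """Build the Sequential model; TensorFlow is imported lazily so the
    configuration above can be inspected without it being installed."""
    from tensorflow import keras

    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for units in config["hidden_units"]:
        model.add(keras.layers.Dense(units, activation="relu"))
    model.add(keras.layers.Dense(1, activation="sigmoid"))  # binary output
    model.compile(optimizer=config["optimizer"], loss=config["loss"],
                  metrics=["accuracy"])
    return model


def train_dnmlp(model, x_train, y_train, x_val, y_val, config=DNMLP_CONFIG):
    """Fit with the batch size and epoch count given in the text."""
    return model.fit(x_train, y_train, validation_data=(x_val, y_val),
                     batch_size=config["batch_size"], epochs=config["epochs"])
```

The history object returned by `fit` holds the per-epoch training and validation accuracy and loss, which is the kind of data the accuracy and loss curves are drawn from.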
For LSTM, we used a learning rate of 0.001 and a batch size of 64 with 40 epochs on the Adam optimizer. The network has an input layer with 16 input dimensions and a 16-dimensional output space, two hidden layers with 16 input dimensions and 16 output dimensions, and one output layer using the sigmoid activation function. In [1], it is observed that LSTM obtains higher accuracy as the number of hidden layers increases, using a batch size of 128 with 100 epochs and learning rates of 0.01 and 0.001.
The scikit-learn (sklearn) package is a widely used Python library for ML. It models data with both supervised and unsupervised ML algorithms. Here we considered the supervised ML algorithms RF, LR, KNN, and SVM. To build the RF ensemble classifier, we used sklearn.ensemble.RandomForestClassifier with the parameters max_depth and random_state. Setting max_depth to 5 limits each tree to a maximum depth of 5 from root to leaf; random_state can be either an int or None, and we set it to 0. By default, n_estimators is 100, so the forest contains 100 decision trees. LR, which is also used for binary classification, is imported as sklearn.linear_model.LogisticRegression; by default it uses the lbfgs solver for the optimization problem. SVM is implemented with sklearn.svm.LinearSVC, which is well suited to larger datasets; note that LinearSVC uses a linear kernel, whereas the RBF kernel is the default of sklearn.svm.SVC. KNN is imported as sklearn.neighbors.KNeighborsClassifier, whose n_neighbors parameter is an int with a default value of 5 and gives the number of neighbours. In this work, based on RMSE, we found that K = 1 gives the highest accuracy among the K values tried.
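The four scikit-learn classifiers and the parameter values named in this paragraph can be collected in one place as sketched below. This is only an illustrative arrangement: `ML_MODEL_PARAMS` and `build_ml_models` are hypothetical names, parameters not mentioned in the text are left at their library defaults, and scikit-learn is imported lazily inside the function.

```python
ML_MODEL_PARAMS = {
    "RF": {"max_depth": 5, "random_state": 0, "n_estimators": 100},
    "LR": {"solver": "lbfgs"},
    "SVM": {},                   # LinearSVC with library defaults
    "KNN": {"n_neighbors": 1},   # K = 1 as reported in the text
}


def build_ml_models(params=ML_MODEL_PARAMS):
    """Instantiate the four classifiers; requires scikit-learn to be installed."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import LinearSVC

    return {
        "RF": RandomForestClassifier(**params["RF"]),
        "LR": LogisticRegression(**params["LR"]),
        "SVM": LinearSVC(**params["SVM"]),
        "KNN": KNeighborsClassifier(**params["KNN"]),
    }
```

Each returned estimator exposes the usual `fit(X, y)` and `predict(X)` methods, so the same training and evaluation loop can be applied to all four.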

Results and Discussion.
The performance of DNMLP on the DDoS-SDN dataset [13] in terms of training and validation accuracy is shown in Figure 7. For binary classification with a batch size of 10, the training and validation accuracy increase with the number of epochs, and the best accuracy, 99.55%, is obtained at the 34th epoch in the experimental evaluation. From Figure 8, the training and validation loss decrease and converge at the 40th epoch. When the batch size is increased to 64, the model shows a slight decrease in all performance metrics at the 40th epoch; however, as the number of epochs increases, the validation accuracy increases and the validation loss decreases, giving good-fit learning curves without underfitting or overfitting.

The training and validation accuracy on the DDoS-SDN dataset using LSTM is shown in Figure 9. For binary classification with a batch size of 64, the training and validation accuracy increase with the number of epochs, and the best accuracy is obtained at the 28th, 33rd, and 39th epochs, as found from the experimental evaluation. From Figure 10, the training and validation loss decrease and converge after the 40th epoch for a batch size of 64.

Figure 11 compares the performance metrics of the two DLMs and the four ML models on the DDoS-SDN dataset. In terms of accuracy, precision, recall, and F1-score, DNMLP performs better than LSTM and the considered ML models. The accuracies of DNMLP, LSTM, and the ML models for binary classification are shown in Figure 12, where the DNMLP model ranks higher than all other models.
Using the DDoS-SDN dataset, we trained and evaluated DNMLP, LSTM, and the ML models for binary classification and found that DNMLP shows better accuracy than all other models in predicting the behavior of an IoT device as attacker or normal. Accuracy is only a basic measure for evaluating a model and is reliable mainly when the dataset is balanced. In this work, the dataset is imbalanced: 60.92% of the samples are normal users and 39.08% are malicious users. We also used an 80 : 20 train and test split. So, there may be a chance of more false positives (FP) than false negatives (FN). In such cases, it is better to also account for other performance metrics such as precision, recall, and F1-measure.
Recall considers only false negatives and true positives (TP) and thus may be high. Precision considers only FP and TP and may therefore suffer from a low value. Here the F1-score, the harmonic mean of precision and recall, is important in deciding the performance of the model, as is evident from the results, where DNMLP shows the highest F1-score (99.30%).
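The relation between these metrics can be illustrated with a short sketch that computes them from confusion-matrix counts. The function name and all counts are illustrative assumptions; the first example only scales the 60.92% / 39.08% class split to 10,000 hypothetical samples to show why accuracy alone is misleading on this dataset.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts.

    The attacker class is treated as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# A trivial model that labels every sample "normal": with 6092 normal and
# 3908 attacker samples (hypothetical totals matching the class split),
# accuracy looks acceptable while recall and F1 are zero.
always_normal = classification_metrics(tp=0, fp=0, fn=3908, tn=6092)

# Hypothetical counts for a model that does detect attackers.
example = classification_metrics(tp=90, fp=10, fn=30, tn=870)
```

Because the harmonic mean is dominated by the smaller of its two arguments, a model cannot reach a high F1-score unless both precision and recall are high.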
To test scalability, the simulation environment is set up as a three-tier architecture in which 1 cloud server is connected to multiple fog nodes. In the last layer, we assume that 10-100 IoT devices are connected to the nearest fog nodes. For example, with 1 fog node and 10 IoT devices, all 10 IoT devices connect directly to that fog node; if there is more than 1 fog node, the IoT devices are divided equally among the fog nodes. So, with 2 fog nodes, each fog node serves 5 IoT devices. In this scenario, we assume that each IoT device communicates with its fog node and generates 1 sample (row). This sample is then processed at the fog node for behavior prediction (normal/attack). Behavior detection time (BDT) is the time taken by the fog nodes to classify the IoT devices as normal or attacker using DNMLP or LSTM. The average BDT from this experiment is 0.0004 seconds with DNMLP and 0.001 seconds with LSTM for 10 IoT devices/samples. With this experiment, we tested how the BDT changes as the number of IoT devices increases relative to the number of fog nodes. Table 2 shows the parameters and values for the network simulation.
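The setup above can be sketched as follows: devices are split among the fog nodes, and the time a fog node needs to classify its samples is measured. The round-robin split for counts that do not divide evenly, the function names, and the threshold rule standing in for DNMLP or LSTM are all assumptions made for illustration.

```python
import time


def assign_devices(n_devices, n_fog_nodes):
    """Split device ids among fog nodes as evenly as possible (round-robin)."""
    groups = [[] for _ in range(n_fog_nodes)]
    for device_id in range(n_devices):
        groups[device_id % n_fog_nodes].append(device_id)
    return groups


def behavior_detection_time(classify, samples):
    """Return (elapsed seconds, labels) for classifying one fog node's samples."""
    start = time.perf_counter()
    labels = [classify(sample) for sample in samples]
    elapsed = time.perf_counter() - start
    return elapsed, labels


def placeholder_classifier(sample):
    # Stand-in for a trained model (assumption): 1 = attacker, 0 = normal.
    return 1 if sample > 0.5 else 0


groups = assign_devices(n_devices=10, n_fog_nodes=2)  # 5 devices per fog node
samples = [0.1 * device_id for device_id in groups[0]]  # one sample per device
elapsed, labels = behavior_detection_time(placeholder_classifier, samples)
```

Measured times depend on the hardware and the model, so this sketch is not expected to reproduce the reported values.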
From Figures 13-17, it is observed that the behavior detection time increases as the number of IoT devices in the network increases. Figures 13, 14, 15, 16, and 17 show the results when the number of fog nodes in the network is 1, 3, 5, 7, and 9, respectively, and the number of IoT devices increases from 10 to 100. In each of these figures, DNMLP shows less behavior detection time than LSTM. For 100 IoT devices, the average behavior detection time is 0.022 secs for DNMLP and 0.055 secs for LSTM with 1 fog node; 0.0073 secs and 0.018 secs with 3 fog nodes; 0.0044 secs and 0.011 secs with 5 fog nodes; 0.003142 secs and 0.007857 secs with 7 fog nodes; and 0.0024 secs and 0.0061 secs with 9 fog nodes. From all these results, it is also concluded that the behavior detection time reduces when the number of fog nodes in the network increases.
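The reported times for 100 IoT devices are consistent with the single-fog-node time divided by the number of fog nodes. The sketch below checks this arithmetic on the values quoted in the text; the inverse scaling is an observation about these numbers under the assumption of an even split, not a model stated in this work, and the function names are hypothetical.

```python
# Average BDT in seconds for 100 IoT devices, copied from the text:
# number of fog nodes -> (DNMLP, LSTM)
REPORTED_BDT = {
    1: (0.022, 0.055),
    3: (0.0073, 0.018),
    5: (0.0044, 0.011),
    7: (0.003142, 0.007857),
    9: (0.0024, 0.0061),
}


def scaled_bdt(single_node_bdt, n_fog_nodes):
    """Assumed model: devices are split evenly, so time shrinks by 1/n."""
    return single_node_bdt / n_fog_nodes


def max_relative_gap(column):
    """Largest relative gap between reported and scaled BDT (0 = DNMLP, 1 = LSTM)."""
    base = REPORTED_BDT[1][column]
    gaps = []
    for n_fog_nodes, row in REPORTED_BDT.items():
        expected = scaled_bdt(base, n_fog_nodes)
        gaps.append(abs(row[column] - expected) / expected)
    return max(gaps)


dnmlp_gap = max_relative_gap(0)
lstm_gap = max_relative_gap(1)
```

For both models the gap stays below 2%, which is within the rounding of the reported figures.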

Conclusion
In this paper, we designed the DI-ADS scheme to detect DDoS attacks in fog-based IoT systems using DLMs. First, the network is set up with three layers: the IoT device layer, the fog layer, and the cloud layer. Then the DNMLP and LSTM DL models and the RF, LR, KNN, and SVM ML models are evaluated to find the model with the highest accuracy for deployment at the fog nodes. The result shows 99.44% accuracy using the DNMLP model, and hence DNMLP is deployed in the fog layer, with each fog node running DNMLP. It performs binary classification into two classes, 1 and 0, for attacker and normal devices, respectively, and sends the device behavior to the cloud for an update. Then the cloud sends the attacker information to the IoT device layer, so that each device knows the attacker status of the devices in its neighbourhood. Each IoT device decides on further communication with these attacker nodes by verifying their current behavior status. This model offers a better scheme for securing the fog layer from DDoS attacks. In future, we will implement the same scheme for attack detection using new DLMs, hybrid models, and unsupervised learning such as Deep Belief Networks (DBNs), training the fog nodes with a larger dataset and with newer datasets; multiclass classification can also be performed to detect particular attacks.
In the current work, the DNMLP model shows better accuracy than the other considered models, which have implementation-level limitations in obtaining the performance metrics. In future, it would be appropriate to train the DL models with different batch sizes and learning rates; increasing the number of epochs may also help in achieving a better performance benchmark. It is also possible to get better results by refining the data preprocessing. Beyond this, heuristic algorithms could be adopted for optimizing the DL models.

Data Availability
Data will be available on request and will be provided by the first author of this manuscript.

Consent
Not Applicable.

Conflicts of Interest
There are no conflicts of interest.