Artificial Intelligence-Based Security Protocols to Resist Attacks in Internet of Things

The use of the Internet of Things (IoT) in industrial and scientific domains is steadily increasing. IoT systems are currently deployed in numerous applications across different domains, such as communication technology, environmental monitoring, agriculture, medical services, and manufacturing. However, from a security perspective, IoT systems are vulnerable to various intrusions and attacks. It is therefore essential to build an intrusion detection model that detects the different attacks and anomalies that continually occur in the network and secures the network against them. In this paper, an anomaly detection model for IoT networks using a deep neural network (DNN) with the chicken swarm optimization (CSO) algorithm is proposed. DNNs have demonstrated their efficiency in the different fields to which they are applicable. Deep learning is a class of machine learning algorithms that uses many layers to progressively extract higher-level features from raw inputs. The UNSW-NB15 dataset was used to evaluate the anomaly detection model. The proposed model obtained 94.85% accuracy and a 96.53% detection rate, which is better than the compared techniques GA-NB, GSO, and PSO. The DNN-CSO model performed well in detecting most of the attacks and is suitable for detecting anomalies in IoT networks.

change in the condition of a system beyond its global or local norm. This description contains a number of significant observations about the nature of IoT data: (i) most of the data collected by an IoT system can be regarded as "normal" since it reflects the typical operating characteristics of that particular system; (ii) the definition of a system's "normal" operation can change over time for a number of reasons; and (iii) the data produced by an IoT deployment reflects only the actual processes governing the monitored system [4]. As shown in Figure 1, IoT networks consist of low-cost sensors deployed over a wide region in three types of formats: (1) centralized networks consisting of several, (2) decentralized networks, and (3) blockchain-based distributed networks. The sensors in these IoT networks play an important role in ensuring the overall efficiency of the IoT network [5].
Real-world datasets contain instances that differ from every other instance; these are called anomalies. Anomaly identification aims to find those instances whose behavior is deemed abnormal relative to normal nodes. Intrusions, fraud, and data leakage are separate causes of anomalies. Anomaly detection is used in a number of IoT application areas, as presented in Table 1 [6][7][8].
1.1. Intrusion Detection. IoT devices are connected to the Internet and remain susceptible to security attacks. Incidents such as Denial-of-Service (DoS) and distributed DoS (DDoS) attacks cause significant damage to the network. A major problem in IoT applications is identifying and protecting against such attacks, which are listed in Table 1.

1.2. Fraud Detection. IoT networks are also vulnerable during logins or online purchases, which can result in the theft of credit card details, bank data, or other sensitive information.
1.3. Data Leakage. Sensitive data from file servers, databases, and other data sources could leak to an external party, which not only causes data loss but also creates a threat that could compromise confidential system data. Suitable encryption mechanisms can help prevent such leaks.
Anomalies may be classified as point-wise, collective, or contextual. Point-wise anomalies are individual points that deviate substantially from the remaining data points; this form is used when the evolution of the series is not linear and is typically applied to fraud detection.
Collective anomalies are identified from patterns in the time series, such as repeated patterns or shapes, across several IoT devices. For example, a single shipping delay in a supply chain is quite normal, but multiple delays may call for investigation and a collective study. Contextual anomalies are identified by taking into account prior information or context, such as the day of the week. Contexts are always specific to a particular domain [9].
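The three anomaly forms can be illustrated with a minimal Python sketch. It is not the detection method of this paper: the z-score rule, the threshold of 3, the run length of 3, and all function names are assumptions chosen only to make the distinction concrete.

```python
from collections import defaultdict
from statistics import mean, pstdev


def point_anomalies(values, threshold=3.0):
    """Indices whose z-score against the whole series exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def contextual_anomalies(values, contexts, threshold=3.0):
    """Indices that are unusual within their own context (e.g. day type)."""
    groups = defaultdict(list)
    for i, c in enumerate(contexts):
        groups[c].append(i)
    flagged = []
    for idxs in groups.values():
        local = point_anomalies([values[i] for i in idxs], threshold)
        flagged.extend(idxs[j] for j in local)
    return sorted(flagged)


def collective_anomaly(flags, min_run=3):
    """True if at least min_run consecutive events are flagged (e.g. delays)."""
    run = 0
    for flagged in flags:
        run = run + 1 if flagged else 0
        if run >= min_run:
            return True
    return False


# Hypothetical readings: high load is normal on weekdays, unusual on weekends.
readings = [100.0] * 20 + [10.0] * 11 + [100.0]
day_type = ["weekday"] * 20 + ["weekend"] * 12
```

In this toy data the last reading is unremarkable against the whole series but stands out once the weekend context is taken into account, which is the difference between the point-wise and contextual views.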
As shown in Figure 2, the first step is to understand the type of dataset collected. The next step is to determine the type of anomaly (i.e., point, contextual, or collective) from a predefined collection. The last step is to establish what training data is available for developing the anomaly detection model [10]. The novel contributions of this paper are as follows: (i) An anomaly detection model for security attack detection using a DNN with the CSO algorithm is presented. In this work, the CSO algorithm serves as the optimization algorithm for tuning the model's performance. (ii) Deep learning, a class of machine learning algorithms that progressively extracts high-level features from raw inputs using many layers, is employed. The UNSW-NB15 dataset is used to assess the anomaly detection model. This introduction has discussed the anomaly detection process in IoT and the concept of the proposed model. The remaining sections are organized as follows: Section 2 discusses the relevant works on IoT anomaly detection, Section 3 discusses the proposed methodology, Section 4 presents the performance analysis of the proposed model, and Section 5 presents the conclusion of the work.

Related Works
Bagaa et al. proposed a machine-learning-based security system for IoT. The system leverages both Network Function Virtualization (NFV) and Software-Defined Networking (SDN) enablers to mitigate various threats, and it copes automatically with the expanding security aspects of the IoT domain. A distributed data mining system, supervised learning, and a neural network were used to develop the intrusion detection model. The NSL-KDD dataset was used for evaluation, and a one-class SVM technique was used to detect the attacks, achieving better detection accuracy. Overall, the performance was good and the results were appropriate for this intrusion detection model [1].
Lawal et al. used different classification techniques, namely k-NN, J48, and Naïve Bayes (NB), to classify different attacks in an IoT intrusion detection model. The UNSW-NB15 dataset was used for training and testing. The performance of the J48, k-NN, and NB classifiers was analyzed on this dataset using the WEKA application. The results showed that k-NN achieved better accuracy and a low false-positive (FP) rate in distinguishing abnormal from normal traffic, whereas J48 outperformed NB and k-NN in classifying the attack classes [2].
Hoang and Nguyen proposed an anomaly detection model for IoT network traffic using the PCA method, which was used to reduce the high dimensionality of the data. A new distance formula was proposed and used to derive formulas from past works. Based on those derivations, a new technique for anomaly detection in network traffic was implemented; using the new distance formula, it obtained appropriate results while reducing the computational overhead [3].
Sharmat et al. developed an anomaly detection model for IoT networks using machine learning. Artificial neural network (ANN) and logistic regression (LR) techniques were used for classification, and a Kaggle dataset was used for performance evaluation. It was concluded that ANN was better than LR in case 1 and that both performed similarly in case 2 [11].
Fahim and Sillitti proposed a hybrid learning approach to anomaly detection using clustering and classification techniques. Hierarchical Affinity Propagation (HAP) was used for clustering, and the CART decision tree classifier was used for classification. The model groups the data into anomaly and normal clusters using HAP clustering. The labeled data acquired from the clustering stage is then used to train CART and to classify future unseen data. The model was able to automate data labeling, which is an advantage because it reduces human intervention [12].
Some researchers have used deep learning methods to detect network anomalies. The study in [13] compared classification results across methods, including deep learning, and found that the deep learning technique performed better. However, it considered only the classification of PortScan and regular network traffic. A real network environment has many more than two kinds of traffic, which makes identification more challenging.
Although signature-based techniques have high detection accuracy and fast detection speed, they are ineffective at detecting unknown network traffic. In comparison, anomaly-based methods are more adaptable and generalizable, and they

Proposed Methodology
An anomaly detection model that uses a DNN with CSO is proposed for IoT networks. DNNs have demonstrated their effectiveness in numerous fields relevant to this application. As illustrated in Figure 3, deep learning progressively extracts high-level features from the raw input using multiple layers. The UNSW-NB15 dataset was used to evaluate the proposed model. Integrating homogeneous neural network classifiers results in a hybrid DNN-CSO model. The ensemble of classifiers is created by changing the activation of the neural network's weights and by varying the input features.

3.1. Deep Neural Network. The multilayer feed-forward DNN is trained with the backpropagation technique. Backpropagation uses supervised learning: the network is presented with inputs and the outputs it should compute, and the error is then calculated. Training starts with random weights, and the goal is to adjust them so that the error is minimized. A neuron's weighted sum (its activation) is calculated as in Equation (1), where each input $U_j$ is multiplied by its respective weight $v_{ij}$ and the products are summed. The activation depends only on the inputs and the weights. If the output function is the identity, the neuron is linear. The output function used here is the sigmoid, Equation (2).
The error depends on the weights, which must be modified to reduce it. The error function for each output neuron, Equation (3), is always positive; it is larger when the difference between the output and the required target is large and smaller when the difference is small. The network error is simply the sum of the errors of all neurons in the output layer, Equation (4), where $R_i$ is the actual output and $d_i$ is the target output. Once this is found, the weights are adjusted using the gradient descent method, Equation (5). This equation can be interpreted as follows: the change in each weight is the negative of a constant η, the learning rate, multiplied by the dependence of the network error on that weight, which is the derivative of $F$ with respect to $v_{ij}$.
The size of the adjustment depends on η and on the contribution of the weight to the error of the function. That is, if a weight contributes a great deal to the error, its adjustment is larger than if it contributes little. Equation (5) is applied repeatedly until weights that yield a minimal error are found.
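The role of η can be seen in a small Python sketch of this update for a single sigmoid neuron. It assumes a squared error between output and target, for which the derivative with respect to each weight is 2(R - d)R(1 - R)U_j; the function names, the OR data, the learning rate, and the epoch count are illustrative choices and are not taken from the paper's MATLAB implementation.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def neuron_output(u, v):
    """Sigmoid of the weighted sum of inputs u with weights v."""
    return sigmoid(sum(uj * vj for uj, vj in zip(u, v)))


def weight_update(u, v, d, eta):
    """Gradient-descent change for each weight of one sigmoid neuron.

    Assumes the squared error F = (R - d)^2, so that
    dF/dv_j = 2 (R - d) R (1 - R) U_j.
    """
    r = neuron_output(u, v)
    return [-eta * 2.0 * (r - d) * r * (1.0 - r) * uj for uj in u]


def train(samples, eta=0.5, epochs=2000, seed=0):
    """Start from random weights and repeat the update to shrink the error."""
    rng = random.Random(seed)
    v = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]
    for _ in range(epochs):
        for u, d in samples:
            v = [vj + dv for vj, dv in zip(v, weight_update(u, v, d, eta))]
    return v


def total_error(samples, v):
    """Sum of squared output errors over all samples."""
    return sum((neuron_output(u, v) - d) ** 2 for u, d in samples)


# Logical OR as toy data; the last input is a constant 1.0 acting as a bias.
OR_SAMPLES = [([0.0, 0.0, 1.0], 0.0), ([0.0, 1.0, 1.0], 1.0),
              ([1.0, 0.0, 1.0], 1.0), ([1.0, 1.0, 1.0], 1.0)]
weights = train(OR_SAMPLES)
```

Doubling `eta` in `weight_update` doubles every weight change, and a weight attached to a larger input receives a proportionally larger correction, which matches the description above.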
It remains to find the derivative of $F$ with respect to $v_{ij}$. This is the goal of the backpropagation algorithm, since the derivative must be obtained by working backward. First, from Equations (3) and (4), calculate how much the error depends on the output, which is the derivative of $F$ with respect to $R_i$, Equation (6).
Next, from Equations (1) and (2), calculate how much the output depends on the activation, which in turn depends on the weights, Equation (7). Combining Equations (6) and (7) gives Equation (8).

Wireless Communications and Mobile Computing
The adjustment of each weight then follows from Equations (5) and (8), giving Equation (9). To train networks with an additional layer, some further considerations are required; in particular, the training time may be affected by the network architecture [13].
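For reference, the derivation described in this subsection can be written out as below. This is a reconstruction from the surrounding definitions (inputs $U_j$, weights $v_{ij}$, outputs $R_i$, targets $d_i$, network error $F$), assuming a squared-error loss and a sigmoid output function. The activation symbol $A_i$ and the per-neuron error symbol $F_i$ are introduced here for illustration only, and the equation numbers follow the references made in the text.

```latex
\begin{align}
A_i &= \sum_{j} U_j \, v_{ij} \tag{1}\\
R_i &= \frac{1}{1 + e^{-A_i}} \tag{2}\\
F_i &= (R_i - d_i)^2 \tag{3}\\
F &= \sum_{i} (R_i - d_i)^2 \tag{4}\\
\Delta v_{ij} &= -\eta \, \frac{\partial F}{\partial v_{ij}} \tag{5}\\
\frac{\partial F}{\partial R_i} &= 2\,(R_i - d_i) \tag{6}\\
\frac{\partial R_i}{\partial v_{ij}} &= R_i\,(1 - R_i)\,U_j \tag{7}\\
\frac{\partial F}{\partial v_{ij}} &= 2\,(R_i - d_i)\,R_i\,(1 - R_i)\,U_j \tag{8}\\
\Delta v_{ij} &= -2\,\eta\,(R_i - d_i)\,R_i\,(1 - R_i)\,U_j \tag{9}
\end{align}
```

Equation (8) is the product of Equations (6) and (7) by the chain rule, and Equation (9) results from substituting Equation (8) into Equation (5).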
3.2. Chicken Swarm Optimization. CSO is a bio-inspired optimization algorithm. It imitates the hierarchical order in a chicken swarm and the behaviors of the swarm. The chicken swarm can be divided into several groups, each containing one rooster and a number of hens and chicks. Different chickens follow different laws of motion, and under a particular hierarchical order there is competition between different chickens. The behavior of the chickens is idealized by the following rules.
(1) The chicken swarm contains several groups. Each group has a dominant rooster, a few hens, and chicks.
(2) The division of the swarm into groups and the identity of each chicken (rooster, hen, or chick) depend on the fitness values of the chickens themselves. The chickens with the best fitness values act as roosters, each of which is the head of a group. The chickens with the worst fitness values are designated as chicks. The remainder are the hens. The hens randomly choose which group to live in, and the mother-child relationship between hens and chicks is also established randomly.
(3) The hierarchical order, the dominance relationship, and the mother-child relationship within a group remain unchanged. These statuses are updated only every several (G) time steps.
(4) Chickens follow the rooster of their group to look for food, while they may prevent others from eating their own food. Chickens are assumed to randomly steal good food already found by others. The chicks search for food around their mother (a hen). Dominant individuals have an advantage in the competition for food.
Roosters with better fitness values can search for food over a wider range of places. The rooster's movement is formulated in terms of the following quantities: Randn(0, σ²) is a Gaussian distribution with mean zero and standard deviation σ²; ε is a small constant used to prevent zero-division errors; k is the index of a rooster selected at random from the rooster group; and f is the fitness value of the corresponding position A. This phenomenon is formulated according to the following:

The greater the difference between the fitness values of the two chickens, the smaller S2 is and the greater the gap between the two chickens' positions. Hens would therefore not easily steal the food found by other chickens. The formula for S1 differs from that for S2 because there is competition within a group. The chicks move around their mother to search for food; in this formulation, $A^{c}_{m,j}$ represents the position of the $i$th chick's mother ($m \in [1, N]$), and FL ($FL \in (0, 2)$) is a parameter that expresses how closely the chick follows its mother in searching for food. To account for individual differences, the FL of each chick is chosen at random between 0 and 2 [14].
The mathematical model of CSO can be understood as follows: first, determine the group structure, in particular the numbers of roosters, hens, chicks, and mother hens; second, assign an identity to every chicken; third, set up the mathematical model according to the identities of the chickens and their foraging laws; and finally, set a specific interval at which the relationships among the chickens are updated. Within a group, the numbers of roosters and chicks are smaller than the number of hens, and their models are generally simple. Hens are the most numerous, and their model is the most complex in the group. Consequently, the hen model directly affects the performance of CSO [15].
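The procedure described in this subsection can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the implementation evaluated in this paper: it follows the commonly published form of the CSO update rules, treats lower fitness as better, and adds three simplifications of its own (positions are clipped to the search bounds, exponents are capped to avoid overflow, and a move is kept only if it improves the chicken's fitness). The function name `cso_minimize` and all parameter defaults are hypothetical.

```python
import math
import random


def cso_minimize(fitness, dim, bounds, n=30, n_roosters=6, n_chicks=9,
                 G=10, iters=200, seed=0):
    """Toy chicken swarm optimization; lower fitness is treated as better."""
    rng = random.Random(seed)
    lo, hi = bounds
    eps = 1e-12  # small constant that prevents division by zero

    def clip(x):
        return max(lo, min(hi, x))

    def safe_exp(x):
        return math.exp(min(x, 50.0))  # cap added in this sketch only

    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    fit = [fitness(p) for p in pos]
    history = []

    for t in range(iters):
        if t % G == 0:
            # Re-rank the swarm and rebuild the hierarchy every G steps.
            order = sorted(range(n), key=lambda i: fit[i])
            roosters = order[:n_roosters]
            hens = order[n_roosters:n - n_chicks]
            chicks = order[n - n_chicks:]
            group = {h: rng.choice(roosters) for h in hens}
            mother = {c: rng.choice(hens) for c in chicks}

        cand = {}
        for i in roosters:  # roosters take a random multiplicative step
            k = rng.choice([r for r in roosters if r != i])
            if fit[i] <= fit[k]:
                sigma2 = 1.0
            else:
                sigma2 = safe_exp((fit[k] - fit[i]) / (abs(fit[i]) + eps))
            cand[i] = [x * (1.0 + rng.gauss(0.0, sigma2)) for x in pos[i]]
        for i in hens:  # hens follow their rooster and one other chicken
            r1 = group[i]
            r2 = rng.choice([c for c in range(n) if c != i and c != r1])
            s1 = safe_exp((fit[i] - fit[r1]) / (abs(fit[i]) + eps))
            s2 = safe_exp(fit[r2] - fit[i])
            cand[i] = [x + s1 * rng.random() * (pos[r1][j] - x)
                       + s2 * rng.random() * (pos[r2][j] - x)
                       for j, x in enumerate(pos[i])]
        for i in chicks:  # chicks move around their mother, FL in (0, 2)
            fl = rng.uniform(0.0, 2.0)
            m = mother[i]
            cand[i] = [x + fl * (pos[m][j] - x) for j, x in enumerate(pos[i])]

        for i, p in cand.items():
            p = [clip(x) for x in p]
            f = fitness(p)
            if f < fit[i]:  # greedy acceptance, a simplification
                pos[i], fit[i] = p, f
        history.append(min(fit))

    best = min(range(n), key=lambda i: fit[i])
    return pos[best], fit[best], history
```

A call such as `cso_minimize(lambda p: sum(x * x for x in p), dim=3, bounds=(-5.0, 5.0))` minimizes a simple sphere function and returns the best position, its fitness, and the best fitness recorded at each iteration. Because moves are accepted only when they improve fitness, that record never increases.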

Performance Analysis
The proposed model was implemented and its performance analyzed in MATLAB 2017a on a computer with a Core i5 3.20 GHz CPU and 4 GB of RAM. The proposed approach is assessed using output parameters such as accuracy, recall, precision, F1-score, and detection rate [16][17][18][19][20]. The performance of the proposed DNN-CSO approach is compared with that of other techniques such as GA-NB, GSO, and PSO.
Initialize
repeat
    Employ and order the fitness values of the chickens using Equations (10) and (11)
    Isolate groups and select relations among chickens and hens using Equations (12), (13), and (14)
    Update the chickens' solutions until the swarm finds better solutions using Equation (15)

Description of Dataset.
The raw network packets of the UNSW-NB15 dataset were created with the IXIA PerfectStorm tool in the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) to combine real modern normal activities with synthetic modern attack behaviors. The tcpdump tool was used to capture over 100 GB of raw traffic (i.e., Pcap files). The dataset includes nine attack types: Backdoors, Analysis, Exploits, Fuzzers, Shellcode, DoS, Generic, Worms, and Reconnaissance. The Argus and Bro-IDS tools were used, and 12 algorithms were developed to produce 49 attributes overall [21]. The dataset is accessible from https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/. The dataset is divided into approximately 70% for training and 30% for testing: the training set contains 175341 instances and the testing set contains 82332 instances, covering the various attack types and normal traffic. In this analysis, only 12 of the 49 attributes were chosen. The attributes chosen were cts-srv-dsst, scrips, cts-dsst-ltsm, cts-ssrc-dsport-ltsm, cts-ssrc-ltsm, dur, cts-dsst-ssrc-ltsm, dssport, dsbytes, dsstip, protos, and iss-ftpslogins, as seen in Table 2. The traffic distribution of the dataset is shown in Table 3.
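As a quick check of the split, the instance counts given above can be turned into fractions. The short Python sketch below uses only those two counts; the helper name `split_fractions` is hypothetical.

```python
def split_fractions(n_train, n_test):
    """Return the training and testing shares of the combined instance count."""
    total = n_train + n_test
    return n_train / total, n_test / total


train_frac, test_frac = split_fractions(175341, 82332)
print(f"train: {train_frac:.1%}, test: {test_frac:.1%}")  # train: 68.0%, test: 32.0%
```

The counts therefore correspond to roughly a 68%/32% partition, close to the nominal 70%/30% split.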

Performance Metrics.
Accuracy is one measure of a model's performance and is one of the performance indicators used to assess classification approaches. It is computed as

$$\text{Accuracy} = \frac{TPV + TNV}{TPV + TNV + FPV + FNV},$$

where TPV, TNV, FPV, and FNV denote the true positive, true negative, false positive, and false negative values, respectively. Precision is defined as the positive prediction rate: the proportion of correctly predicted positive observations among all predicted positive observations,

$$\text{Precision} = \frac{TPV}{TPV + FPV}.$$

Recall is also known as sensitivity. It is the ratio of correctly predicted positive observations to all observations in the actual positive class:

$$\text{Recall} = \frac{TPV}{TPV + FNV}.$$

The detection rate (DR) measures the number of intrusion incidents detected. It reflects the number of correct positive-class predictions as a percentage of all predictions made. The F1-score is the harmonic mean of precision and recall:

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}.$$

This metric complements accuracy and is well suited to measuring detection performance on imbalanced data.
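These metrics can be computed directly from confusion-matrix counts, as in the following Python sketch. It uses the standard definitions of accuracy, precision, recall, and F1-score; the function name and the example counts are hypothetical and are not results from this paper. The detection rate is left out because its formula is not given explicitly here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a binary attack/normal classifier.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
```

With these example counts, accuracy is 170/200 = 0.85, precision is 90/110, recall is 90/100 = 0.9, and the F1-score is 180/210.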
The attack detection performance of the proposed approach was assessed and compared with existing approaches, namely Genetic Algorithm with Naïve Bayes (GA-NB), Glowworm Swarm Optimization (GSO), and Particle Swarm Optimization (PSO), as seen in Table 4. Ten class labels, comprising the attack types and the normal label, were used for attack identification. The proposed approach detected every attack label with a higher detection rate. GA-NB performed worst, while GSO and PSO were close to each other in performance, as shown in Figure 4.
The proposed approach was assessed on its identification of attacks in the input dataset using the following metrics. Accuracy is the proportion of instances detected correctly; detection rate is the ratio of attacks detected by the classifier; F1-score provides an estimate suited to unbalanced samples; recall reflects how many of the attacks the system returned; and precision refers to how many of the returned attacks were correct. To validate the proposed DNN-CSO approach, several outcome parameters were assessed, as seen in Table 5.
The proposed method's performance was assessed by accuracy, detection rate or recall, precision, and F1-score. Figure 5 shows the comparison of every performance metric. The DNN-CSO technique outperformed the other approaches on all assessment criteria, including accuracy and detection rate. DNN-CSO attained an accuracy of 94.85 percent, which was 5.6 percent to 12.5 percent greater than that of the other evaluated approaches. The proposed approach achieved a detection rate of 96.53 percent, which was 1.5 percent to 5.13 percent greater than that of the compared approaches.

Conclusion
An anomaly detection model for IoT networks using a deep neural network with the chicken swarm optimization algorithm was proposed. The DNN technique was used for feature selection and extraction from the dataset. The UNSW-NB15 dataset, which combines real modern normal activities and synthetic modern attack behaviors, was used in this model. Of the 49 features in the dataset, only 12 were selected for the performance evaluation. Ten class labels, comprising the attack types and the normal label, were used for attack identification. The proposed approach detected every attack label with a higher detection rate than the other techniques. The features of the dataset were effectively extracted by the DNN, and CSO was used to classify and detect the attacks. For performance analysis, parameters such as accuracy, recall, precision, detection rate, and F1-score were evaluated. The DNN-CSO approach obtained the best performance on every evaluation criterion, including detection rate and accuracy. DNN-CSO obtained 94.85% accuracy, which was 5.6% to 12.5% higher than that of the compared approaches. The detection rate obtained by the presented approach was 96.53%, which was 1.5% to 5.13% higher than that of the compared approaches. In the future, the proposed anomaly detection model can be used to detect various attacks using different datasets on different network platforms such as WSN, Cloud, and ad hoc networks.

Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.