Machine Learning Methods for Intrusive Detection of Wormhole Attack in Mobile Ad-Hoc Network (MANET)

A wormhole attack is a network-layer attack that targets routing protocols. In this work, classification is performed with several machine learning methods: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Linear Discriminant Analysis (LDA), Naive Bayes (NB), and Convolutional Neural Network (CNN). For feature extraction, we used the properties of nodes in the MANET, especially node speed. We collected 3997 distinct samples, of which 3781 are normal and 216 are malicious. The classification results show that the accuracy of the KNN, SVM, DT, LDA, NB, and CNN methods is 97.1%, 98.2%, 98.9%, 95.2%, 94.7%, and 96.4%, respectively. Based on our findings, the DT method achieves the highest accuracy (98.9%), followed in order by SVM, KNN, CNN, LDA, and NB.


Introduction
A MANET (Mobile Ad Hoc Network) is a collection of wirelessly interconnected, self-organizing nodes. Each node in a mobile ad hoc network functions as a router that forwards packets from the source node toward the destination node. Wireless ad hoc networks can be large and are widely used. Each mobile node is self-managed, and there is no central management node. The mobile nodes are free to move wherever they need, which allows nodes to join or leave the network quickly [1]. There is no restriction on the capacity of nodes for communication. If a connection has been established and the nodes then move outside radio range, data loss can occur. MANETs are commonly used in numerous fields, such as science, rescue operations, and the military. Cyber-attacks are also growing due to improved connectivity across networks [2]. Because of the shared broadcast channel, insecure operating environment, restricted resources, mobility, and rapidly changing topology [3], ad hoc wireless mobile networks are susceptible to many security threats.
Anomaly-based detection identifies intrusions as deviations from a system's normal behavior. Characterizing normal system behavior is demanding because system activity varies over time [4]. The anomaly approach can detect new or unknown attacks, but with high false-positive rates. A signature-based IDS detects attacks by searching network traffic for specific patterns, such as byte sequences [5]. It recognizes only known attacks and fails to recognize new attacks for which no pattern exists. In MANETs, secure connectivity is challenging due to the lack of fixed infrastructure, the complex topology, etc. Intrusion detection is a concept that complements cryptography and access control. It is deployed to automatically detect an attack that has happened or is in progress and to raise a warning. The notion of ID is embodied in several variants of intrusion detection systems, such as Host Intrusion Detection Systems (HIDS), application-based IDS, and Network Intrusion Detection Systems (NIDS). Since IDSs are passive, they do not take protective action; they only detect an intrusion and trigger an alarm [6].
A wormhole attack is a network-layer attack that targets routing protocols. Two or more malicious nodes launch a wormhole attack using a private channel called the tunnel. The wormhole tunnel then captures data packets and relays them to another location in the network. A malicious node receives a control packet at one end of the tunnel and transfers it through the private channel to a colluding node at the other end, which rebroadcasts the packet locally. The route between source and target through the private channel is preferred because it advertises better metrics, e.g., fewer hops or less time, than routes through other nodes [7].
ML emerged in the late 1950s as a component of Artificial Intelligence. Over time, it has evolved into algorithms that are efficient enough to solve different problems, such as classification, clustering, regression, and optimization, in medicine, engineering, and computer science. ML is one of the most widely used technologies today. It helps computer systems learn dynamically without human intervention and act accordingly. It builds a model by processing complex data automatically, effectively, and accurately. ML can benefit from a generalized structure to provide a general approach to improving system performance. It has many applications in scientific fields, such as manual information entry, automatic spam detection, medical diagnostics, image recognition, data cleaning, and noise reduction [8], [9]. Recent findings indicate that ML has been applied in WSNs to address several problems. Using ML in WSNs not only increases the efficacy of the system but also avoids complex tasks such as reprogramming, manually accessing vast volumes of data, and extracting useful information from data. ML methods are also beneficial for gathering vast quantities of data and producing useful information [10]. The fundamental purpose of this paper is to propose a technique for detecting wormhole attacks based on machine learning methods.

Literature review
Wireless networks are highly vulnerable to threats, and their communication links are open to attackers. In MANETs, attackers can be monitored by software modules that automatically track malicious network activity. Specific considerations apply when developing an intrusion detection method for MANETs [11]. Intrusion detection systems for MANETs operate differently from their wired counterparts, and several problems need to be tackled when developing them. Non-collaborative intrusion monitoring systems deploy node-level agents to track and record any unusual activities [12]. The most significant challenge lies in determining the placement of agents when the nodes are mobile. Similarly, the nodes hosting the intrusion detection agents require higher bandwidth, battery capacity, and processing power. In MANETs, however, these resources are restricted [13]. Maximizing the attacker detection rate with minimal resources is an NP-complete problem, and several authors have proposed algorithms that provide approximate solutions. Many intrusion detection architectures are available for MANETs [14]. As in wired networks, a wide variety of attacks can occur, some of which are more destructive in MANETs. Standard techniques for detecting attack traffic are inadequate due to the characteristics of these networks. Intrusion Detection Systems (IDSs) are based on various detection techniques, of which anomaly detection is one of the most important. Moreover, IDSs based on previous attack signatures are less effective, particularly if they are centralized. Peterson et al. focused on adding to the detection engine a recent machine learning technique that identifies attack traffic online (rather than processing and evaluating it afterward), rewriting IDS rules on the fly [15].
Amouri et al. proposed a two-level monitoring method for detecting malicious nodes in MANETs. In the first stage, dedicated sniffers operating in promiscuous mode are deployed. Each sniffer uses a decision-tree-based classifier that produces, for every reporting period, the quantity of correctly classified instances. In the second stage, the classified instances are transmitted to an algorithmically operated supernode, which computes quantities related to the cumulative fluctuation measure of the classified instances received for each node being evaluated. The resulting approach has also been extended to wireless sensor networks and is a feasible IDS scheme for those networks [16]. Abd-El-Azim et al. proposed a streamlined fuzzy-based intrusion detection method for MANETs with an automated mechanism that employs an Adaptive Neuro-Fuzzy Inference System (ANFIS) to generate a fuzzy inference system (FIS). The next step was to initialize the FIS and then use a Genetic Algorithm (GA) to optimize this initialized system. The authors reported an average improvement of 36 percent in the presence of black-hole attacks only [17].
Soni & Sudhakar proposed an Intrusion Detection System for the jamming attack. The jamming attacker initially injects packets into the network slowly, and the number of these packets then increases quickly over time. The IDS identifies attacker nodes by their unwanted flooding behavior, and the attacker's infection is thereby detected. The proposed scheme continuously tracked the actions of all nodes in the network, since the behavior of a malicious node differs from that of a regular node [18]. Sultana [19] studied a MANET in the presence of reputed packet-dropping nodes (see Table 1). The findings reveal that the neglect of reputed packet-dropping nodes by IDS algorithms is a significant issue that hurts network performance.

Wormhole attack
The wormhole attack is one of the most significant security attacks on MANETs. Many MANET routing protocols (DSR, AODV, OLSR, DSDV, etc.) can be compromised by it. A wormhole attack is launched by at least two malicious nodes using a private channel called a tunnel. The wormhole tunnel then starts to collect data packets and pass them to another location [31]. A malicious node receives a control packet at one end of the tunnel and transfers it via the private channel to a colluding node at the other end, which retransmits the packet locally. The route between source and destination through the private channel is usually chosen because of its better metrics, such as fewer hops or less time, compared with other routes.
Typically, the attack operates in two steps. In the first step, the wormhole nodes insert themselves into several routes. In the second step, packets start to flow through these malicious nodes, which can then disrupt the functionality of the network in a variety of ways [32] (see Figure 1).
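The effect of the tunnel on route selection can be illustrated with a minimal sketch. It assumes a hop-count routing metric and a hypothetical chain topology; the node names (S, A, ..., T, M1, M2) and the `hop_count` helper are illustrative only and are not taken from the simulation in this paper.

```python
from collections import deque

def hop_count(links, source, target):
    """Breadth-first search returning the fewest hops from source to target."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None  # target unreachable

# Hypothetical honest chain S - A - B - C - D - T.
normal_links = [("S", "A"), ("A", "B"), ("B", "C"), ("C", "D"), ("D", "T")]
# Malicious nodes M1 (near S) and M2 (near T) joined by a private tunnel.
wormhole_links = normal_links + [("S", "M1"), ("M1", "M2"), ("M2", "T")]

honest_hops = hop_count(normal_links, "S", "T")
wormhole_hops = hop_count(wormhole_links, "S", "T")
print(honest_hops, wormhole_hops)
```

Under these assumptions the route through the tunnel is shorter than the honest route, so a protocol that prefers fewer hops would select the path controlled by the malicious nodes.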

Support Vector Machine (SVM)
SVM is a supervised ML technique that classifies the observations in a given data set using the hyperplane that best separates them. SVM can handle both linear and non-linear problems and is useful on large datasets. In WSNs, SVM has been applied to different problems such as routing [33], localization [34], fault detection [35], congestion control [36], and communication issues [37].
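As a minimal sketch of the hyperplane idea, the following shows how a linear SVM scores a sample and how the hinge loss penalizes samples inside the margin. The weight vector `w`, bias `b`, and the sample points are hypothetical values chosen for illustration; training and the non-linear (kernel) case are not covered.

```python
def decision_value(w, b, x):
    """Signed score of sample x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def svm_predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def hinge_loss(w, b, x, y):
    """Hinge loss max(0, 1 - y * f(x)) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))

# Hypothetical hyperplane and samples (not learned from the paper's data).
w, b = [1.0, -1.0], 0.0
far_sample = [3.0, 1.0]     # score 2.0: correct side, outside the margin
near_sample = [1.5, 1.0]    # score 0.5: correct side, inside the margin

print(svm_predict(w, b, far_sample), hinge_loss(w, b, far_sample, 1))
print(svm_predict(w, b, near_sample), hinge_loss(w, b, near_sample, 1))
```

The loss is zero only for samples that lie on the correct side at a distance of at least the margin, which is what drives the choice of the separating hyperplane during training.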

K-Nearest neighbor (k-NN)
k-NN is the most popular instance-based approach for solving regression and classification problems. It relies mainly on the distance between the given sample and the samples it is compared with. Several distance functions are used in k-NN, such as the Hamming, Euclidean, Manhattan, and Chebyshev distances. This method can estimate missing samples in the feature space and reduce dimensionality. In WSN applications, k-NN has been used for data aggregation [38] and anomaly detection [39].
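The distance functions named above, and the majority vote that k-NN builds on them, can be sketched as follows. The two-feature training points and their labels are hypothetical and serve only to make the example runnable.

```python
from collections import Counter

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    """Number of positions in which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k=3, distance=euclidean):
    """Majority vote among the k training samples closest to query."""
    ranked = sorted(train, key=lambda item: distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled samples (feature vector, class label).
train = [((1.0, 1.0), "normal"), ((1.2, 0.8), "normal"), ((0.9, 1.1), "normal"),
         ((5.0, 5.0), "malicious"), ((5.2, 4.9), "malicious")]

print(knn_predict(train, (1.1, 1.0), k=3))
print(knn_predict(train, (5.1, 5.0), k=3, distance=manhattan))
```

The choice of `k` and of the distance function is a tuning decision; the values used here are arbitrary.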

Deep Learning
DL is a type of machine learning that belongs to the ANN family and uses a multilayer structure. It imitates the communication and information-processing mechanisms of the human brain and processes data for object identification, language translation, speech recognition, and decision making. In WSNs, DL is used to tackle many problems, such as anomaly and fault detection, energy harvesting, data quality estimation, and routing [40].
Security applications of deep learning models, such as Intrusion Detection Systems (IDS), malware detection, and spam filtering, have become important in data security, classification, and prediction tasks. These applications are designed to build a model that classifies and discriminates between "normal" and "malicious" samples, such as attack packets and regular packets. With the exponential growth in the use of deep learning models, the sophistication of attack tools has also increased [41].

Naïve Bayesian learning
Bayesian learning is a statistical learning technique that finds relationships within the data by exploiting conditional independence, using various statistical approaches. To evaluate posterior probabilities, Bayesian learning combines prior probability functions with new evidence. If Y1, Y2, Y3, ..., Yn represents a set of inputs for which a label θ is returned, the probability p(θ) must be maximized. Bayesian learning approaches have resolved many problems in WSNs, such as routing [42], data location [43], aggregation [44], fault prediction, connectivity, and coverage problems [45].
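A minimal sketch of this idea follows: the prior of each label is combined with per-feature likelihoods under the conditional-independence assumption, and the label with the largest posterior is returned. The Gaussian likelihood, the function names, and the toy samples are assumptions made for illustration; the paper does not state which Naive Bayes variant was used.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate the prior and per-feature mean/variance of each class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(samples), means, variances)
    return model

def log_posterior(model, x, label):
    """Unnormalized log posterior: log prior plus per-feature log likelihoods."""
    prior, means, variances = model[label]
    score = math.log(prior)
    for v, m, var in zip(x, means, variances):
        score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
    return score

def nb_predict(model, x):
    """Return the label with the largest posterior for sample x."""
    return max(model, key=lambda label: log_posterior(model, x, label))

# Hypothetical two-feature samples.
samples = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (6.0, 9.0), (6.3, 8.7)]
labels = ["normal", "normal", "normal", "malicious", "malicious"]
model = fit_gaussian_nb(samples, labels)
print(nb_predict(model, (1.1, 2.0)), nb_predict(model, (6.1, 9.1)))
```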

Decision Trees (DT)
DT is a supervised ML algorithm that uses sets of if-then-else rules, which makes its decisions easy to read. A decision tree has two kinds of nodes: leaf nodes and decision nodes. DT builds a model from the training data and, based on the learned decision rules, predicts a class or target. Decision trees offer many advantages, such as transparency, low complexity, and rigorous decision-making analysis. Decision trees are used to resolve different WSN problems, including connectivity [46], data aggregation [47], mobile devices, etc.
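The relation between decision nodes, leaf nodes, and if-then-else rules can be sketched with a single-split tree. The use of Gini impurity as the split criterion, the one-feature samples, and the helper names are assumptions for illustration; the paper does not specify the split criterion.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(samples, labels, feature):
    """Threshold on one feature that minimizes the weighted Gini impurity."""
    best = (None, float("inf"))
    values = sorted(set(x[feature] for x in samples))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(samples, labels) if x[feature] <= threshold]
        right = [y for x, y in zip(samples, labels) if x[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical one-feature samples.
samples = [(2.0,), (3.0,), (4.0,), (10.0,), (12.0,)]
labels = ["normal", "normal", "normal", "malicious", "malicious"]
threshold, impurity = best_split(samples, labels, feature=0)

def classify(x):
    # Decision node: one if/else rule; each branch ends in a leaf node.
    return "normal" if x[0] <= threshold else "malicious"

print(threshold, impurity, classify((11.0,)))
```

A full tree repeats the same split search recursively on each branch until the leaves are pure or a depth limit is reached.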

Convolutional neural network
CNNs are among the most prominent classes of neural networks and have been widely used in DL, mostly on large-scale data such as images and videos. A CNN is a multilayer neural network architecture inspired by the neurobiology of the visual cortex. It consists of convolutional layers and fully connected layers; subsampling layers can exist between the two. CNNs overcome the difficulty that DNNs have in scaling well with multidimensional, locally correlated input data. CNNs are therefore mainly applied to datasets where a relatively large number of nodes and parameters need to be trained (e.g., image processing).
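The building blocks named above (convolution, activation, subsampling) can be sketched in one dimension. This is a simplified illustration with a hypothetical kernel and input, not the architecture used in this paper; as in most CNN implementations, the "convolution" is computed as a sliding dot product without flipping the kernel.

```python
def conv1d(signal, kernel, pad=0):
    """Slide kernel over signal with stride 1 after adding pad zeros per side."""
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    width = len(kernel)
    return [sum(padded[i + j] * kernel[j] for j in range(width))
            for i in range(len(padded) - width + 1)]

def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]

def subsample(values, size=2):
    """Max pooling over non-overlapping windows of the given size."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# Hypothetical input and difference kernel.
signal = [1.0, 2.0, 4.0, 7.0]
kernel = [-1.0, 1.0]

feature_map = relu(conv1d(signal, kernel, pad=1))
pooled = subsample(feature_map)
print(feature_map, pooled)
```

A fully connected layer would then take the pooled values as its input vector.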

Proposed process
Our method is useful for identifying malicious nodes. The wormhole attack mitigation is applied in an ad hoc network by monitoring the output files of normal and malicious nodes. Initially, we define the number of normal nodes and malicious nodes, along with their behavior. In this scheme, a tunnel is established between the malicious nodes, and messages or packets are transmitted only over the tunnel. When a malicious node is a neighbor of a normal node, the message is forwarded without using the data itself (see Figure 2). At that stage, data from each moving node are traced, and messages that aid in data collection are accepted. The performance of the system can be improved by specifying the essential features. We therefore selected eight significant features to construct a dataset, which was labeled with the help of the unique node address. Six standard machine learning classifiers are then applied to separate the samples into two categories, normal and malicious.
System performance is measured using multiple statistical criteria and compared with recent techniques.

Simulation of wormhole attack
We simulated wormhole attacks with a finite number of nodes in MATLAB 2019b. The simulation generates a network topology consisting of nodes, devices, channels, and network protocols. In this simulation, different network applications transfer packets over the network. Packets are generated or received and then processed, and the simulation model executes from the main function until the termination state is reached. The initial location of the nodes and their connections to adjacent nodes are shown in Figure 3. The simulation was carried out in an ad hoc network environment with 48 normal nodes and two malicious nodes. The experimental parameters of the simulation environment are a topology area of 1000 x 1000 m^2, random node movement, and a node radio range of 250 m (1000 m for wormhole nodes). In Fig. 3, the normal nodes are indicated with red circles and the wormhole nodes with black triangles; the initial connections between nodes are shown with blue lines.
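The initial topology described above can be approximated with the following sketch. It is written in Python rather than MATLAB and is not the simulation code of this paper: the uniform random placement, the fixed seed, the rule that a node reaches every other node within its own radio range, and all names are assumptions, and node movement is not modeled.

```python
import math
import random

AREA = 1000.0            # side of the square area in metres (from the text)
NORMAL_RANGE = 250.0     # radio range of a normal node (from the text)
WORMHOLE_RANGE = 1000.0  # radio range of a wormhole node (from the text)

def make_nodes(n_normal=48, n_wormhole=2, seed=0):
    """Place nodes uniformly at random; the seed is an arbitrary choice."""
    rng = random.Random(seed)
    nodes = []
    for i in range(n_normal + n_wormhole):
        malicious = i >= n_normal
        nodes.append({
            "id": i,
            "pos": (rng.uniform(0, AREA), rng.uniform(0, AREA)),
            "range": WORMHOLE_RANGE if malicious else NORMAL_RANGE,
            "malicious": malicious,
        })
    return nodes

def reachable(nodes, node):
    """Ids of the nodes that lie within the transmitting node's radio range."""
    return [other["id"] for other in nodes
            if other["id"] != node["id"]
            and math.dist(node["pos"], other["pos"]) <= node["range"]]

nodes = make_nodes()
coverage = {n["id"]: reachable(nodes, n) for n in nodes}
for n in nodes:
    if n["malicious"]:
        print(n["id"], len(coverage[n["id"]]))
```

With a range of 1000 m in a 1000 x 1000 m^2 area, a wormhole node covers most of the network, which is what allows it to attract routes.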

Feature extraction results
Feature selection is one of the central concepts in machine learning and directly influences model performance. Irrelevant or partly relevant features may adversely impact the output of the system. The output file includes complete node information, of which only part is informative for a given application. Omitting irrelevant or less informative features that do not contribute to classification leaves the relevant features for the dataset. Feature selection has many benefits, such as decreasing overfitting, reducing training time, and improving accuracy. We chose eight essential features that optimize the system's performance. Table 2 lists the features of the MANET considered. These attributes are either continuous or discrete. We use the unique node address to label samples and assume that samples originating from malicious nodes are malicious.
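The address-based labeling rule can be sketched as follows. The record layout, the two example features (`speed` and `neighbor_count`), and the malicious addresses are hypothetical placeholders; the eight features actually used are those listed in Table 2.

```python
import math

# Hypothetical addresses of the two wormhole nodes.
MALICIOUS_ADDRESSES = {48, 49}

def node_speed(prev_pos, curr_pos, dt):
    """Average speed in m/s between two positions sampled dt seconds apart."""
    return math.dist(prev_pos, curr_pos) / dt

def label_sample(record):
    """Build a feature dict and derive the label from the node address only."""
    features = {
        "speed": node_speed(record["prev_pos"], record["pos"], record["dt"]),
        "neighbor_count": record["neighbor_count"],  # hypothetical feature
    }
    label = "malicious" if record["address"] in MALICIOUS_ADDRESSES else "normal"
    return features, label

record = {"address": 49, "prev_pos": (0.0, 0.0), "pos": (30.0, 40.0),
          "dt": 10.0, "neighbor_count": 12}
features, label = label_sample(record)
print(features, label)
```

Because the label depends only on the originating address, every sample produced by a malicious node is treated as malicious, in line with the assumption stated above.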

Results of classification
Fig. 4. The confusion matrix of the utilized machine learning methods.
The CNN architecture used in this paper is presented in Table 3. The input consists of 8 features for each of the 3997 samples; therefore, the input size is 8x1. We used two convolutional layers, each with ten filters of size 2x2, stride [1,1], and zero padding. The layers are activated with the Tanh and ReLU functions. Two fully connected layers follow, with 384 and 2 cells, respectively. Finally, a SoftMax layer converts the outputs into class probabilities, and the classification layer computes the cross-entropy loss for mutually exclusive classes. The results of the classification process are indicated in Figure 5. The process was performed on an Intel Core i7 processor with a 3 GHz CPU and 12 GB of RAM. The training ran for 3000 iterations, and the accuracy and loss values of the training process are depicted in Figure 5.
The results of the comparison between the machine learning methods are shown in Table 4. Based on these results, the sensitivity of the DT method exceeds that of the other approaches. Sensitivity indicates the ability of a method to detect wormhole nodes in the MANET; its magnitude therefore reflects the detection capability of the classifier.
For a better analysis of the machine learning classifiers, the ROC curve for the binary classification is presented in Figure 6. In the ROC curve, the horizontal axis shows the false positive rate, and the vertical axis shows the true positive rate, with wormhole nodes taken as the positive class. One of the essential criteria for analyzing classifier performance is the area under the ROC curve, called the AUC. The DT classifier yields a higher AUC than the other methods.
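The construction of an ROC curve and its AUC can be sketched as follows. The scores and labels are hypothetical, label 1 denotes the positive (wormhole) class, and the helper names are illustrative; both classes are assumed to be present.

```python
def roc_points(scores, labels):
    """ROC points (FPR, TPR) obtained by lowering the decision threshold.

    labels: 1 for the positive class (wormhole), 0 for the normal class.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Samples with tied scores are processed together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / negatives, tp / positives))
        i = j
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores (higher means more likely a wormhole node).
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
labels = [1, 1, 0, 1, 0]
print(auc(roc_points(scores, labels)))
```

An AUC of 1.0 corresponds to a classifier that ranks every wormhole sample above every normal sample, and 0.5 corresponds to random ranking.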
The DT classifier thus has the highest sensitivity among the methods. Precision indicates the reliability of the positive results. For instance, the precision of the SVM method is 91.3%, which means that 91.3% of the nodes that the SVM identified as wormhole nodes are actual wormhole nodes. Specificity shows how well the classifier detects normal nodes; the highest specificity belongs to the KNN and SVM approaches. Finally, the DT method yields the highest AUC value. In conclusion, the DT method achieves the highest accuracy (98.9%), followed in order by SVM, KNN, CNN, LDA, and NB.
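The criteria discussed above follow directly from the four cells of a binary confusion matrix, with wormhole samples as the positive class. The sketch below uses the standard definitions; the only counts used are the class sizes reported in this paper (216 malicious and 3781 normal samples), applied to a hypothetical baseline that labels every sample as normal.

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, and precision from confusion counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # detected wormholes
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # detected normals
        "precision": tp / (tp + fp) if tp + fp else 0.0,    # reliability of alarms
    }

# Hypothetical baseline: every one of the 3997 samples is labeled normal.
baseline = metrics(tp=0, fn=216, fp=0, tn=3781)
print(baseline)
```

Because only 216 of the 3997 samples are malicious, this baseline reaches an accuracy of about 94.6% while detecting no wormhole node at all, which is why sensitivity, precision, and AUC are worth reading alongside accuracy.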

Conclusion
A wormhole attack is a network-layer attack that targets routing protocols. To detect wormhole attacks using machine learning, a training dataset is required to train the models. Training datasets for classification can be obtained from real-time conditions or from tests. As a function, the experimental data may be defined as a target value and a

Funding
The funding sources had no involvement in the study design, collection, analysis or interpretation of data, writing of the manuscript or in the decision to submit the manuscript for publication.

Declaration of interests
We declare no conflict of interest.