Enhancing Industrial Wireless Communication Security Using Deep Learning Architecture-Based Channel Frequency Response

Wireless communication plays a crucial role in the automation process in the industrial environment. However, the open nature of wireless communication renders industrial wireless sensor networks susceptible to malicious attacks that impersonate authorized nodes. The heterogeneity of the wireless transmission channel, coupled with hardware and software limitations, further complicates the issue of secure authentication. This form of communication urgently requires a lightweight authentication technique characterized by low complexity and high security, as inadequately secure communication could jeopardize the evolution of industrial devices. These requirements are met through the introduction of physical layer authentication. This article proposes novel deep learning (DL) models designed to enhance physical layer authentication by autonomously learning from the frequency domain without relying on expert features. Experimental results demonstrate the effectiveness of the proposed models, showcasing a significant enhancement in authentication accuracy. Furthermore, the study explores the efficacy of various DL architecture settings and traditional machine learning approaches through a comprehensive comparative analysis.


Introduction
Industrial wireless sensor networks (IWSN) have received increased attention in most industries in recent years thanks to their flexible nature and ability to work in challenging environments. IWSN can be used in various manufacturing applications, such as industrial control and process automation. However, the accessibility of IWSN raises concerns about security and privacy. Vulnerabilities arise primarily because adversaries can eavesdrop by intercepting communications between industrial nodes without authorization, jam communications between nodes by flooding a channel with noise, or spoof communications by transmitting a forged signal between two authorized nodes. The wireless network must therefore implement security procedures that control access and keep out unauthorized users. Every connection between nodes, whether involving a limited-access device or a smart device, is significant in terms of security. In the field of IWSN, addressing confidentiality, integrity, and availability is the foremost requirement for information security [1].
Authentication is an essential challenge in IWSN, where node identification requires safeguarding wireless communications to determine nodes' authenticity and grant them access while blocking unauthorized devices [2]. Wireless communication systems execute authentication using upper-layer authentication methods, which generally employ cryptographic algorithms [2]. Traditional authentication methods rely on vulnerable addresses, especially IP and MAC addresses. Nevertheless, upper-layer authentication methods that use traditional cryptography algorithms are insufficient for advanced wireless communication technologies [3], such as the Internet of Things (IoT) and the Internet of Vehicles. As artificial intelligence (AI) technologies progress, various cybersecurity techniques explore methods to leverage the capabilities offered by AI. Machine learning (ML) and deep learning (DL) enhancements have recently strengthened classification reliability within authentication systems.
Due to the open nature of wireless communication, adversaries that imitate authorized users can compromise IWSN. Moreover, the heterogeneous nature of the wireless transmission channel and the limited hardware and software capabilities of the nodes make secure authentication in an IWSN more challenging. Consequently, determining the source of unknown wireless transmitters and mitigating the security dangers raised by adversaries requires a simple and lightweight authentication solution.
In this article, we present an uncomplicated authentication model. The model is formulated to enact lightweight authentication by leveraging the unique characteristics of the physical layer within the transmission medium and is integrated with DL approaches.
The main contributions of this article are summarized as follows: (i) This article introduces the wireless physical layer authentication (WPLA) model designed for wireless industrial device communications. The model employs DL architectures incorporating autonomous parameter optimization, thereby serving as a substitute for traditional ML algorithms. (ii) We have developed a representative sequential architecture consisting of two layers of convolutional neural networks (CNNs), two layers of long short-term memory (LSTM) architecture, and an architecture based on restricted Boltzmann machines (RBMs). This composite architecture is systematically designed to ascertain the optimal model for physical layer authentication. (iii) The conducted experiments utilized the publicly available dataset provided by the National Institute of Standards and Technology (NIST) [4]. We propose a model that systematically explores the channel impulse response (CIR) and transitions to the channel frequency response (CFR) as a strategic approach to attaining reliable performance. (iv) Furthermore, a comparative analysis has been undertaken to scrutinize the impact of varied DL structural configurations and traditional ML algorithms on classification performance.
The remainder of this article is organized as follows: Section 2 gives an overview of the background of DL architectures. Related works are discussed in Section 3. In Section 4, the system model is briefly described. Then, in Section 5, a physical layer authentication model is proposed. Section 6 presents the results and discusses the proposed models' performance. Finally, the conclusion is presented in Section 7.

Background
DL computing has emerged as the predominant paradigm within ML, exhibiting remarkable efficacy in diverse, intricate cognitive investigations. Notably, DL has outperformed established ML methodologies across multiple domains, which can be attributed to its enhanced capabilities in data analysis.

CNN.
The CNN is typically designed for processing data with a known grid-like structure. The convolutional layer plays a crucial role in the CNN architecture, where each layer comprises multiple filters working collaboratively to conduct extensive convolutions on the input, resulting in the creation of feature maps. These filters are often represented as a multidimensional array of parameters, which the learning algorithm trains through a backpropagation mechanism [5]. To achieve an appropriate level of abstraction at the correct scale, filter sizes are selected based on the dimensions of the input data.
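As a rough illustration of how a filter slides over a grid-like input to produce a feature map, the following NumPy sketch applies one filter, a bias, a point-wise nonlinearity, and a down-sampling (pooling) step. The 6x6 input, the 2x2 filter, the bias value, and the helper names `conv2d_valid`, `relu`, and `max_pool2d` are assumptions chosen for illustration and are not taken from the proposed models.

```python
import numpy as np

def conv2d_valid(x, k):
    """'Valid' 2D convolution of input x with filter k (no padding)."""
    kh, kw = k.shape
    k_flipped = k[::-1, ::-1]  # flipping the filter gives convolution rather than cross-correlation
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k_flipped)
    return out

def relu(z):
    """Point-wise nonlinearity."""
    return np.maximum(z, 0.0)

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    trimmed = fmap[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

x = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 input grid (hypothetical)
k = np.array([[1.0, 0.0], [0.0, -1.0]])        # toy 2x2 filter (hypothetical)
b = 0.5                                        # bias term (hypothetical)

feature_map = relu(conv2d_valid(x, k) + b)     # feature map of shape (5, 5)
pooled = max_pool2d(feature_map)               # down-sampled map of shape (2, 2)
```

In a trained network the filter values and the bias are learned through backpropagation rather than fixed by hand as they are here.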
For instance, a convolutional layer convolves over the 2D input $x$ using 2D filters $k$ to extract features, represented in the standard form as follows:

\[ (x * k)(i, j) = \sum_{m} \sum_{n} x(m, n)\, k(i - m, j - n). \tag{1} \]

After the convolution, a bias term $b$ and a point-wise nonlinearity $f$ are utilized to create a feature map at the filter output. The feature map $h$ is created as follows using filters based on the input $x$ and weights $W$:

\[ h = f(W * x + b). \tag{2} \]

A pooling layer is applied after a convolutional layer to perform a down-sampling operation, reducing the in-plane dimensionality of the feature maps.

LSTM.
Equations (3)-(9) delineate the fundamental structure of the LSTM model [6]. This learning process relies on semilabeled datasets.
The first gate, the input gate, decides which information from the new input, $X_t$, is stored in the cell state, $C_t$. This phase comprises two components: the sigmoid and tanh functions. The sigmoid function determines whether to update the data based on the new information, while the tanh function weighs the values passed to assess their relative importance. Conversely, the forget gate uses a sigmoid function to determine which unnecessary information is excluded from the cell. The last gate is an output gate, where a filtered version of the cell state generates the output values, $H_t$. A sigmoid function selects the components of the cell state sent to the output, and the new values produced by applying tanh to the cell state, $C_t$, are multiplied by the outcome of the sigmoid function.
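The gate logic described above can be sketched as a single LSTM time step in NumPy. This is a minimal sketch of the standard LSTM cell, not the trained model of this article: the stacked weight matrix `W`, the bias `b`, the dimensions, and the function name `lstm_step` are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W maps the concatenated [h_prev, x_t] to four gate pre-activations."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b      # shape (4 * hidden,)
    i_t = sigmoid(z[0 * hidden:1 * hidden])        # input gate: what to write to the cell
    f_t = sigmoid(z[1 * hidden:2 * hidden])        # forget gate: what to drop from the cell
    o_t = sigmoid(z[2 * hidden:3 * hidden])        # output gate: what to expose as output
    g_t = np.tanh(z[3 * hidden:4 * hidden])        # candidate values (input transform)
    c_t = f_t * c_prev + i_t * g_t                 # cell state update
    h_t = o_t * np.tanh(c_t)                       # filtered version of the cell state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 4, 3                              # hypothetical sizes
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):             # run over a toy sequence of 5 inputs
    h, c = lstm_step(x_t, h, c, W, b)
```

Stacking all four gate weight blocks in one matrix is only a convenience of this sketch; keeping one matrix per gate is equivalent.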
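Sigmoid function:

\[ \sigma(x) = \frac{1}{1 + e^{-x}}. \tag{3} \]

Gate:

\[ f_t = \sigma(W_f [H_{t-1}, X_t] + b_f), \tag{4} \]
\[ i_t = \sigma(W_i [H_{t-1}, X_t] + b_i), \tag{5} \]
\[ o_t = \sigma(W_o [H_{t-1}, X_t] + b_o). \tag{6} \]

Input transform:

\[ \tilde{C}_t = \tanh(W_c [H_{t-1}, X_t] + b_c). \tag{7} \]

State update:

\[ C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \tag{8} \]
\[ H_t = o_t \odot \tanh(C_t). \tag{9} \]

Here, $\sigma$ represents the sigmoid function, $H_{t-1}$ is the output of the last LSTM unit at time $t-1$, $X_t$ is the current input at time $t$, and $W$ and $b$ are the weight matrices and bias, respectively. The terms $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates, and $\odot$ denotes element-wise multiplication. Additionally, $C_{t-1}$ and $C_t$ denote the cell states at $t-1$ and $t$.

RBMs.
RBMs are undirected probabilistic models with two layers: visible and hidden units [7]. To optimize the likelihood function, one must identify the joint probability distribution. RBMs develop neural network topologies for unsupervised data modeling using the concept of energy minimization [9]. The term "restricted" refers to the absence of intralayer connections: no visible unit is connected to another visible unit, and no hidden unit to another hidden unit. Connections between visible units and hidden units are still allowed.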
There are two essential stages in the operation of RBMs: the feed-forward pass and the feed-backward pass. During the feed-forward pass, the inputs are multiplied by the weights, and the bias is applied. The result is then fed into a sigmoid activation function, with the function's output determining the activation of the hidden state. Conversely, the feed-backward pass reconstructs the input units from the activated hidden units. The dual connection structure assumes conditional distributions of visible units on all hidden units, as well as distributions of hidden units on all visible units [5,10], which in the standard form are defined as follows:

\[ P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i} v_i W_{ij}\Big), \]
\[ P(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_{j} h_j W_{ij}\Big). \]

Originally, RBMs were proposed for use with binary visible units ($v \in \{0, 1\}$) and hidden units ($h \in \{0, 1\}$). The bias vector is denoted by the letter $b$, and the weight of the link between the $i$th visible unit and the $j$th hidden unit is represented as $W_{ij}$.
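The two passes can be illustrated with a few lines of NumPy. This is a minimal sketch with randomly initialized weights, assuming separate bias vectors for the visible and hidden layers (named `b_visible` and `b_hidden` here for clarity); the layer sizes and the input vector are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3                               # hypothetical layer sizes
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))    # W[i, j]: visible unit i <-> hidden unit j
b_visible = np.zeros(n_visible)
b_hidden = np.zeros(n_hidden)

v = np.array([1, 0, 1, 1, 0, 0], dtype=float)            # binary visible units

# Feed-forward pass: weights and bias, then sigmoid, then stochastic hidden activation.
p_h = sigmoid(v @ W + b_hidden)
h = (rng.random(n_hidden) < p_h).astype(float)

# Feed-backward pass: reconstruct the visible units from the activated hidden units.
p_v = sigmoid(h @ W.T + b_visible)
```

Because there are no intralayer connections, all hidden units can be sampled in parallel given the visible units, and vice versa, which is what the two vectorized lines exploit.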

Related Work
This section provides an overview of the pertinent literature on the WPLA approach.WPLA is an authentication method designed to identify a wireless transmitter by analyzing physical layer characteristics within the transmission [11].The focus of scholarly inquiry within WPLA has been directed toward methodologies such as radio frequency fingerprint (RFF) and channel-based techniques.The inception of RFF technology dates to 1995, when Toonstra and Kinsner [12] first proposed its conceptual framework.Rooted in exploiting manufacturing defects in wireless device components, RFF leverages minor imperfections inherent in the launch signal.Drawing a parallel to human biometric fingerprint identifiers, RFF is a practical and effective technique for authenticating wireless devices based on hardware irregularities.

IET Signal Processing
On the other hand, due to the uniqueness, space-variability, time-variability, and reciprocity of the wireless channel, the physical layer features demonstrate the unique nature of the wireless channels used by the transmission parties. Channel-based techniques therefore authenticate a transmitter by analyzing channel features, for example, channel state information (CSI), CIR, CFR, and received signal strength. Signal classification and identification have become necessary with the advancement of wireless communication technologies. Signal intelligence is a subject of research based on extracting signal features from unidentified radio frequency signals, which include modulation, center frequency, bandwidth, protocols, and transmitter identity [13]. Significant studies have implemented their classification methods by utilizing RFF [14] and channel features [15][16][17][18].
In recent years, DL has been increasingly integrated with WPLA within the context of secure wireless networks.This section introduces the pertinent literature exploring the intersections between DL and WPLA.Due to DL's excellent classification capabilities, deep neural networks perform exceptionally well when used for authentication.Liao et al. [16] created a multiuser authentication approach that could recognize numerous devices with little resource consumption using deep neural networks with data augmentation techniques.However, CNNs were chosen in some research because of their dependability and robust learning capabilities, which have high accuracy and low loss function during training.Baldini et al. [19] used CNN and recurrence plot techniques to develop classification approaches for the physical layer authentication challenge.To identify different devices by utilizing distinctive RF fingerprints, Aminuddin et al. [20] presented a methodology based on CNN to secure wireless transmission in a wireless local area network.Liao et al. [21] adopted deep neural networks, CNN, and convolution preprocessing neural networks to perform physical layer authentication in IWSN.
Furthermore, some research examined the relationship between the number of hidden layers and authentication rates, and it was discovered that the authentication rate improved as the number of hidden layers increased.In contrast, Ma et al. [22] used LSTM as an effective classifier to determine authorized and unauthorized users and increase detection efficiency and accuracy through simulations with varied channel conditions.Chatterjee et al. [23] demonstrated how to use the radiofrequency properties that the wireless device produces to authenticate devices in IoT networks as a trustworthy physical, unclonable function.Accordingly, a simple ML model was designed to consider receiver imperfections, channel, and data unpredictability.

System Model
We propose a WPLA model aimed at enhancing the security of IWSN with minimal impact on communication resources.Figure 1 illustrates numerous sensor nodes deployed across various locations within the industrial environment.We consider different wireless sensor nodes within the broadcast range of other nodes.For the sake of simplicity, we assume the availability of channel information for authorized nodes while the channel information for unauthorized nodes remains unknown.The computational node is responsible for validating the legitimacy of received messages and ascertaining their origin from an authorized node based on channel information.
To initiate the authentication process, the computing node cyclically disseminates request messages to nodes within the industrial wireless network. Upon receipt of the message, all nodes transmit response messages to the computing node, incorporating both the pilot signal and identity information. Subsequently, upon receiving these response messages, the computing node conducts an initial assessment to ascertain the node type and detect potential identity conflicts by comparing the provided information with the stored identity details of authorized nodes. In the presence of conflicting declarations of identity, the node may be deemed unauthorized. Conversely, if the identity remains consistent, a node with an identical ID may be flagged as potentially malicious. The detailed sequence of the authentication process is depicted in Figure 2, where Node 1 is designated as authorized, Node 2 is defined as malicious, and Node 3 is unauthorized. The computing node is involved in identity authentication for the three nodes. When a node with a valid identity attempts to gain access, the computing node initially determines whether it is an authorized or malicious node. Subsequently, the node undergoes physical layer channel authentication. Similarly, if a node requesting authentication lacks valid identity credentials, the computing node initially categorizes it as an unauthorized node with incorrect identity information before proceeding with physical layer channel authentication.
From a mathematical standpoint, the communication channel linking a transmitter and receiver is characterized by CIRs, serving as a comprehensive model that encapsulates the cumulative impact of various elements such as reflectors, absorbers, path loss, and environmental intricacies between them [24]. In particular, CIRs encompass all signal paths from the transmitter to the receiver, including those resulting from reflection, diffraction, or scattering [25]. Moreover, CIRs provide insights into both the propagation conditions and the positions of the receiver and transmitter within the given environment.
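The two-stage check described earlier in this section (identity comparison followed by physical layer channel authentication) can be summarized in a short sketch. It is a simplification under stated assumptions: the function `classify_node`, the boolean `channel_matches` (standing in for the outcome of the DL-based channel authentication), and the identifiers such as `"node-1"` are hypothetical and do not appear in the original protocol description.

```python
from enum import Enum

class NodeType(Enum):
    AUTHORIZED = "authorized"
    MALICIOUS = "malicious"
    UNAUTHORIZED = "unauthorized"

def classify_node(claimed_id, channel_matches, authorized_ids):
    """Simplified two-stage decision made by the computing node."""
    # Stage 1: compare the claimed identity with the stored identities of authorized nodes.
    if claimed_id not in authorized_ids:
        return NodeType.UNAUTHORIZED
    # Stage 2: a valid ID alone is not enough; the channel must match the stored channel information.
    return NodeType.AUTHORIZED if channel_matches else NodeType.MALICIOUS

authorized_ids = {"node-1"}                                   # hypothetical identity store
results = {
    "Node 1": classify_node("node-1", True, authorized_ids),   # valid ID, matching channel
    "Node 2": classify_node("node-1", False, authorized_ids),  # reuses a valid ID, channel differs
    "Node 3": classify_node("node-9", False, authorized_ids),  # no valid ID
}
```

The sketch returns early for an unknown identity, whereas the described process still runs the physical layer check for such nodes; the final label is the same in this simplified view.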

Signal Processing.
In the following, the processing of CIRs is described in a simplified manner. Usually, CIRs are obtained by transmitting a pseudo-random sequence $s(t)$ known to both the transmitter and receiver. This property can be exploited to estimate the signal propagation channel. A straightforward model for the signal's propagation is a convolution of $s(t)$ with the CIR $h(t)$, which results in the received signal $y(t)$:

\[ y(t) = s(t) * h(t) + n_y(t), \]

where $*$ denotes convolution and $n_y(t)$ represents the noise components, modeled as zero-mean white Gaussian noise. We denote the continuous-time CIR of an $L$-path baseband wireless communication channel as follows:

\[ h(t) = \sum_{i=1}^{L} h_i\, \delta(t - \tau_i), \]

where $\delta(t - \tau_i)$ is the Dirac delta function representing a delayed multipath replica of the transmitted signal arriving at time $\tau_i$ with power $|h_i|^2$. In particular, $h_i = a_i e^{j\theta_i}$, where $a_i$ and $\theta_i$ denote the amplitude and phase of the $i$th replica. We note that $h(t)$ fully describes the communication channel between the transmitter and receiver. The complex received signal consists of in-phase (I) and quadrature (Q) components, given by $I_i = a_i \cos(\theta_i)$ and $Q_i = a_i \sin(\theta_i)$.
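A discrete-time version of this propagation model can be simulated in a few lines. The following is a sketch under stated assumptions: the three path delays, amplitudes, and phases, the sequence length, and the noise level are made up for illustration and are unrelated to the measured channels.

```python
import numpy as np

rng = np.random.default_rng(1)

s = rng.choice([-1.0, 1.0], size=64)            # pseudo-random probing sequence s (hypothetical length)

delays = np.array([0, 3, 7])                    # path delays tau_i in samples (hypothetical)
a = np.array([1.0, 0.5, 0.25])                  # path amplitudes a_i (hypothetical)
theta = np.array([0.0, np.pi / 4, -np.pi / 2])  # path phases theta_i (hypothetical)

h = np.zeros(8, dtype=complex)                  # discrete CIR with L = 3 paths
h[delays] = a * np.exp(1j * theta)              # h_i = a_i * exp(j * theta_i)

n_out = len(s) + len(h) - 1
noise = 0.01 * (rng.normal(size=n_out) + 1j * rng.normal(size=n_out))  # zero-mean Gaussian noise
y = np.convolve(s, h) + noise                   # received signal: s convolved with h, plus noise

I = h[delays].real                              # in-phase components a_i * cos(theta_i)
Q = h[delays].imag                              # quadrature components a_i * sin(theta_i)
```

Since the receiver knows `s`, correlating `y` with `s` is one common way to recover an estimate of `h`; that estimation step is omitted here.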
The signal samples $x(t) \in \mathbb{C}$, $t = 0, \ldots, N-1$, are a time series of complex raw samples that are characterized as a data vector. As a simple data representation, this study uses a fast Fourier transform (FFT) to convert the CIR into a discrete CFR. The discrete Fourier transform (DFT) is a mathematical operation that converts signals from the time domain to the frequency domain, and the FFT is used to reduce the number of calculations required to compute it [26]:

\[ X(i) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2 \pi i n / N}, \quad i = 0, \ldots, N-1. \]

Here, $X(i)$ characterizes the frequency content of the time samples $x(n)$ associated with the original signal $x(t)$.
The result of the transformation, the FFT vector, consists of two sets of values: one that carries the real component and another that holds the imaginary component. Next, we convert the FFT vector to the amplitude $A_i$ by computing

\[ A_i = \sqrt{\operatorname{Re}(X(i))^2 + \operatorname{Im}(X(i))^2}. \]

The last step in this phase is splitting the dataset into a training set and a test set. Each set is composed of the CFR vectors for each node, termed Xset, and the classification label for each node's CFR vector, termed Yset.
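The conversion from CIR records to CFR amplitude vectors can be sketched with NumPy's FFT routines. The random complex records below merely stand in for measured CIRs, and the record count, record length, and label values are hypothetical; only the names `Xset` and `Yset` follow the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n_records, n_samples = 100, 256                  # hypothetical; real CIR records are far longer

# Placeholder complex CIR records and node labels (random stand-ins, not measured data).
cir = rng.normal(size=(n_records, n_samples)) + 1j * rng.normal(size=(n_records, n_samples))
labels = rng.integers(0, 8, size=n_records)      # one of eight node classes per record

cfr = np.fft.fft(cir, axis=1)                    # CIR (time domain) -> CFR (frequency domain)
amplitude = np.sqrt(cfr.real ** 2 + cfr.imag ** 2)   # amplitude from real and imaginary parts

Xset = amplitude                                 # CFR amplitude vectors
Yset = labels                                    # classification label of each vector
```

The amplitude computed this way is the same quantity that `np.abs(cfr)` returns; it is written out here to mirror the real and imaginary components described above.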

Training Authentication Model.
This section explores the utilization of DL architectures in the authentication model to address the inherent high dimensionality of CIRs. The rationale behind this selection is grounded in the various advantages that DL architectures offer, particularly in addressing the challenges associated with high-dimensional pattern identification, as substantiated by previous research findings [27,28].

CNN Model Description.
In the following section, we outline the visible and hidden layers of the proposed CNN model structure, as depicted in Figure 4. We begin by loading the CFR input sample in the visible layer. Subsequently, we reshape the sample in both the training and test sets to a fixed size of (1, 1, 8188). The hidden layers consist of two convolution layers. The first convolutional layer comprises 256 neurons, followed by a Dropout layer. The second layer is a convolutional layer composed of 128 neurons, followed by a Dropout layer. A flattening layer follows. The combination of dropout regularization and the max-norm constraint has demonstrated excellent performance in preventing overfitting. The penultimate layers in the CNN structure are the dense layers, which include neurons fully connected with all feature maps in the convolution layers. The first dense layer contains 64 neurons, and ReLU activation functions are applied to accelerate convergence during the training process. Finally, the last dense layer utilizes softmax activation to perform node classification.

LSTM Model Description.
The proposed LSTM model with different layers is demonstrated in Figure 5. First, the CFR of the signal is fed to all neurons of the LSTM model for classification. The first layer contains 256 neurons, whereas the second has 128 neurons, and tanh activation functions are used in both layers. The flattened layer receives the 128-dimensional vector, the final output of the second LSTM layer. The penultimate layers in the LSTM structure are the dense layers, consisting of neurons fully connected with all feature maps in the LSTM layers. Then, the first dense layer contains 64 neurons, and ReLU activation functions are used. A dense softmax layer is the final layer that places the nodes' categorized features into one of eight output classes.

RBMs Model Description.
The proposed RBMs model structure is a composite of one layer of an RBMs structure and a support vector machine (SVM) layer, as illustrated in Figure 6. RBMs represent a form of probabilistic modeling based on unsupervised nonlinear feature learning. Standard RBMs are binary; the input and output only have "0" and "1" states, where "0" indicates that the unit is inactive and "1" means that the unit is active [29]. In the proposed model, we utilized Bernoulli RBMs, where all units are binary stochastic. This requires that the input data be either binary or real-valued, falling between 0 and 1 [30]. Consequently, we normalized the dataset before training the model. After the feature extraction in the RBMs layer, the obtained features are sent to the classification layer. Typically, good results come from feeding the features of an RBM or a hierarchy of RBMs into a linear classifier like a linear SVM or a perceptron [30]. Hence, we incorporated an SVM classifier in the last layer to optimize authentication performance.
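An RBM feature extractor followed by a linear SVM can be assembled with scikit-learn, as in the minimal sketch below. It only illustrates the composition described above: the random data, the 16 hidden components, the learning rate, the iteration counts, and the step names are assumptions and are not the hyperparameters of Table 2.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.random((200, 32)) * 50.0       # placeholder feature vectors (not real CFR data)
y = rng.integers(0, 8, size=200)       # eight node classes

model = Pipeline([
    ("scale", MinMaxScaler()),                         # normalize inputs to [0, 1] for the Bernoulli RBM
    ("rbm", BernoulliRBM(n_components=16, learning_rate=0.05,
                         n_iter=10, random_state=0)),  # unsupervised feature learning
    ("svm", LinearSVC(max_iter=5000)),                 # linear SVM on the RBM features
])
model.fit(X, y)
pred = model.predict(X)
```

With random labels the predictions carry no meaning; the point is only the order of the steps, with normalization placed before the RBM because Bernoulli units expect inputs between 0 and 1.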

Evaluation Results and Discussion
This section delineates the implementation details of the models. Subsequently, it provides an exposition on the performance evaluation of the proposed models for authentication. Furthermore, an examination and comparative analysis of the authentication and convergence rates between the proposed DL models and traditional ML algorithms are conducted. Additionally, training and validation loss profiles are assessed to identify an optimal structure configuration. Finally, strategies to mitigate the issue of model overfitting are discussed.

Dataset Description.
To obtain a CIR dataset within an industrial environment, authentic datasets sourced from the NIST [4] were utilized. The CIRs were acquired within the confines of a standard industrial site, specifically a machine shop. The machine shop, an indoor environment, possesses outer dimensions measuring approximately 12 by 15 m, with a ceiling height of roughly 7.6 m. The distance between the transmitter and receiver was at most 50 m. Channel-sounding tests were conducted using two portable devices, a transmitter and a receiver, strategically positioned within the factory. The CIR was captured as the receiving equipment moved from one acquisition point to another during the CSI measurements, so that each record denotes a distinct position. Consequently, the maximum distance between successive acquisitions is restricted to 1 m. The specific parameter configuration of the utilized dataset is detailed in Table 1.
Implementation Details.
The proposed models have been implemented using Python, with Keras and TensorFlow employed for the CNN and LSTM models. Additionally, Sklearn was utilized for implementing the RBMs, SVM, and KMeans models. The training of these models was conducted on a workstation equipped with an NVIDIA graphics card and a high-performance CPU, such as the Intel Core i7, paired with 16 GB of RAM. The dataset used for our evaluation analysis comprises 10,000 samples systematically partitioned into two distinct subsets: the training set, which consists of 8,000 samples (80%), and the test set, encompassing 2,000 samples (20%). Within this dataset, there are eight nodes, of which three are designated as authorized. The remaining five are unauthorized nodes: three are classified as malicious and two as unknown. The training set was further divided into training and validation sets at a ratio of 7:1. The model training process was executed with parameters as outlined in Table 2, providing a comprehensive overview of the diverse hyperparameters employed in configuring the proposed models.
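The partition sizes stated above can be reproduced with a short index-based split. This is a sketch of one way to obtain them, assuming a random shuffle with a fixed seed; the seed and the variable names are hypothetical, and the original split procedure may differ.

```python
import numpy as np

n_total = 10_000
rng = np.random.default_rng(42)                 # hypothetical seed for a reproducible shuffle
indices = rng.permutation(n_total)

n_test = n_total * 20 // 100                    # 20% held out for testing
test_idx = indices[:n_test]
train_full_idx = indices[n_test:]               # remaining 80%

n_val = len(train_full_idx) // 8                # a 7:1 ratio means one part in eight is validation
val_idx = train_full_idx[:n_val]
train_idx = train_full_idx[n_val:]

sizes = (len(train_idx), len(val_idx), len(test_idx))
```

Under these ratios the 8,000 training samples divide into 7,000 for training and 1,000 for validation, alongside the 2,000 test samples.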

Evaluation Metrics.
Evaluating the WPLA model's performance involved assessing several key performance indicators, including accuracy, precision, recall, and F1-score.The subsequent section outlines the definitions and formulas associated with these metrics.
The metrics are characterized by true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Accuracy measures the ratio of node classes that the classifier predicts correctly to all predictions. Precision represents the proportion of positively predicted classes that are actually positive. Recall, or sensitivity, gauges the proportion of actual positive classes that the model correctly predicts as positive. The F1 score offers a balanced assessment by amalgamating precision and recall into a unified metric.
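These four metrics follow directly from the TP, TN, FP, and FN counts, as the sketch below shows using their standard definitions. The function name `classification_metrics` and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # guard against division by zero
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)            # harmonic mean of precision and recall
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for one class, for illustration only.
m = classification_metrics(tp=90, tn=85, fp=10, fn=15)
```

For a multiclass problem such as the eight-node setting, these quantities are typically computed per class and then averaged; the averaging scheme used in the evaluation is not stated here.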
Comparative Analysis.
A comparative analysis of the WPLA model was conducted using both DL architectures and traditional ML algorithms. This was undertaken to enhance the accuracy of evaluating the proposed authentication methodology. Each model's performance was assessed within a consistent framework, with uniform data dimensions set at 8,188. Figure 7 and Table 3 show that the DL models outperform their traditional ML counterparts. Specifically, SVMs and KMeans, representative of traditional ML models, exhibited comparatively inferior authentication performance.
The DL architectures result in a 40% enhancement in performance.
Gaussian white noise is deliberately introduced into the extracted CFR data to assess the robustness of the proposed authentication models. This simulation aims to evaluate the model's performance under varying signal-to-noise ratio (SNR) conditions, and the associated authentication evaluation results are presented in Figure 8 and Table 4. High scores across a variety of metrics, consistently ranging between 98% and 99%, are evidence that the CNN model's performance exhibits resilience to SNR variations.
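Noise injection at a prescribed SNR can be sketched as follows. The helper `add_awgn`, the synthetic stand-in for a CFR vector, and the list of SNR values other than the 0 dB and −15 dB points mentioned in the text are assumptions for illustration; the exact noise-generation procedure used in the experiments is not specified.

```python
import numpy as np

def add_awgn(x, snr_db, rng):
    """Add zero-mean white Gaussian noise so that the result has the requested SNR in dB."""
    signal_power = np.mean(np.abs(x) ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))   # SNR = 10 * log10(P_signal / P_noise)
    noise = rng.normal(scale=np.sqrt(noise_power), size=x.shape)
    return x + noise

rng = np.random.default_rng(3)
cfr_amplitude = np.abs(np.sin(np.linspace(0, 20, 4096))) + 0.1   # synthetic stand-in for a CFR vector

# Noisy copies at several SNR levels (values between 0 and -15 dB are hypothetical).
noisy = {snr: add_awgn(cfr_amplitude, snr, rng) for snr in (0, -5, -10, -15)}
```

At −15 dB the noise power is roughly 32 times the signal power, which indicates how demanding the lowest SNR condition is for the classifiers.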
In contrast, the LSTM model's performance shows sensitivity to SNR changes. This sensitivity becomes apparent compared to the CNN, particularly under varying SNR conditions. Notably, the LSTM model demonstrates superior performance at 0 dB compared to CNN; however, this advantage decreases with changes in SNR. Specifically, at −15 dB, the LSTM model shows an accuracy of 87%, precision of 89%, and recall of 86%. Similarly, the RBM model shows sensitivity to changes in SNR. Positive and incremental effects on all
Within the models of CNN, LSTM networks, and RBMs, the loss function employed was categorical cross-entropy, expressed as follows:

\[ \mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} t_i \cdot \log(p_i), \]

where $p_i$ refers to the softmax probability, $t_i$ signifies the ground truth in the form of one-hot encoding, and the training batch size is indicated by $N$.
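The categorical cross-entropy can be computed from softmax probabilities and one-hot labels as in the sketch below. The logits and labels are hypothetical, and the small constant `eps` is an assumption added only to avoid taking the logarithm of zero.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def categorical_cross_entropy(p, t, eps=1e-12):
    """Mean over the batch of -sum(t * log(p)) for one-hot targets t."""
    return -np.mean(np.sum(t * np.log(p + eps), axis=1))

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 3.0, 0.3]])             # hypothetical network outputs for N = 2 samples
t = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])                  # one-hot ground truth

loss = categorical_cross_entropy(softmax(logits), t)
perfect = categorical_cross_entropy(t, t)        # predictions equal to the targets
```

When the predicted distribution coincides with the one-hot target, the loss is zero up to the `eps` term, which is the optimal value.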
The evaluation of the discrepancy between the actual and predicted probability distributions for each class in the given problem is quantified through cross-entropy as a scoring metric. A lower score is better, with an optimal cross-entropy value of 0. The evolution of cross-entropy loss over epochs for both the training and validation datasets is represented in Figures 9-11. Regarding the CNN and LSTM models, the depicted figures illustrate a consistent reduction in training and validation losses, suggesting an optimal fit. This observation implies a well-configured model, as there is an absence of discernible signs of overfitting or underfitting. Conversely, in the case of the RBMs model, the diminishment in both training and validation losses is comparatively less noticeable. Table 5 displays that the prediction rate for both the CNN and LSTM models approaches 100% when the parameters are suitably selected. Notably, the CNN model exhibits the fastest training time compared to the other models. Conversely, the LSTM model demonstrates a notably reduced average training and prediction time, only 5 s per epoch and 1 µs per sample.

Conclusion
In the realm of IWSN, we present a WPLA model. This model autonomously learns from the frequency domain to improve identification performance and efficiency. Utilizing intelligent classifiers, both DL architectures and traditional ML algorithms are employed for physical layer authentication. The findings indicate that the proposed models show excellent performance, resulting in significantly improved authentication accuracy. High scores across various evaluation metrics indicate that the CNN model demonstrated exceptional performance, displaying resilience to SNR variations.
Finally, DL architectures provide practical solutions to training time and performance challenges, thereby offering a significant advantage in enhancing security systems. Despite these advancements, there remains considerable potential for further improvement in wireless communication security systems through DL architectures. Our future research initiatives will focus on establishing a generative adversarial network model to evaluate the capabilities of the proposed authentication model.

FIGURE 1: Illustration of our considered communication system model.

FIGURE 9: The cross-entropy loss over epochs for the CNN model.

FIGURE 10: The cross-entropy loss over epochs for the LSTM model.

FIGURE 11: The cross-entropy loss over epochs for the RBMs model.

TABLE 1: NIST dataset parameters configuration of the channel measurement system.

TABLE 3: Result comparison for different DL and ML models.

TABLE 4: Result comparison under different SNRs.

TABLE 5: Training information and accuracy comparison between DL models.