Deep Learning-Based Channel Reciprocity Learning for Physical Layer Secret Key Generation

School of Computer Science & School of Software Engineering, Sichuan University, Chengdu 610065, China; Sichuan GreatWall Computer System Co., Ltd, Luzhou 646000, China; CEC Jiutian Intelligent Technology Co., Ltd, Shuangliu District, Chengdu, Sichuan 610299, China; Science and Technology on Security Communication Laboratory, Institute of Southwestern Communication, Chengdu 610041, China; Chongqing Innovation Center of Industrial Big-Data Co., Ltd, Chongqing 400707, China; Institute for Industrial Internet Research, Sichuan University, Chengdu 610065, China


Introduction
Physical layer key generation is a promising encryption technique for wireless systems that has recently received much attention [1][2][3]. In contrast to its traditional counterpart, physical layer key generation uses the inherent unpredictability of channel variations between legitimate users to establish information-theoretic security without the assistance of a third party. This channel-based scheme achieves stronger security, a higher key generation rate, and lower cost than conventional approaches [4][5][6]. With the development of emerging wireless systems, e.g., the Internet of Things (IoT), the Internet of Vehicles (IoV), and Unmanned Aerial Vehicles (UAV), physical layer key generation has become popular in recent years [7][8][9]. The basic concept behind physical layer key generation is that two legitimate devices, named Alice and Bob, exchange information publicly to acquire channel responses, which are then used to extract symmetric keys. The viability of this procedure rests on three principles: temporal variation, channel reciprocity, and spatial decorrelation [2]. Among them, channel reciprocity requires that the channel features collected by Alice and Bob within the coherence time be highly correlated.
This is fundamental for legitimate devices to produce matching keys from channel features independently. However, key generation is most commonly used in time-division duplex (TDD) systems, where nonsimultaneous sampling, channel noise, and hardware diversity significantly weaken the correlation of channel measurements between the two authorized users [10]. Thus, it is crucial to construct reciprocal channel features in a nonideal wireless channel.
Purpose-built feature extraction methods can capture the reciprocal component within channel measurements. Some researchers use conventional linear transforms to extract characteristics, such as principal component analysis (PCA) [10,11], the discrete cosine transform (DCT) [12], and the wavelet transform (WT) [13]. Several nonlinear feature extraction approaches, such as [14,15], outperform the linear transforms. In simulation experiments, these approaches provide good key agreement and a high generation rate, but they lack robustness in practical wireless situations. There are also preprocessing algorithms specifically developed to exploit reciprocity from channel measurements [16][17][18]. On the one hand, these methods are computationally expensive or have security vulnerabilities. On the other hand, these handcrafted feature extraction approaches are based on personal observations and are inflexible in diverse real-world environments.
Deep learning is a powerful feature extraction technology that does not need predefined statistical characteristics of the channel model. In the field of wireless communication and networking, various deep learning applications have emerged in recent years, including resource allocation [19], channel estimation [20], modulation classification [21], and low-rate CSI feedback [22]. However, few studies have focused on applying deep learning to capture reciprocity in physical layer key generation. Existing key generation methods depend on prior knowledge of the wireless channel and cannot work properly in situations that do not follow the assumed statistical distribution of the channel response. Therefore, we intend to leverage the powerful feature learning capability of deep learning to adaptively construct consistent secret keys from the imperfect channel in real-world wireless systems.
In this paper, by employing deep learning, we propose a novel method for capturing reciprocal channel characteristics to efficiently generate secret keys. Specifically, we design the Channel Reciprocity Learning Net (CRLNet), a multibranch autoencoder-based neural network built on prior knowledge of the channel measurement model. The proposed model is driven by the channel state information (CSI) of the TDD Orthogonal Frequency-Division Multiplexing (OFDM) system, which can achieve a higher key generation rate than the received signal strength (RSS) [2]. The CRLNet can be trained to adaptively learn the reciprocal components in weakly correlated channels using the collected CSI data. Furthermore, a complete key generation scheme is designed based on the CRLNet. To demonstrate the validity and effectiveness of our algorithm, we conduct comprehensive testing in various real-world wireless environments. Our primary contributions are as follows.
(1) We design a novel multibranch autoencoder-based neural network, named CRLNet, and a special hybrid loss function to train the network according to the channel measurement model. Without any knowledge of the statistical distribution of channel responses, the model trained on CSI data from commercial WiFi devices can construct highly correlated channel features that can be quantized into high-agreement secret keys.
(2) Based on our deep learning model, a practical secret key generation mechanism is developed. In contrast to existing methods, the proposed scheme achieves higher performance without high computing overhead or security risks.
(3) Extensive tests conducted under static and mobile environments in both indoor and outdoor scenarios show that the proposed method is feasible and effective. A superior key generation rate, key error rate, and randomness are all achieved compared with previous methods.
The rest of this paper is arranged as follows. Related Works reviews the relevant research. The standard flow for secret key generation and the reciprocal feature learning algorithm we developed are presented in Materials and Methods, which also includes the designed deep learning based key generation scheme. The experiments used to evaluate the performance of the proposed method are presented in Results and Discussion, which is followed by the Conclusions.

Related Works
Physical layer key generation was first studied in the mid-1990s. It has been demonstrated that extracting secret keys from wireless channel characteristics can achieve reliable information-theoretic security [23]. In [24], the authors collect the sender's signal in an IEEE 802.11 wireless network, extract the RSS estimate of the channel, and quantize it into secret keys. The authors in [25] analyze the impact of environmental changes on the performance of the RSS-based method and find that the key generation rate is higher in a dynamic environment. Due to the limited key generation rate and insufficient randomness of RSS-based methods, [16,26,27] exploit fine-grained channel state information (CSI) to achieve better performance. In addition, [16] proves that CSI-based methods are immune to attacks to which RSS-based methods are vulnerable, such as the predicted channel attack and the stalker attack. However, since the majority of these studies are limited to theoretical analysis and simulation, it is difficult to demonstrate their feasibility and generality in real wireless environments.
Several deep learning based methods have recently emerged for extracting meaningful features from physical layer channel responses [28][29][30][31][32]. O'Shea and Hoydis [28] show several potential applications of deep learning in the physical layer. The authors model the wireless communication system as an end-to-end autoencoder and achieve better performance than conventional methods. Abyaneh et al. [29] use convolutional neural networks to extract features from CSI, which improves the accuracy of the physical layer authentication system. Liu et al. [30] propose a self-supervised learning framework for IoT applications to learn the underlying physical features of sensing signals. Huang et al. [31] design a deep neural network for channel calibration in massive MIMO systems, which can construct a nonlinear mapping between downlink (DL) and uplink (UL) channels. Zhang et al. [32] use a fully connected neural network to learn the mapping function between the CSI of different frequency bands in an FDD system. Inspired by these works, we apply deep learning to learn reciprocal features from the imperfect channel to generate consistent secret keys in complex wireless environments.

Secret Key Generation Flow.
Generally, key generation based on physical channel characteristics includes five steps [2].
(1) Channel probing: legitimate devices named Alice and Bob periodically exchange probing packets to facilitate channel estimation at the receiving end. Let the wireless channel responses recorded at Alice and Bob be $H'_a$ and $H'_b$, respectively. They can be expressed as
$$H'_u = H_u + N_u, \quad u \in \{a, b\}, \tag{1}$$
where $u \in \{a, b\}$ indexes the channel responses of Alice and Bob, $H_u$ is the wireless channel response of the perfect channel, and $N_u$ is the independent nonreciprocal component at each end of the wireless link.
(2) Reciprocity feature extraction: as mentioned above, reciprocity is greatly weakened by the nonsimultaneous measurements of the TDD system and by the separate noise residing in different devices, as shown in Figure 1. Because the reciprocal components of the channel are mixed with variable nonreciprocal components in the environment, it is difficult to extract matching key pairs directly from the channel estimates. The reciprocity feature extraction step is therefore in charge of extracting the reciprocal feature from the original channel response.
(3) Quantization: the goal of this stage is to convert the channel measurements into a bit sequence. Depending on the wireless communication environment, different quantization levels should be specified, resulting in a trade-off between the key generation rate and the key error rate [6].
(4) Information reconciliation: the initially generated keys are not all exactly the same. Reconciliation is used to correct the mismatched bits in the key. The main methods include Cascade [33], error correcting code (ECC) [34], BCH code [35], and Golay code [36].
(5) Privacy amplification: during the information reconciliation stage, Alice and Bob transmit some information that eavesdroppers may hear. To ensure the security of the generated key, a hash function is generally used to convert the corrected initial key into fixed-length final keys that can be used directly in cryptographic techniques [37].
Reciprocity feature learning is the most essential phase in physical layer key generation, with a significant influence on the key error rate, the key generation rate, and the randomness of the generated key. An initial key with a lower key error rate eases the subsequent stages. Therefore, we build a neural network model capable of learning the reciprocal component from the channel response.

Reciprocity Learning Design.
The major focus of this research is to extract the reciprocal component from the channel response in order to generate highly consistent keys. We present the Channel Reciprocity Learning Net (CRLNet) to efficiently learn the reciprocal component of the original channel response. The design of CRLNet is based on formula (1), and its structure is displayed in Figure 2. To eliminate $N_u$ from the channel response as much as possible, we design a nonreciprocity learning module (NRLM), which consists of three hidden linear layers with 512, 256, and 256 neurons. Its input is the original channel response, and its output is the nonreciprocal component $N_u$. The multibranch autoencoder part is a symmetrical structure, which consists of two encoders (Encoder$_a$, Encoder$_b$) and a shared-weight decoder (Decoder). The whole CRLNet is composed of the multibranch autoencoder and two NRLMs.
During the training phase, the input of the neural network is the paired CSI $(H'_a, H'_b)$ recorded by Alice and Bob, and the output is $H_a$ and $H_b$, in which the nonreciprocal part is suppressed and which are highly correlated. The encoded primary features $Z_a$ of the channel response are learned by Encoder$_a$ from $H'_a$, whereas NRLM$_a$ is employed to isolate the nonreciprocal component. Encoder$_b$ and NRLM$_b$ perform the same operations on $H'_b$. The shared-weight decoder is used to reconstruct $H_a$ and $H_b$ from the strongly reciprocal encoded features $Z_a$ and $Z_b$. Finally, the desired reciprocal compressed features $Z_a$ and $Z_b$ can be used directly for quantization. We choose $Z_a$ and $Z_b$ rather than $H_a$ and $H_b$, which are also highly correlated, because the compressed features eliminate the redundancy between adjacent subcarriers that may reduce randomness [26].
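The data flow described above can be summarized in a few lines. This is a notational sketch only: it reuses the symbols from the text, while the shorthand $\mathrm{Enc}_u$, $\mathrm{Dec}$, and $\mathrm{NRLM}_u$ for the network modules is introduced here for illustration, and the dimension of $Z_u$ is not specified in the text.

```latex
\begin{aligned}
Z_u &= \mathrm{Enc}_u(H'_u), \\
N_u &= \mathrm{NRLM}_u(H'_u), \\
H_u &= \mathrm{Dec}(Z_u), \qquad u \in \{a, b\}, \\
\text{training targets:}\quad & Z_a \approx Z_b, \qquad H_u + N_u \approx H'_u .
\end{aligned}
```

The last line restates formula (1) as a training target: the decoder output plus the NRLM output should reproduce the measured CSI, while the two encoders are pushed toward the same compressed feature.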
We propose a hybrid loss function, which consists of three parts:
(1) We adopt the mean squared error loss between $Z_a$ and $Z_b$ to force the proposed neural network to learn the reciprocity between paired channel responses:
$$L_1 = \frac{1}{m}\sum_{i=1}^{m}\left\| Z_a^i - Z_b^i \right\|_2^2,$$
where $Z_a^i$ and $Z_b^i$ are the outputs of Encoder$_a$ and Encoder$_b$, respectively, and $m$ is the number of training samples.
(2) The second loss function is based on formula (1). It is introduced to minimize the difference between the channel response recovered by the decoder plus the nonreciprocal part extracted by the NRLM and the original input of Alice:
$$L_2 = \frac{1}{m}\sum_{i=1}^{m}\left\| H_a^i + N_a^i - H_a^{i\prime} \right\|_2^2,$$
where $H_a^i$ is the output of Decoder for $Z_a^i$, $N_a^i$ is the noise residing at Alice's side, and $H_a^{i\prime}$ is the imperfect CSI input from Alice. This loss restricts the difference between the reconstruction plus the nonreciprocal part extracted by the NRLM and the original input to be small enough to ensure that the encoder has really learned the main channel features.
(3) The third loss function is similar to the second, but it is for Bob's channel response:
$$L_3 = \frac{1}{m}\sum_{i=1}^{m}\left\| H_b^i + N_b^i - H_b^{i\prime} \right\|_2^2,$$
where $H_b^i$ is the output of Decoder for $Z_b^i$, $N_b^i$ is the noise residing at Bob's side, and $H_b^{i\prime}$ is the imperfect CSI input from Bob. The proposed final loss function combines $L_1$, $L_2$, and $L_3$.
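The three loss terms can be illustrated with a small, framework-free sketch. It is not the authors' training code: the function names are hypothetical, vectors are plain Python lists, and the final loss is assumed here to be the unweighted sum of the three terms, since any weighting is not stated in the text.

```python
from typing import Sequence

Vector = Sequence[float]


def squared_distance(x: Vector, y: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def reciprocity_loss(z_a: Sequence[Vector], z_b: Sequence[Vector]) -> float:
    """L1: mean squared distance between paired encoder outputs."""
    return sum(squared_distance(a, b) for a, b in zip(z_a, z_b)) / len(z_a)


def reconstruction_loss(h_rec: Sequence[Vector], n_est: Sequence[Vector],
                        h_in: Sequence[Vector]) -> float:
    """L2 or L3: decoder output plus NRLM output compared with the measured CSI."""
    total = 0.0
    for rec, noise, measured in zip(h_rec, n_est, h_in):
        restored = [r + n for r, n in zip(rec, noise)]  # formula (1): H + N
        total += squared_distance(restored, measured)
    return total / len(h_in)


def hybrid_loss(z_a, z_b, h_a, n_a, h_a_in, h_b, n_b, h_b_in) -> float:
    """Assumption: the final loss is the unweighted sum L1 + L2 + L3."""
    return (reciprocity_loss(z_a, z_b)
            + reconstruction_loss(h_a, n_a, h_a_in)
            + reconstruction_loss(h_b, n_b, h_b_in))
```

In a real training loop the same three terms would be computed on tensors so that gradients can flow back to the encoders, the decoder, and the NRLMs.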

Data Collection and Preprocessing of Dataset.
Our data is collected in a variety of scenarios, including mobile and static, and in both indoor and outdoor situations. Using two Lenovo X220 laptops equipped with the Intel WiFi Link 5300 wireless card, we acquired 100,000 pairs of CSI data in each scenario. The signals from two transmitting antennas and three receiving antennas are recorded and parsed into CSI values by the Linux 802.11n CSI Tool [38]. By utilizing antenna diversity [16,26], the estimated CSI has better randomness than that of a Single-Input Single-Output (SISO) system. Each packet's CSI value is a complex matrix with the shape 30 × 2 × 3, extracted from 30 subcarriers. Since neural networks cannot deal with complex numbers directly, the first step in preprocessing the datasets is to stack the real and imaginary parts of the CSI matrix $H'_u$, which can be defined as
$$H'_u \leftarrow \left[\mathrm{Real}(H'_u),\ \mathrm{Imag}(H'_u)\right],$$
where $\mathrm{Real}(\cdot)$ and $\mathrm{Imag}(\cdot)$ denote the real and imaginary parts of the CSI matrix. The stacked matrix is then flattened to a one-dimensional vector of size 360. Because each dimension of the raw input has a distinctive magnitude, we normalize the datasets so that their range is between 0 and 1. The normalization is as follows:
$$H'^{\,l}_{u,\mathrm{norm}} = \frac{H'^{\,l}_{u} - H'^{\,l}_{u,\min}}{H'^{\,l}_{u,\max} - H'^{\,l}_{u,\min}},$$
where $H'^{\,l}_{u,\max}$ and $H'^{\,l}_{u,\min}$ are the maximum and minimum values of the $l$th dimension of $H'_u$, $H'^{\,l}_{u}$ is the $l$th element of $H'_u$, and $H'^{\,l}_{u,\mathrm{norm}}$ is the normalized $l$th element of $H'_u$.
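A minimal sketch of the two preprocessing steps is given below, using plain Python lists. The function names are hypothetical, the ordering of the stacked vector (all real parts followed by all imaginary parts) is an assumption, and mapping a constant dimension to 0.0 is a choice made here to avoid division by zero; none of these details is specified in the text.

```python
from typing import List, Sequence


def flatten_csi(csi: Sequence[Sequence[Sequence[complex]]]) -> List[float]:
    """Stack real and imaginary parts of one CSI matrix into a 1-D vector.

    For a 30 x 2 x 3 complex matrix this yields 2 * 180 = 360 real values.
    Assumption: real parts come first, then imaginary parts.
    """
    flat = [value for subcarrier in csi for tx in subcarrier for value in tx]
    return [v.real for v in flat] + [v.imag for v in flat]


def min_max_normalize(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every dimension of the dataset to the range [0, 1]."""
    dims = len(samples[0])
    lows = [min(s[l] for s in samples) for l in range(dims)]
    highs = [max(s[l] for s in samples) for l in range(dims)]
    normalized = []
    for s in samples:
        row = []
        for l in range(dims):
            span = highs[l] - lows[l]
            # Assumption: a dimension with no variation is mapped to 0.0.
            row.append((s[l] - lows[l]) / span if span > 0 else 0.0)
        normalized.append(row)
    return normalized
```

The minimum and maximum would normally be computed on the training set and reused at key generation time, so that Alice and Bob apply the same scaling to new measurements.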

Secret Key Generation Scheme Based on Reciprocity Learning.
We design a complete key generation scheme based on our proposed reciprocity learning algorithm, which is shown in Figure 3.
In the training phase, we need to collect enough CSI data to serve as the training dataset of our model. In particular, Alice sends probing packets to Bob at a rate of 10 packets per second during the channel probing stage. When Bob receives a packet, he immediately replies with an ACK packet to Alice. We guarantee that the time interval between the reception of corresponding packets at the two ends is less than 10 ms, which is much less than the coherence time, so the channel can be regarded as constant within this interval. This process is repeated until we collect enough channel responses. After preprocessing the data according to the above method, we randomly shuffle the dataset, selecting 80% of it for training the proposed deep learning model and the remaining 20% for testing. The PyTorch framework is used to implement the proposed neural network, which is trained for 50 epochs using the Adam algorithm. The batch size is set to 128 and the learning rate is set to 1e-3.
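The shuffle-and-split step can be sketched with the standard library alone. The function name and the fixed seed are illustrative assumptions, and each element of `pairs` stands for one preprocessed (Alice, Bob) CSI pair.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def shuffle_split(pairs: Sequence[T], train_fraction: float = 0.8,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Randomly shuffle paired samples and split them into train and test sets."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    items = list(pairs)                 # copy so the caller's data is untouched
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    cut = round(len(items) * train_fraction)
    return items[:cut], items[cut:]
```

Shuffling whole pairs, rather than Alice's and Bob's measurements separately, keeps each pair of reciprocal measurements together.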
In the key generation phase, we fix the network parameters of the model and then deploy Encoder$_a$ on Alice and Encoder$_b$ on Bob. The Decoder and the NRLMs are only used during training, to ensure that the model can suppress nonreciprocal noise and to help the encoders learn the main features of the channel response. Their overhead does not exist during the operational phase.
We can intuitively observe whether CRLNet has extracted the reciprocal feature of the channel. The dispersion of channel characteristics across the 30 subcarriers is illustrated in Figure 4. Figure 5 shows the channel features processed by CRLNet, which have a high degree of reciprocity.
The collected channel features are converted to a binary key using the uniform quantization approach [2]. Each bit of the key sequence $Q$ can be calculated as
$$Q_i = \begin{cases} 1, & x_i > q_+, \\ 0, & x_i < q_-, \\ -1, & \text{otherwise}, \end{cases}$$
where $Q_i$ is the $i$th bit of the generated key and $x_i$ is the $i$th element of the obtained reciprocal feature. $q_+$ and $q_-$ are defined as
$$q_+ = F^{-1}(0.5 + \varepsilon), \qquad q_- = F^{-1}(0.5 - \varepsilon),$$
where $F^{-1}$ is the inverse of the cumulative distribution function (CDF) of the Gaussian distribution $N(\mu, \sigma^2)$, and $\mu$ and $\sigma$ are the mean and standard deviation of the produced feature. $\varepsilon$ is a quantization factor chosen according to the environment. The values that are quantized to $-1$ are deleted from the initial keys of both Alice and Bob.
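The quantizer can be sketched with the standard library's `statistics.NormalDist`. This is an illustration under stated assumptions: the thresholds are taken to be the Gaussian quantiles at 0.5 ± ε, the population standard deviation is used for σ, and the hypothetical helper `drop_guard_band` simply assumes that both sides know which positions fell into the guard band; how Alice and Bob agree on those positions is not described in the text.

```python
from statistics import NormalDist, mean, pstdev
from typing import List, Sequence, Tuple


def quantize(features: Sequence[float], epsilon: float) -> List[int]:
    """Map each feature to 1, 0, or -1 (guard band) using Gaussian quantiles."""
    if not 0.0 <= epsilon < 0.5:
        raise ValueError("epsilon must be in [0, 0.5)")
    # Fit N(mu, sigma^2) to the features; sigma must be positive for inv_cdf.
    dist = NormalDist(mean(features), pstdev(features))
    q_plus = dist.inv_cdf(0.5 + epsilon)
    q_minus = dist.inv_cdf(0.5 - epsilon)
    bits = []
    for x in features:
        if x > q_plus:
            bits.append(1)
        elif x < q_minus:
            bits.append(0)
        else:
            bits.append(-1)  # inside the guard band, discarded later
    return bits


def drop_guard_band(bits_a: Sequence[int],
                    bits_b: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Keep only positions where neither side quantized to -1 (assumed shared)."""
    kept = [(a, b) for a, b in zip(bits_a, bits_b) if a != -1 and b != -1]
    return [a for a, _ in kept], [b for _, b in kept]
```

A larger ε widens the guard band, so more positions are discarded: fewer bits survive per packet, but the surviving bits are less likely to disagree.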
For information reconciliation, we can adopt Cascade [33] or BCH code [35] protocol. With regard to privacy amplification, hash functions [37] can be used to convert the key to a fixed length secret key, which can be used for encryption directly. However, only the performance of the initial secret key is evaluated without information reconciliation and privacy amplification to ensure that the comparison is fair.
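Privacy amplification can be illustrated with a standard hash function. The text only states that hash functions [37] can be used; the choice of SHA-256, the function name, and the encoding of the bit sequence as an ASCII string are assumptions made for this sketch.

```python
import hashlib
from typing import Sequence


def privacy_amplify(bits: Sequence[int]) -> bytes:
    """Hash a reconciled bit sequence into a fixed-length 256-bit key."""
    if any(b not in (0, 1) for b in bits):
        raise ValueError("bits must contain only 0 and 1")
    # Assumption: bits are serialized as an ASCII string of '0' and '1'.
    bit_string = "".join(str(b) for b in bits)
    return hashlib.sha256(bit_string.encode("ascii")).digest()
```

Both sides obtain the same 32-byte key only if their reconciled bit sequences are identical, which is why reconciliation must precede this step.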

Performance of Deep Learning Model.
We compare the performance of CRLNet with the following three benchmark models in four diverse environments (indoor static, indoor mobile, outdoor static, and outdoor mobile):
(1) AE [28]: a normal single-branch autoencoder model, whose encoder and decoder parts have the same structure as those of the CRLNet.
(2) FNN [31]: a model for UL/DL channel calibration in generic massive MIMO systems. It is a multilayer perceptron with three hidden layers.
(3) KGNet [32]: a model proposed to learn the band feature mapping function for key generation in FDD systems.
Because these comparison models have only a single input and a single output, they function as a mapping between the CSI of Alice and the CSI of Bob. The network's input is the CSI of Alice, and the mean square error is used to minimize the difference between the network output and the CSI of Bob. For these benchmark models, the network's input and output are thus treated as a pair of reciprocal features. The mean square error (MSE) between the reciprocal features $Z_a$ and $Z_b$ is used to compare the performance of the neural networks, which is described as
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left\| Z_a^i - Z_b^i \right\|_2^2,$$
where $n$ is the number of evaluated sample pairs. The MSE indicates the ability of the model to learn the channel reciprocity. Figure 6 shows the MSE comparison results in all testing scenarios. We observe that our model performs better than all the benchmark models in these scenarios. In particular, AE performs worse than CRLNet, which indicates that the NRLM we designed has learned to eliminate the nonreciprocal part from the channel response.
To demonstrate the effectiveness of the proposed hybrid loss function, we compare the performance of the model trained with and without the $L_1$ loss function. The reconstruction losses $L_2$ and $L_3$ are always kept, because they ensure that the model has really learned the main features of the channel response. As shown in Figure 7, the CRLNet with the $L_1$ loss function performs excellently, while the model without the $L_1$ loss function fails to achieve an acceptable MSE. The comparison results verify the irreplaceable role of $L_1$ in learning channel reciprocity.
The task of the NRLM is to separate the nonreciprocal part from the input channel response. To study the influence of different NRLM designs on the resulting reciprocal feature extraction performance, the following experiment was conducted. As shown in Figure 8, we compared the original CRLNet and three modified CRLNets equipped with distinct NRLM modules in terms of MSE. NRLM 1 is the NRLM proposed above, NRLM 2 adds an extra hidden layer to the original NRLM, NRLM 3 increases the number of neurons in each hidden layer of the original NRLM, and NRLM 4 is composed of three 1D convolutional layers. The specific parameters of these modules are given in Table 1. We can see that increasing the depth or width of the hidden layers does not bring a meaningful improvement. Replacing the fully connected layers with 1D convolutional layers yields a lower MSE, but it significantly increases the overhead in the training phase. Therefore, the original NRLM offers a good tradeoff between performance and computational cost.

Performance of Key Generation.
We employ different models in the reciprocity feature extraction step to compare the performance of key generation; the other steps remain the same. We use the following metrics to verify the effectiveness of the initial generated key:
(1) Key Error Rate (KER): the number of mismatched bits in the initial keys generated by the two devices divided by the total number of generated bits.
(2) Key Generation Rate (KGR): the number of bits generated per probing packet.
(3) Randomness: the standard NIST test suite [39] is used to measure the randomness of the initial key.
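The first two metrics follow directly from their definitions and can be computed as below. The function names are hypothetical, and keys are represented as sequences of 0/1 integers.

```python
from typing import Sequence


def key_error_rate(key_a: Sequence[int], key_b: Sequence[int]) -> float:
    """KER: fraction of positions where the two initial keys disagree."""
    if len(key_a) != len(key_b) or len(key_a) == 0:
        raise ValueError("keys must be non-empty and of equal length")
    mismatches = sum(1 for a, b in zip(key_a, key_b) if a != b)
    return mismatches / len(key_a)


def key_generation_rate(total_bits: int, num_packets: int) -> float:
    """KGR: average number of key bits produced per probing packet."""
    if num_packets <= 0:
        raise ValueError("num_packets must be positive")
    return total_bits / num_packets
```

For example, two 4-bit keys that differ in one position give a KER of 0.25, and 300 key bits obtained from 10 probing packets give a KGR of 30 bits per packet.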
We first compare the average KER of the initial keys produced by the four deep learning models on the testing dataset in different environments. As shown in Figure 9, CRLNet outperforms the benchmark models in terms of KER. Whereas the KER of the other models is too high for matched key establishment in practice, CRLNet achieves a suitably low KER in all test cases.
Our model and the benchmark models produce reciprocal features of different sizes; therefore the KGR cannot be compared directly. For a fair comparison, we first use the PCA method to reduce the output features of the three comparison models to the same dimension as that of the CRLNet and then quantize the features into the initial keys. Figure 10 indicates that CRLNet has the highest KGR under all experimental scenarios. In Figures 11 and 12, we compare the KER and KGR for quantization factors ε of 0.01, 0.05, 0.1, and 0.15 in all scenarios. The results demonstrate that as the quantization factor increases, both the KER and the KGR decrease, implying a performance tradeoff. To acquire the best performance in the real world, we can adjust the quantization factor based on the signal-to-noise ratio (SNR) of the environment.
To verify that the generated keys are sufficiently random, we perform the NIST statistical tests on the testing datasets. Table 2 shows the test results in the four environments with a quantization factor of 0.01. All cases pass the tests, with p-values much larger than 0.01, which is the threshold for passing.
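To make the pass criterion concrete, the sketch below implements only the frequency (monobit) test, the first test of the NIST suite; the full suite contains many more tests, and the function name is hypothetical. A sequence passes this test when the returned p-value is at least 0.01.

```python
import math
from typing import Sequence


def monobit_p_value(bits: Sequence[int]) -> float:
    """Frequency (monobit) test: p-value for the balance of ones and zeros."""
    n = len(bits)
    if n == 0:
        raise ValueError("bit sequence must not be empty")
    # Map 1 -> +1 and 0 -> -1, then sum.
    s_n = sum(1 if b == 1 else -1 for b in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))
```

A perfectly balanced sequence yields a p-value of 1.0, while a heavily biased sequence yields a p-value close to 0 and fails.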

Impact of the SNR.
To evaluate the generalization performance of the proposed key generation method in complex scenarios, we added artificial Gaussian white noise so that the SNR varied from 0 dB to 20 dB. We compared our method with the three benchmark methods [28,31,32] in terms of KER versus SNR. As shown in Figure 13, the CRLNet based key generation scheme outperforms its counterparts in terms of KER. This demonstrates the benefit of our method: it is independent of the channel's statistical properties and can generate a consistent key based on the training dataset in an adaptable and effective manner.
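Adding white Gaussian noise at a target SNR can be sketched as follows. The function name is hypothetical, and the sketch assumes that the noise is added to a real-valued feature vector and that the signal power is estimated as the mean square of that vector; the text does not specify at which stage the noise was injected.

```python
import math
import random
from typing import List, Sequence


def add_awgn(signal: Sequence[float], snr_db: float,
             rng: random.Random) -> List[float]:
    """Add zero-mean Gaussian noise so that the result has the requested SNR."""
    if len(signal) == 0:
        raise ValueError("signal must not be empty")
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR (dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]
```

Sweeping `snr_db` from 0 to 20 and recomputing the KER at each value reproduces the kind of curve described above.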

Computational Complexity.
The difference in computational complexity between our method and existing physical layer key generation methods lies in the reciprocity feature extraction stage. In our method, there is no information sharing between the legitimate nodes during this stage. In addition, our feature extraction requires only a single encoder of the CRLNet, which keeps the overhead low.
In Figure 14, we compare the average execution time of the feature extraction stage for a single packet between the proposed method and the above benchmark methods. The hardware environment is as follows: AMD Ryzen 5600X CPU, 16 GB RAM, and the Windows 10 Home 64-bit operating system. For each method, we process 20,000 packets and calculate the average execution time per packet. This process is repeated 20 times, and the average value is taken as the final result. As can be seen, our proposed method has the shortest execution time because it relies only on the lightweight encoder network module to extract reciprocal features.
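The measurement procedure can be sketched as a small timing harness. The function name is hypothetical, and `extract` stands for any feature extraction callable (for instance, a trained encoder); the use of `time.perf_counter` is an assumption about how the timing could be done.

```python
import time
from typing import Callable, Sequence


def average_time_per_packet(extract: Callable[[Sequence[float]], object],
                            packets: Sequence[Sequence[float]],
                            repeats: int = 20) -> float:
    """Average per-packet execution time of `extract`, averaged over `repeats` runs."""
    if len(packets) == 0 or repeats <= 0:
        raise ValueError("need at least one packet and one repeat")
    per_run = []
    for _ in range(repeats):
        start = time.perf_counter()
        for packet in packets:
            extract(packet)
        elapsed = time.perf_counter() - start
        per_run.append(elapsed / len(packets))  # seconds per packet in this run
    return sum(per_run) / len(per_run)
```

With 20,000 packets and 20 repeats, this matches the averaging described above: a per-packet mean within each run, then a mean over runs.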

Conclusions
In this paper, we propose a physical layer key generation scheme that can generate consistent keys from imperfect wireless channels by designing deep neural networks to extract channel reciprocity features. The developed CRLNet can efficiently learn the reciprocal component of channel state information (CSI) in TDD OFDM systems. Based on the CRLNet, we design a complete key generation scheme that performs excellently on commercial WiFi devices. Extensive experiments are conducted under static and mobile environments in both indoor and outdoor scenarios. The results confirm that CRLNet can extract reciprocal channel features more efficiently than the benchmark neural networks in terms of MSE. Furthermore, the CRLNet-based key generation scheme achieves a higher KGR, a lower KER, and sufficient randomness compared with the existing methods in all testing scenarios.
Data Availability
The experimental CSI data used to support the findings of this study have been deposited in the GitHub repository (https://github.com/hehaoyulkeke/csi-data).