A Lazy Learning-Based Self-Interference Cancellation Approach for In-Band Full-Duplex Wireless Communication Systems

We propose a new lazy learning-based cancellation approach to improve spectral efficiency for current wireless communication systems, suppress self-interference (SI) sent from base stations, and enable in-band full-duplex (IBFD) transmissions in cellular networks. Our proposed approach consists of two phases based on traditional IBFD systems: an offline phase for database generation and an online phase for data transmission. In the offline phase, the output before a 0/1 decision is premeasured without the desired signal input and recorded in a database with self-defined feature vectors (FVs). In the online phase, a suitable result is sought from the generated database with the help of a learning method and FV for the same system architecture with the desired signal input. The result is then assigned as the SI cancellation value. Regular and eager learning-based cancellation approaches are employed to evaluate the proposed method and simulate the transmission output. Computer simulation results indicated that the proposed cancellation methods could achieve about 134 dB SI suppression and achieve nearly the same transmission levels as methods with no SI effect, enabling the IBFD operations in wireless communication systems better than the regular and eager learning-based techniques.


Introduction
Wireless communication traffic has been rapidly increasing with the prevalence of smartphone applications and Internet of Things devices. Development of new spectrum resources and improvement of spectral efficiency (SE) to provide proliferating wireless traffic for future mobile communication systems beyond the 5th generation require time, effort, and money [1,2].
This study concentrates on a potentially disruptive technology called in-band full-duplex (IBFD) to increase the SE in wireless communication systems. IBFD systems can double the SE under ideal conditions compared to traditional time-division duplex (TDD) and frequency-division duplex (FDD) systems. In TDD systems, transmission and reception take place at separate times over a common frequency band (FB); conversely, in FDD systems, they take place simultaneously over separate FBs. IBFD systems instead adopt simultaneous transmission and reception over the same FB. However, the quality of the desired signal seriously deteriorates because of the self-interference (SI) effect caused by employing the same FB for transmitting and receiving. Thus, without effective SI suppression, the SE either improves only slightly or becomes worse [3].
Currently, various approaches have been proposed in the existing works to suppress SI and enhance SE, such as antenna, analog, and digital cancellation [4][5][6][7][8][9][10][11]. Generally, antenna cancellation is aimed at increasing the isolation between transmission and reception [5]. Analog cancellation is used to suppress the SI power by combining a reference SI signal in which the phase and amplitude are adjusted [6,7,11]. However, because of several physical constraints upon antenna design and inaccuracies in obtaining SI signals in analog circuits, residual SI remains powerful in the desired signal [12]. Therefore, digital cancellation (DC) is introduced to construct a replica of the SI and subtract it from the received composite signals [9].
The desired portion of the received composite signal can be successfully demodulated if, after the SI canceller, its power is tens of decibels larger than the SI power. Unfortunately, in most common wireless communication systems, the power of the desired signal component is not much greater than the system noise power, and the noise power is difficult to suppress. Therefore, the biggest challenge in designing the SI canceller is to reduce the SI power to the system noise level so that the SE can be doubled. For instance, suppose that the considered IBFD operation is used in a common base station (BS) with an equivalent isotropically radiated power (EIRP) of more than 40 dBm and an average noise power of less than -100 dBm at the receiver. The combined effect of the antenna, analog, and digital cancellations then needs to reach 140 dB or more to suppress the SI power down to the noise level [12]. In fact, a cross-polarization technique yielding antenna cancellation of 50 dB was presented in [13], and a combined analog and DC of 60 dB was proposed in [14]. These results show that a combined suppression of about 110 dB for SI power can be achieved in experimental environments. Unfortunately, probably because of hardware restrictions such as the maximum device output power and the noise floor, to the best of our knowledge, few results have been reported for SI cancellation approaches with suppression capabilities over 110 dB.
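The 140 dB target follows from simple link-budget arithmetic. The minimal Python sketch below reproduces it with the figures quoted in the text (40 dBm EIRP, -100 dBm noise floor, 50 dB antenna cancellation from [13], and 60 dB combined analog and digital cancellation from [14]). The function names are illustrative only.

```python
def required_cancellation_db(eirp_dbm, noise_floor_dbm):
    """Total SI suppression (dB) needed to push the SI down to the noise floor."""
    return eirp_dbm - noise_floor_dbm


def residual_si_dbm(eirp_dbm, stage_cancellations_db):
    """SI power (dBm) left after applying each cancellation stage in turn."""
    return eirp_dbm - sum(stage_cancellations_db)


EIRP_DBM = 40.0
NOISE_FLOOR_DBM = -100.0

required = required_cancellation_db(EIRP_DBM, NOISE_FLOOR_DBM)
residual = residual_si_dbm(EIRP_DBM, [50.0, 60.0])  # antenna, analog + digital
shortfall = residual - NOISE_FLOOR_DBM              # dB still above the noise floor

print(f"required: {required} dB, residual SI: {residual} dBm, shortfall: {shortfall} dB")
```

With these numbers, the 110 dB reported experimentally leaves the residual SI 30 dB above the noise floor, which is the gap the remainder of the paper addresses.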
Furthermore, most previous works such as [6,7,10,15,16] using analog and digital cancellers for IBFD systems are based on regular methods. These methods estimate the SI signal effects experienced in the time and frequency domains, including nonlinearity in the transmitter power amplifier (PA) and low noise amplifier (LNA), wireless channel propagation, and circuit delay. However, since the power gap between the SI and received signals is considerable, the nonlinearity of the PA and LNA, connector return and insertion losses, and other unknown losses are always present in practice. Therefore, a small error in the estimation process may cause large residual SI, and strict measurements are required in the whole system, even though that is challenging to ensure. Consequently, a tiny fracture grows into a chasm.
Several researchers, such as those in [7][17][18][19][20][21][22], inspired by big data and machine learning (two of the most popular and useful technologies [23][24][25]), have tried to cancel the effects of SI by (including but not limited to) predicting SI signals or estimating channel state information (CSI) to reconstruct SI waveforms with the help of deep learning, which is representative of eager learning (EL) in machine learning. For example, the work in [18] proposed a real-time nonlinear SI cancellation solution using deep learning to realize IBFD wireless communication. In this solution, the SI channel is modeled by a deep neural network (NN), and the NN is trained to cancel SI at the wireless node. The results from their software-defined radio (SDR)-based testbed showed a performance of 17 dB in DC and yielded an average bit error rate (BER) of 8.5% over many scenarios and different modulation schemes. Similarly, to enable IBFD transmissions, the authors in [19] proposed a nonlinear DC approach based on support vector regression, which is one of the algorithms for solving regression problems in machine learning. Their tests were also performed on an SDR-based platform and indicated that their proposal can provide more than 30 dB of digital suppression for transmit power levels higher than 20 dBm.
Recently, a joint detection and nonlinear SI cancellation approach using EL with an NN was proposed in [20]. In this work, the EL method is used to derive a function between the output of the desired binary data and the received signal; thus, the desired signal can be directly demodulated in the presence of SI. Although several questions remain to be addressed in future work, the preliminary experimental results showed that EL techniques can perform better than conventional SI cancellation techniques. Moreover, a hybrid beamforming design for IBFD millimeter wave systems using an EL-based scheme was proposed in [21]. In this work, two frameworks based on the extreme learning machine and a convolutional NN were presented to design hybrid beamformers and further achieve SI cancellation. Their results showed that both learning-based schemes can provide more robust performance, improve spectral efficiency, and decrease computation time. In addition, an alternative application of EL to IBFD systems was proposed in [22]. The authors used EL with an NN to accelerate the tuning of multitap adaptive radio frequency (RF) cancellers by training the weights of the in-phase and quadrature channels in adaptive cancellers. The results illustrated a fast convergence speed when using EL in RF cancellers and thereby enabled IBFD operation in dynamic interference environments.
Generally, in many cases, an eager learner abstracts away from the data during training and uses the trained model (abstraction) to make predictions. The most important benefit of using EL is solving the SI problem; time-insensitive parameters (e.g., PA and LNA nonlinearity as well as antenna return loss) in the entire system can be included in the trained model and do not have to be estimated. However, prediction errors caused by the trained model and estimated CSI for the SI calculation always exist and cannot be avoided because the model training is an essential operation [26]. Thus, errors can unexpectedly be amplified by powerful SI signals similar to the regular method described above.
The previous discussion suggests that estimating or reconstructing SI signal waveforms for cancellation is unnecessary. Instead, all practical effects of the SI on the desired signal must be quantified and tagged with feature vectors (FVs), so that these quantified values can be precisely retrieved by using the FVs when required [27]. This is the key difference from the model generation-based learning methods such as [17][18][19][20][21][22]. More specifically, the effects of the SI on wireless transmissions can be quantified by constellation values at the 0/1 decision in a signal demodulator; an FV can be defined as a set constructed with the estimated CSI between the interested user, antennas, and transmitted symbol from the BS. Contrary to EL, the lazy learning (LL) method [27,28] does not require model training; thus, no prediction error is introduced in the related operations for SI cancellation, and no such error can be amplified by the powerful SI signals. Therefore, we propose an LL-based cancellation method to solve the SI problem.
We separately perform an offline phase for database generation and an online phase for data transmission. In the offline phase, the output before a 0/1 decision is premeasured without the desired signal input and recorded to a database with self-defined FVs. In the online phase, a suitable result is sought from the generated database with the help of a learning method and FV for the same system architecture with the desired signal input. The result is then assigned as the SI cancellation value. In other studies such as [22], the demodulation of composite signal, including desired signal, thermal noise, and potential residual SI, is independently performed after all of the cancellation processes are done (even the residual SI is still powerful). In our proposal, desired signals are directly demodulated in the presence of SI and thermal noise. We further provide system-level performance such as BER to evaluate the proposed approach in the IBFD systems, contrary to existing works [4][5][6][7][8][9][10][12][13][14][15][16] where most studies only focused on the SI suppression capability.
The remainder of this paper is organized as follows. In Section 2, we describe the system architecture under consideration and formulate the problem. In Section 3, we explain three kinds of DC approaches in greater detail: the proposed technique, regular, and an EL-based method. Details regarding the employed channel models are provided in Section 4. In Section 5, we present and analyze simulated results and summarize the key findings. Discussion and concluding remarks are presented in Section 6 and Section 7, respectively.

System Model and Problem Formulation
2.1. System Model. We consider a unidirectional IBFD system in which an interested user equipped with a single antenna sends desired signals to a BS receiving antenna from a distance of $d_{\mathrm{U}}$, while the BS simultaneously sends signals to other users via the same FB. The BS is equipped with a pair of transmitting and receiving antennas with an interantenna distance of $d_{\mathrm{BS}}$, and the latter is used to receive a composite signal sent by the BS transmitting antenna (i.e., SI) and the interested user. Analog and digital cancellers are commonly employed in the IBFD BS structure to suppress SI signals, and references from the analog and digital transmitter circuits support the cancellation processes. An illustration of the considered unidirectional IBFD system is shown in Figure 1.
In this study, we design a valid frame structure for transmitted signals, including the SI and desired signal, to evaluate the considered systems using the proposed cancellation approach. The structural design includes downlink and uplink aspects used for the BS and an interested user, respectively. The downlink and uplink transmissions are performed with $N_{\mathrm{frame}}$ frames indexed by $i$, and there are $N_{\mathrm{symbol}}$ modulated symbols indexed by $j$ in the data part of each frame. We assume that frame synchronization at the receive side is perfect and that pilot signals with $N_{\mathrm{pilot}}$ symbols are placed at the head of the frame to obtain CSI at an arbitrary receiving terminal. An illustration of the frame structure is given in Figure 2.

2.2. Problem Formulation.
In this study, the considered IBFD system employs antenna, analog, and digital cancellations (DCs). An IBFD BS structure is shown in Figure 3. For the $j$th data symbol in the $i$th frame, the received waveform at the BS, which consists of the desired signal, the SI signal, and noise, can be expressed as where $G_{\mathrm{ant,rx}}$ is the antenna gain of the BS receive antenna. $h_{\Psi}(i)$ for $\Psi = \mathrm{U}$ and $\Psi = \mathrm{BS}$ are defined as the channel gain coefficient of the user-BS link and the coefficient of the BS's interantenna link corresponding to the $i$th frame, respectively. We further use $\rho_{\Psi}(i)$ and $\phi_{\Psi}(i)$ to represent the amplitude coefficient and phase shift of the complex variable $h_{\Psi}(i)$ so that $h_{\Psi}(i) = \rho_{\Psi}(i) e^{j\phi_{\Psi}(i)}$. The length of frames is limited by the channel coherence time, and wireless channels are considered static during this coherence time [29]. Thus, effects due to the channel attenuation on all of the symbols in the same frame are identical and can be estimated using pilot signals. $w_{\mathrm{rx}}(i,j)$ is the complex additive white Gaussian noise (AWGN) at the receive antenna of the BS with an average power of $\Omega_{\mathrm{rx}}$. $\hat{C}_{\mathrm{ant}}$ represents the antenna cancellation without the path loss effect between the BS antennas and is further expressed as where $L_{\mathrm{BS}}$ represents the path loss between the antennas at the BS and $C_{\mathrm{ant}}$ denotes the employed antenna cancellation, which can suppress the power of the SI signals.
$x_{\Psi}(i,j)$ for $\Psi = \mathrm{U}$ or $\Psi = \mathrm{BS}$ is the complex signal output from the transmit antenna of the user or BS and can be expressed as where $G_{\mathrm{ant,tx},\Psi}$ denotes the antenna gain for the user or BS. $Q_{\mathrm{PA},\Psi}(z)$ is the PA function for the user or BS with third-order intermodulation distortion. According to [30], $Q_{\mathrm{PA},\Psi}(z)$ can be expressed as where $G_{\mathrm{PA},\Psi}$ denotes the PA gain of the user or BS, while $O_{3,\Psi}$ is the third-order intercept point (OIP3) of the user or BS PA.
The terms $w_{\mathrm{tx},\Psi}(i,j)$ are the complex AWGN in the user or BS modulator with an average power of $\Omega_{\mathrm{tx},\Psi}$, while $w_{\mathrm{PA},\Psi}(i,j)$ denotes the additional output noise caused by the PA of the user or BS with an average power of $\Omega_{\mathrm{PA},\Psi}$ that can be easily calculated by where $F_{\Psi}$ is the noise factor of the user or BS PA [31]. The terms $m_{\Psi}(i,j)$ in (4) represent the complex output signal from the user or BS modulator. For a common modulation scheme, $m_{\Psi}(i,j)$ can be written as $m_{\Psi}(i,j) = A_{\Psi}(i,j) e^{j\theta_{\Psi}(i,j)}$, where $A_{\Psi}(i,j) = \sqrt{2 E_{\mathrm{symbol}}(i,j) f_{\mathrm{symbol}}}$ and $f_{\mathrm{symbol}}$ is the symbol rate of the considered modulator, while $E_{\mathrm{symbol}}(i,j)$ denotes the energy of the $j$th symbol in the $i$th frame. $\theta_{\Psi}(i,j)$ are the corresponding phase angles after symbol mapping of the binary data that the user or BS needs to transmit. Referring to Figure 3 with switches ON, after the analog cancellation $C_{\mathrm{ana}}$ of the SI signals in the received waveform $y(i,j)$, the CSI of the user-BS link and the CSI between the BS antennas are estimated as $\hat{h}_{\Psi}(i)$ for $\Psi = \mathrm{U}$ and $\Psi = \mathrm{BS}$, respectively; then, a search algorithm for DC is run, although it is not needed in common wireless communication systems without IBFD operation. Thereafter, a signal detection method $D(z)$ is applied in the detector.
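As an illustration of the modulator output described above, the sketch below assumes the common polar form $m = A e^{j\theta}$ with $A = \sqrt{2 E_{\mathrm{symbol}} f_{\mathrm{symbol}}}$. The Gray-coded QPSK phase table with a $\pi/4$ offset and all function names are assumptions made for this example, not details taken from the paper.

```python
import cmath
import math

# Hypothetical Gray-coded QPSK phase mapping (pi/4 offset); an assumption for illustration.
QPSK_PHASES = {
    (0, 0): math.pi / 4,
    (0, 1): 3 * math.pi / 4,
    (1, 1): 5 * math.pi / 4,
    (1, 0): 7 * math.pi / 4,
}


def symbol_amplitude(e_symbol, f_symbol):
    """Amplitude A = sqrt(2 * E_symbol * f_symbol) of one modulated symbol."""
    return math.sqrt(2.0 * e_symbol * f_symbol)


def modulate(bits, e_symbol, f_symbol):
    """Map an even-length bit sequence to complex symbols m = A * exp(j * theta)."""
    if len(bits) % 2 != 0:
        raise ValueError("QPSK needs an even number of bits")
    amplitude = symbol_amplitude(e_symbol, f_symbol)
    return [
        cmath.rect(amplitude, QPSK_PHASES[(bits[k], bits[k + 1])])
        for k in range(0, len(bits), 2)
    ]


# Example: 1 microjoule per symbol at 1 Msymbol/s gives A close to sqrt(2).
symbols = modulate([0, 0, 1, 1], e_symbol=1e-6, f_symbol=1e6)
```

The two example symbols lie on opposite sides of the constellation, since the bit pairs (0, 0) and (1, 1) are mapped to phases that differ by $\pi$.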
The output of the detector as well as the input of demodulator is written as

Wireless Communications and Mobile Computing
After a series of demodulation operations using the DC $C_{\mathrm{dig}}$, which takes the form of a complex number, the inputs of the 0/1 decision can be expressed as where the real and imaginary parts of $y_{\mathrm{dec}}(i,j)$, i.e., $\Re(y_{\mathrm{dec}}(i,j))$ and $\Im(y_{\mathrm{dec}}(i,j))$, are used for the I and Q channels in the 0/1 decision block, respectively. $\alpha$ is an FV constructed from input parameters for DC and is explained in the following section. Substituting (9) and the related equations given in this section into (10), we rewrite (10) after the necessary mathematical simplification as where $S(i,j)$ and $I(i,j)$ represent the complex constellation values of the desired and SI signals, respectively, and can be calculated by

Figure 3: An IBFD BS structure with switches OFF for the offline phase and ON for the online phase. The offline phase is used for an offline-generated database in which constellation values of SI have been previously measured and recorded. The online phase is used for data transmission in which the desired signals are separated by subtracting the recorded SI from the received composite signals.

In the current study, a simple matched filter (MF) detection method [32] is used for $D(z)$ and can roughly be expressed as For the most common wireless communication systems, $\lambda$ is assigned by $\hat{h}_{\mathrm{U}}(i)$; thus, the desired signals can be detected because $\hat{h}_{\mathrm{U}}(i)$ closely approximates $h_{\mathrm{U}}(i)$ [32,33]. The complex value $W(i,j)$ caused by various noises can be further expanded as where $W_{\mathrm{tx,U}}(i,j)$, $W_{\mathrm{PA,U}}(i,j)$, $W_{\mathrm{tx,BS}}(i,j)$, $W_{\mathrm{PA,BS}}(i,j)$, and $W_{\mathrm{rx}}(i,j)$ are defined as the complex values of the noise generated by the user's modulator and PA, the BS's modulator and PA, and the receive antenna at the BS, respectively, and can be easily calculated by respectively. To recover the binary data sent by the user, the input values of the 0/1 decision, i.e., $y_{\mathrm{dec}}(i,j)$, should ideally be determined purely by the desired part $S(i,j)$. Unfortunately, in IBFD systems, because the SI signal at the BS's receive antenna is far stronger than the desired signal, $y_{\mathrm{dec}}(i,j)$ in (11) is dominated by $I(i,j)$. For this reason, we propose the DC $C_{\mathrm{dig}}$ to suppress the influence of SI signals. Denote the residual SI by $\hat{I}(i,j)$; naturally, our target is to make $\hat{I}(i,j)$ as close to zero as possible for arbitrary frame $i$ and symbol $j$.

Digital Cancellation Approaches
In this study, we propose an LL-based method to set the DC and thereby enable IBFD operation in the considered wireless communication systems. Several existing works such as [7,16] have studied SI suppression with the help of deep learning, which is a typical representative of EL in machine learning. Other works such as [6] employed a regular method in which the SI signal is approximately calculated using estimated CSI. In this section, we explain these DC approaches in more detail.

3.1. Proposed LL-Based Cancellation Approach.
Generally, LL is a learning method where generalization of the trained model is not necessary, and searching of the needed data is delayed until a query is made to the system, as opposed to in EL, where the system tries to generalize the model using training data before receiving queries [28]. The core of the proposed LL-based cancellation approach in our study is constructed by (1) an offline-generated database in which the constellation values of SI at the input of 0/1 decision block are premeasured and recorded and (2) an online data transmission in which desired signals are separated by subtracting the recorded SI from the received composite signals. The details are provided as follows.
3.1.1. Offline Database Generation. We first define a database $B$ which consists of an input space $B_{\mathrm{in}}$ and an output space $B_{\mathrm{out}}$. According to (13), SI is affected by $\lambda$, and for the considered MF detection method, $\lambda$ is related to the user channel coefficient; therefore, in the input space $B_{\mathrm{in}}$, we define three independent parameters $A_{\mathrm{BS}}$, $\theta_{\mathrm{BS}}$, and $h_{\mathrm{U}}$ that form an FV. The first and second parameters, $A_{\mathrm{BS}}$ and $\theta_{\mathrm{BS}}$, are assigned the amplitude and phase angle used in the BS's modulation and take discrete values determined by the modulation method. The last parameter, $h_{\mathrm{U}}$, is assigned the channel coefficient between the interested user and the BS; in theory, its absolute value and phase can range from 0 to $\infty$ and from 0 to $2\pi$, respectively. The values in the input space $B_{\mathrm{in}}$ are designed in advance to cover as many combinations of these parameters as possible under an acceptable computational cost.
To reduce the database size for our learning systems, in the output space $B_{\mathrm{out}}$, we define two parameters, $h_{\mathrm{BS}}$ and $I$. $h_{\mathrm{BS}}$ records the estimated values of the channel coefficient $h_{\mathrm{BS}}$ and can be considered a tag of $I$ when we need to locate $I$ in $B_{\mathrm{out}}$. $I$ records the measured constellation values of SI. To obtain the values for the output space $B_{\mathrm{out}}$, a series of provisional transmissions using the designed input space $B_{\mathrm{in}}$ needs to be performed, where a provisional transmission, corresponding to Figure 3 with switches OFF, means a transmission without desired signal input, i.e., $m_{\mathrm{U}}(i,j) = 0\ \forall i,j$, and DC is not performed at the BS. Note that the outputs of the channel estimation block in Figure 3 are the estimated CSI of the user-BS link and the estimated CSI between the BS antennas, and the value of the former is replaced by the designed $h_{\mathrm{U}}$ in the offline database generation. Specifically, we first choose a designed FV of the input space as where $b = 1, \cdots, |B_{\mathrm{in}}|$, and then use $B_{\mathrm{in}}(b)$ to perform the provisional transmission following the considered IBFD system. For this $b$, one of the possible channel gains $h_{\mathrm{BS}}$ measured by the channel estimation block is recorded in $h_{\mathrm{BS}}(b,b')$, and at the same time, $I(b,b')$ is used to record the input of the 0/1 decision, i.e., the constellation value of SI. According to (13) and (15), after the noise effects are averaged out by pilot signals, the recorded value of $I(b,b')$ can be written as A summary of this generation process can be found in Algorithm 1.

3.1.2. Online Data Transmission.
Once the database $B$ is created, we can use it in online transmission for the considered IBFD systems, as shown in Figure 3 with switches ON. When the BS prepares to demodulate the desired signals $S(i,j)$ transmitted by the user under the effect of SI $I(i,j)$ and noise $W(i,j)$, in the block of the search algorithm for DC, the BS first searches for an "optimal" FV in the input space $B_{\mathrm{in}}$ using the estimated CSI $\hat{h}_{\mathrm{U}}(i)$ and the related modulation information $A_{\mathrm{BS}}(i,j)$ and $\theta_{\mathrm{BS}}(i,j)$ of the signals the BS sent.
The "optimal" FV is decided by the vector $[A_{\mathrm{BS}}(i,j);\ \theta_{\mathrm{BS}}(i,j);\ \hat{h}_{\mathrm{U}}(i)]$ and written as where Based on the fact that $A_{\mathrm{BS}}(i,j)$ and $\theta_{\mathrm{BS}}(i,j)$ for all of $i$ (25) is the well-known optimization problem of nearest neighbor search (NNS) in LL. The simplest solution to the NNS problem is the so-called linear search, which computes the Euclidean distance to every candidate. Other methods, such as space partitioning with a $k$-dimensional tree [34] or greedy search [35], can give an exact or approximate solution at a lower computational cost. Analyses and performance studies of NNS problems can be found in [36].
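The linear search mentioned above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' Algorithm 2: it assumes each FV is stored as a tuple `(A_BS, theta_BS, h_U)` and that the complex $h_{\mathrm{U}}$ is split into real and imaginary parts before the Euclidean distance is taken. The function names are hypothetical.

```python
import math


def flatten(fv):
    """Turn an FV (A_BS, theta_BS, h_U) into a real-valued vector."""
    amplitude, theta, h_u = fv
    return (amplitude, theta, h_u.real, h_u.imag)


def linear_nns(input_space, query):
    """Return (index, distance) of the stored FV closest to the query FV.

    Every candidate is visited once, so the cost grows linearly with the
    size of the input space.
    """
    if not input_space:
        raise ValueError("input space is empty")
    q = flatten(query)
    distances = [math.dist(flatten(fv), q) for fv in input_space]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]


# Tiny example input space and a query built from an estimated user channel.
input_space = [
    (1.0, 0.0, 1.0 + 0.0j),
    (1.0, 0.0, 0.5 + 0.5j),
    (1.0, math.pi, 0.5 + 0.5j),
]
b_star, dist = linear_nns(input_space, (1.0, 0.0, 0.6 + 0.4j))
```

Because the BS knows its own transmitted amplitude and phase exactly, a practical variant could first filter the candidates by those two entries and search only over $h_{\mathrm{U}}$; tree-based structures such as those in [34] reduce the cost further.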
After the "optimal" FV $B_{\mathrm{in}}(b^{*})$ is found, the corresponding set of the potential SI effects in the output space $B_{\mathrm{out}}$ is then located as To assign suitable values to the DC $C_{\mathrm{dig}}(i,j,\alpha)$, we also consider the estimated CSI $\hat{h}_{\mathrm{BS}}(i)$. The value of $I(b^{*}, b'^{*})$ is finally where $|B_{\mathrm{out}}(b^{*})|$ is the number of measured CSI values between the BS antennas. Notably, because the noise effects in $h_{\mathrm{BS}}(b^{*}, b')$ and $\hat{h}_{\mathrm{BS}}(i)$ have been averaged out using multiple pilot signals, the result of (26) ensures that $h_{\mathrm{BS}}(b^{*}, b') \approx \hat{h}_{\mathrm{BS}}(i)$. In addition, our proposed approach works under the assumption that a channel realization between the BS antennas $h_{\mathrm{BS}}(i)$ can only be tagged by one estimated CSI $\hat{h}_{\mathrm{BS}}(i)$; in other words, there has to be a one-to-one correspondence between $h_{\mathrm{BS}}(i)$ and $\hat{h}_{\mathrm{BS}}(i)$. Generally, this one-to-one correspondence can be guaranteed. Thus, based on the above explanation, the parameter $\alpha$ in the proposed DC $C_{\mathrm{dig}}(i,j,\alpha)$ is written as Notably, according to (23), because the interference term $I(b,b')$ in the output space of the database is premeasured using $\lambda = h_{\mathrm{U}}(b)$ in offline database generation, to suppress the SI effects in online transmission, the parameter $\lambda$ in the detector should be set to $h_{\mathrm{U}}(b^{*})$ rather than the common $\hat{h}_{\mathrm{U}}(i)$ in the proposed DC. So, corresponding to the parameters $A_{\mathrm{BS}}(i,j)$, $\theta_{\mathrm{BS}}(i,j)$, $h_{\mathrm{U}}(i)$, and $h_{\mathrm{BS}}(i)$, the constellation value of SI, excluding the noise part, after using the proposed method in online transmissions is written as tagged by $\hat{h}_{\mathrm{BS}}(i)$, whereas the DC value $C_{\mathrm{dig}}(i,j,\alpha)$ for this case is decided by the database $B_{\mathrm{out}}$ and is written as (28) and (29) are almost equal. Consequently, the nonnoise part of the residual SI $\hat{I}(i,j)$ can be substantially suppressed. A summary of this process can be found in Algorithm 2.
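To make the two phases concrete, the following toy Python sketch walks through an offline database build and an online lookup-and-subtract step. It is a simplified illustration under stated assumptions rather than the authors' Algorithms 1 and 2: the PA is linear, all noise is ignored, the interantenna channel `H_BS` is static (so each FV maps to a single SI value), the detector is taken to be division by $\lambda$, and the QPSK constellation, gains, and function names are hypothetical.

```python
import cmath
import math

# Assumed unit-energy QPSK constellation with a pi/4 offset (hypothetical choice).
QPSK = [cmath.rect(1.0, math.pi / 4 + k * math.pi / 2) for k in range(4)]


def build_database(h_bs, si_gain, rho_values, phi_values):
    """Offline phase: for every FV (BS symbol index, h_U), record the SI value
    seen at the 0/1 decision when no desired signal is transmitted."""
    database = []
    for k_bs, symbol in enumerate(QPSK):
        for rho in rho_values:
            for phi in phi_values:
                h_u = cmath.rect(rho, phi)
                si_value = h_bs * si_gain * symbol / h_u  # detector uses lambda = h_U(b)
                database.append((k_bs, h_u, si_value))
    return database


def cancel(database, y, k_bs, h_u_est):
    """Online phase: nearest-FV lookup, detection with lambda = h_U(b*), then
    subtraction of the recorded SI value."""
    candidates = [entry for entry in database if entry[0] == k_bs]
    _, h_u_star, si_star = min(candidates, key=lambda e: abs(e[1] - h_u_est))
    return y / h_u_star - si_star


def decide(y_dec):
    """Return the index of the QPSK point closest to the decision input."""
    return min(range(4), key=lambda k: abs(QPSK[k] - y_dec))


# Toy scenario: the SI arrives roughly 40 dB stronger than the desired signal.
H_BS, SI_GAIN = 0.01, 1.0e4
database = build_database(
    H_BS, SI_GAIN,
    rho_values=[0.5, 1.0, 1.5],
    phi_values=[k * math.pi / 8 for k in range(16)],
)
h_u = cmath.rect(1.1, 0.4)  # true user channel, deliberately off the grid
user_index, bs_index = 2, 1
y = h_u * QPSK[user_index] + H_BS * SI_GAIN * QPSK[bs_index]

without_dc = decide(y / h_u)                          # decision driven by the SI
with_dc = decide(cancel(database, y, bs_index, h_u))  # decision after the lookup
```

In this noiseless toy case the recorded SI value cancels the SI term exactly, because both were formed with the same $\lambda = h_{\mathrm{U}}(b^{*})$; what remains is the desired symbol scaled by $h_{\mathrm{U}}/h_{\mathrm{U}}(b^{*})$, which is the mismatch the text attributes to a finite database resolution.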
Moreover, according to (12), the desired signal part in the online transmissions thereby becomes and thus introduces an error caused by $|h_{\mathrm{U}}(b^{*}) - \hat{h}_{\mathrm{U}}(i)|$ in the demodulation process because $\lambda$ is commonly assigned by $\hat{h}_{\mathrm{U}}(i)$. Fortunately, the difference between $h_{\mathrm{U}}(b^{*})$ and $h_{\mathrm{U}}(i)$ decreases as the database size increases. Naturally, the residual signal-to-interference-plus-noise ratio (RSINR) for the $i$th frame and the $j$th symbol after using the proposed DC is written as where we treat the results of (28) and (29) as approximately equal, assume that the gains of all antennas are 0 dBi and that all PAs are linear, and ignore noise effects from the interested user for simplicity of expression.
The expression in (31) suggests that the proposed DC reduces the SI effect to the first and second parts of the denominator of (31). The first part remains whenever $h_{\mathrm{U}}(b^{*}) \neq \hat{h}_{\mathrm{U}}(i)$ and limits the RSINR at high desired signal power. The second part, caused by the noise in the BS's modulator and PA, can only be decreased by the antenna and analog cancellations (AACs). Our simulation results provide some evidence for this analysis.

3.2. EL-Based Cancellation Approach. As a baseline for evaluating our proposed approach, we consider an EL-based cancellation approach. This approach trains a model using a prepared database, such as the one generated following the description in Section 3.1.1, to predict the constellation value of SI in the 0/1 decision process. Assuming that the generation of database $B$ is complete, we can construct a training set using parameters in $B$ to train the model. The training set includes a set of FVs $V$ and a set of target values $T$, which are designed by respectively. Afterward, a suitable model (or function) $M^{*}$ can be trained by substituting $V$ and $T$ into a given network architecture and performing learning algorithms. Without loss of generality and noting that $T$ are numerical outputs, we employ a deep NN architecture and adopt a regression learning algorithm; the former is one of the most popular networks in EL methods, and the latter is used to train the model $M^{*}$ [37]. Mathematically, $M^{*}$ can be written as where $L(z)$ denotes the loss function for the training process in EL methods.
In fact, the trained model $M^{*}$ is used to estimate the constellation values of SI in the 0/1 decision, and the outputs of the trained model are assigned to the DC for SI suppression. Therefore, the DC $C_{\mathrm{dig}}(i,j,\alpha)$ is calculated by where the same $\alpha$ as in (27) for the proposed LL-based cancellation approach is used. According to (35), an error between the real and predicted SI in the 0/1 decision cannot be avoided since the real CSI of neither the user-BS link nor the BS interantenna link can be acquired in practice, and there are always errors in the model training process [26]. Moreover, as shown in for RSINR of the EL methods with the same conditions as the proposed method, suppose the residual SI, i.e., the first component in the denominator, still significantly dominates the composite signal power because of the amplification effect from $G_{\mathrm{PA,BS}}$. In that case, the demodulation of the desired signals is unlikely to succeed. In Section 5, we perform more simulations to evaluate this cancellation approach.

3.3. Regular Cancellation Approach. Since pilot signals are commonly used in each frame sent by users and the BS in modern wireless communication systems, the real CSI between the interested user and the BS $h_{\mathrm{U}}(i)$ and the real CSI between the BS antennas $h_{\mathrm{BS}}(i)$ are estimated by $\hat{h}_{\mathrm{U}}(i)$ and $\hat{h}_{\mathrm{BS}}(i)$, respectively. According to the SI expression in (13), an estimation-based regular method for assigning the DC $C_{\mathrm{dig}}(i,j,\alpha)$ directly calculates the constellation value of SI $I(i,j)$ using the estimated CSI $\hat{h}_{\mathrm{U}}(i)$ and $\hat{h}_{\mathrm{BS}}(i)$. Mathematically, $C_{\mathrm{dig}}(i,j,\alpha)$ in this regular cancellation approach is assigned as where $\alpha$ is decided by (27). The RSINR of the regular approach can be expressed as based on the same conditions as the proposed approach. This expression indicates that the SI effect, i.e., the first part of the denominator, cannot be completely removed because an error $h_{\mathrm{BS}}(i) - \hat{h}_{\mathrm{BS}}(i) \neq 0$ is inevitably introduced, and this error is further multiplied by $G_{\mathrm{PA,BS}} \| h_{\mathrm{U}}^{-1}(i) \|^{2}$. This fact suggests that only excellent estimation performance (e.g., estimation of the wireless channel and PA nonlinearity) and a good channel condition for the desired signal (i.e., a larger $h_{\mathrm{U}}(i)$) can ensure that IBFD systems with the regular DC work. We perform more simulations to evaluate this cancellation approach.
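The sensitivity of the regular approach to estimation error can be illustrated with a short calculation. The sketch below assumes a purely linear model in which the subtracted replica is built from the estimated interantenna channel, so the residual SI is proportional to the difference between the true and estimated coefficients; the function name is hypothetical.

```python
import math


def cancellation_depth_db(h_true, h_est):
    """SI power suppression (dB) when a replica built from h_est is subtracted
    from SI that actually passed through h_true (linear model, no noise)."""
    error = abs(h_true - h_est)
    if error == 0.0:
        return math.inf  # perfect estimate: the SI is removed completely
    return 20.0 * math.log10(abs(h_true) / error)


# A 1% amplitude error caps the digital cancellation near 40 dB,
# and even a 0.001% error caps it near 100 dB.
depth_1_percent = cancellation_depth_db(1.0 + 0.0j, 0.99 + 0.0j)
depth_tiny_error = cancellation_depth_db(1.0 + 0.0j, (1.0 - 1e-5) + 0.0j)
```

Under this model, each extra 20 dB of suppression requires a tenfold reduction in the relative estimation error, which is why a small CSI error leaves a large residual when the SI is many tens of decibels above the desired signal.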

Channel Model
Without loss of generality and for simplicity, in this study, we consider composite fading channels with path loss and Rayleigh fading to simulate the attenuation between the interested user and the BS, i.e., $h_{\mathrm{U}}(i) = \rho_{\mathrm{U}}(i) e^{j\phi_{\mathrm{U}}(i)}\ \forall i$. The phase shift $\phi_{\mathrm{U}}(i)\ \forall i$ is modeled as an independent and identically distributed (i.i.d.) random variable (RV) that follows a uniform distribution between 0 and $2\pi$ radians. The amplitude gain $\rho_{\mathrm{U}}(i)$ for the $i$th frame can be expressed as where $c_{o}$ is the velocity of light, $d_{\mathrm{U}}(i)$ denotes the distance between the interested user and the BS for the transmission of the $i$th frame, and $\zeta_{\mathrm{U}}$ is the path loss exponent around the user [38,39]. The term $r_{\mathrm{U}}(i)\ \forall i$ is also modeled as an i.i.d. RV and follows the Rayleigh distribution with the cumulative distribution function (CDF) expressed as $F(r) = 1 - e^{-r^{2}/(2\sigma^{2})}$ for $r \geq 0$, where $\sigma$ is the scale parameter of the Rayleigh distribution [32]. For modeling the channel attenuation between the BS's antennas $h_{\mathrm{BS}}(i)$, given the uncomplicated surroundings and low building density around the BS, and since a centralized antenna deployment with a limited interantenna distance [40] is usually employed at a traditional BS, the channel between the transceiver antennas can be assumed to be static. Based on the above assumptions, the channel gains $h_{\mathrm{BS}}(i)$ for all frames are thus modeled by the path loss model and expressed as where $d_{\mathrm{BS}}$ is the distance between the BS's antennas and $\zeta_{\mathrm{BS}}$ denotes the path loss exponent around the BS.
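The fading part of this channel model can be sampled with the standard inverse-CDF method. The sketch below is illustrative only: it uses the textbook Rayleigh CDF $F(r) = 1 - e^{-r^{2}/(2\sigma^{2})}$, treats the path-loss amplitude as a given number `path_gain` (its closed form is not reproduced here), and uses hypothetical function names.

```python
import cmath
import math
import random


def rayleigh_cdf(r, sigma):
    """Textbook Rayleigh CDF with scale parameter sigma."""
    if r <= 0.0:
        return 0.0
    return 1.0 - math.exp(-r * r / (2.0 * sigma * sigma))


def sample_rayleigh(sigma, rng):
    """Draw one Rayleigh amplitude by inverting the CDF."""
    u = rng.random()  # uniform in [0, 1), so 1 - u is never zero
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))


def sample_user_channel(path_gain, sigma, rng):
    """Draw h_U = rho * exp(j * phi) with Rayleigh amplitude and uniform phase.

    path_gain is the amplitude factor due to path loss, supplied by the caller.
    """
    rho = path_gain * sample_rayleigh(sigma, rng)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return cmath.rect(rho, phi)


rng = random.Random(0)  # fixed seed so that runs are repeatable
amplitudes = [sample_rayleigh(1.0, rng) for _ in range(20000)]
mean_amplitude = sum(amplitudes) / len(amplitudes)
h_u = sample_user_channel(path_gain=1e-4, sigma=1.0, rng=rng)
```

A quick sanity check is that the sample mean of the amplitudes approaches $\sigma\sqrt{\pi/2}$, the mean of a Rayleigh distribution.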

Computer Simulations
In this section, we evaluate and analyze the proposed LL-based cancellation approach for IBFD systems with the help of computer simulations. To do so, we first explain how to generate the offline database based on the designed format introduced in Section 3.1. Then, the use of the generated database for online transmission in the considered IBFD systems is described. Finally, we present and analyze the simulated results in comparison with the EL-based cancellation approach described in Section 3.2 and the regular cancellation approach described in Section 3.3 and summarize the key findings.

Offline Database Generation for Simulations.
In the simulations, since quadrature phase shift keying (QPSK) modulation is used at both the user and the BS, the dimension of each FV of the input space $B_{in}$ in database $B$ can be reduced to $\theta_{BS}(b)$ and $h_U(b) = \rho_U(b) e^{j\phi_U(b)}$, where $\theta_{BS}(b)$ takes one of the four possible phase values of the common QPSK modulation. The term $\rho_U(b)$ is the amplitude gain of the channel between the user and the BS, and it can range from 0 to $\infty$ because of the Rayleigh fading. In the simulations, a variable range $[\rho_{min}, \rho_{max}]$ is assumed for $\rho_U(b)$ and is calculated from the Rayleigh CDF, where $\delta$ denotes a given probability in the CDF of the Rayleigh distribution and can range from 0 to 1. Thereafter, we define $\Delta_{am}$ as the resolution of the amplitude coefficient $\rho_U(b)$, and thus, there are $\lceil (\rho_{max} - \rho_{min}) / \Delta_{am} \rceil$ possible values to be assigned. The term $\phi_U(b)$ is the phase shift caused by the Rayleigh fading, and it can range from 0 to $2\pi$ radians. For the database generation, we define $\Delta_{ph}$ as the resolution of the phase shift $\phi_U(b)$, and hence, there are $\lceil 2\pi / \Delta_{ph} \rceil$ possible values to be assigned. Based on the above configurations, an input space $B_{in}$ with unduplicated FVs can be generated for our simulations. For the measurement and recording of the output space $B_{out}$, based on our channel assumptions, $B_{out}$ is reduced to $I$ and has a one-to-one correspondence with the input space $B_{in}$. The size of $B_{in}$, as well as that of $B_{out}$, becomes larger as $\Delta_{am}$, $\Delta_{ph}$, and $\delta$ decrease, and thus, more storage and a higher compute capability for searching are needed. The parameters $\rho_{min}$, $\rho_{max}$, $\Delta_{am}$, and $\Delta_{ph}$ are not restricted by the considered channel models. In practice, these parameters can be set freely and independently to fit various hardware or software environments.
In addition, in the database generation process, we generate the same database $N_B$ times and average the results to suppress the effects of the various noises as much as possible.
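The grid construction and the $N_B$-fold averaging described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the generation code used in this study: the amplitude range is assumed to be $[F^{-1}(\delta), F^{-1}(1-\delta)]$ with $F$ the Rayleigh CDF, the four QPSK phases are assumed to be the odd multiples of $\pi/4$, and the function names and parameter values are hypothetical.

```python
import math


def rayleigh_inverse_cdf(p, sigma):
    """Amplitude x such that 1 - exp(-x^2 / (2 sigma^2)) = p, for 0 <= p < 1."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - p))


def build_input_space(sigma, delta, delta_am, delta_ph):
    """Enumerate unduplicated FVs (theta_BS, rho_U, phi_U) on a regular grid."""
    # Assumption: the range covers the central 1 - 2*delta of the Rayleigh CDF.
    rho_min = rayleigh_inverse_cdf(delta, sigma)
    rho_max = rayleigh_inverse_cdf(1.0 - delta, sigma)
    # The small tolerance guards the ceiling against floating-point round-off.
    n_am = math.ceil((rho_max - rho_min) / delta_am - 1e-9)
    n_ph = math.ceil(2.0 * math.pi / delta_ph - 1e-9)
    # Assumption: QPSK phases at odd multiples of pi/4.
    thetas = [math.pi / 4.0 + k * math.pi / 2.0 for k in range(4)]
    grid = [
        (theta, rho_min + a * delta_am, p * delta_ph)
        for theta in thetas
        for a in range(n_am)
        for p in range(n_ph)
    ]
    return grid, n_am, n_ph


def average_databases(runs):
    """Element-wise mean over N_B repeated measurements of the output space."""
    n_b = len(runs)
    return [sum(values) / n_b for values in zip(*runs)]


grid, n_am, n_ph = build_input_space(sigma=1.0, delta=0.01, delta_am=0.1,
                                     delta_ph=0.1 * math.pi)
print(len(grid), n_am, n_ph)
```

Under these assumptions the number of FVs is $4 \times n_{am} \times n_{ph}$, so halving either resolution roughly doubles the storage and search cost, and lowering $\delta$ widens the amplitude range and adds further rows.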

Model Training for EL-Based Cancellation Approach.
According to Section 3.2, once the above database is generated, we can form a training set that consists of the FVs and the corresponding target values for model training in the EL-based cancellation approach. A deep NN architecture is employed to train the model $M^*$ and is shown in Figure 4, in which the input layer and the output layer accept the FV $V$ and the target value $T$, respectively. A Levenberg-Marquardt back-propagation algorithm [41] is used to train the node coefficients of all layers in our feedforward NNs, and an introductory study of deep learning for the wireless physical layer can be found in [37]. To limit the computational cost, we set a maximum number of epochs for model training, where an epoch is one complete pass through the entire training dataset. Some essential parameters for model training in our current simulations are listed in Table 1.

Online Data Transmission for Simulations.
Considering that 4.6 to 4.9 GHz is assigned to a 5G-based private network called local 5G in Japan [11], we employ 4.6 GHz as the carrier frequency in our simulations. After the database $B$ and the model $M^*$ are generated, they can be used in online transmissions for the considered IBFD systems. When the desired symbol is received under the effects of SI and noise, the BS first attenuates the received composite signal waveform, excluding the desired portion, using the AACs $C_{ant}$ and $C_{ana}$. Then, the searching algorithm is executed for the proposed cancellation approach following the instructions described in Section 3.1.2 with the help of the generated database $B$. The SI can also be estimated through the trained model $M^*$ for the EL-based cancellation approach, or predicted from the estimated CSI for the regular cancellation approach. Subsequently, the constellation values of the SI found in the database $B$, predicted by the model $M^*$, or estimated by the regular approach are subtracted independently before the 0/1 decisions in the BS's demodulator. Finally, the binary data sent by the user is recovered, and the transmission performance is evaluated and analyzed. Simulation parameters are listed in Table 2.
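The online step of the proposed approach, looking up a stored SI value and subtracting it before the 0/1 decision, can be illustrated with a small sketch. The search rule here is assumed to be a plain nearest-neighbor match on the FV, which stands in for the algorithm of Section 3.1.2 and may differ from it. The dictionary layout, the bit mapping, and the toy database values are hypothetical.

```python
import math


def nearest_entry(database, theta_bs, h_u_est):
    """Return the stored SI value whose FV is closest to (theta_bs, h_u_est)."""
    # Only entries recorded for the same BS transmit phase are candidates.
    candidates = [e for e in database if math.isclose(e["theta_bs"], theta_bs)]
    best = min(candidates, key=lambda e: abs(e["h_u"] - h_u_est))
    return best["si"]


def cancel_and_decide(received, si_estimate):
    """Subtract the looked-up SI, then make QPSK 0/1 decisions on I and Q."""
    cleaned = received - si_estimate
    # Assumed bit mapping: a non-negative component maps to bit 0.
    return (0 if cleaned.real >= 0 else 1, 0 if cleaned.imag >= 0 else 1)


# Toy database with three FVs; the SI values are made up for illustration.
database = [
    {"theta_bs": math.pi / 4, "h_u": 1 + 0j, "si": 0.3 + 0.2j},
    {"theta_bs": math.pi / 4, "h_u": 0 + 1j, "si": -0.1 + 0.4j},
    {"theta_bs": 3 * math.pi / 4, "h_u": 1 + 0j, "si": -0.3 + 0.2j},
]

si = nearest_entry(database, math.pi / 4, 0.9 + 0.1j)
received = (0.7 - 0.7j) + (0.3 + 0.2j)  # desired symbol plus SI
bits = cancel_and_decide(received, si)
```

A linear scan is used for clarity. For large databases, an index over the quantized amplitude and phase would avoid searching every entry.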

Simulation Results.
In this subsection, we first list three factors that may affect the performance of the above-mentioned DC approaches: (1) additional PA noise at the BS side, (2) errors in channel estimation, and (3) database size. The additional noise is assumed to be generated in all PAs, and its presence means that the noise figures are not equal to one. Errors in the channel estimation process occur if the BS cannot obtain perfect CSI. In our simulations, the estimated CSI $\tilde{h}_\Psi(i)$ for $\Psi = U$ or $\Psi = BS$ is generated from the real CSI $h_\Psi(i)$ of the link from the interested user to the BS or of the interantenna link at the BS side, a channel estimation error $\Delta_{err,\Psi}$ that can range from 0 to 1, and a random phase $\tau$ that is uniformly distributed between 0 and $2\pi$. The database size can be calculated by (47). Thereafter, we present simulation results under the effect of each of these factors in turn. Finally, we evaluate the transmission performance of the different DCs with all of the mentioned factors present and report some key findings.
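One way to generate an estimated CSI with a prescribed error level is sketched below. The specific form $\tilde{h} = h + \Delta_{err} |h| e^{j\tau}$ is an assumption made for this illustration, chosen so that the relative error magnitude equals $\Delta_{err}$. The function name is hypothetical.

```python
import cmath
import math
import random


def estimate_csi(h_true, delta_err, rng):
    """Perturb the real CSI by a relative error delta_err with a random phase tau."""
    tau = rng.uniform(0.0, 2.0 * math.pi)  # uniform random phase
    # Assumed error model: the error vector has magnitude delta_err * |h_true|.
    return h_true + delta_err * abs(h_true) * cmath.rect(1.0, tau)


rng = random.Random(0)
h_bs = 0.3 - 0.4j
h_bs_est = estimate_csi(h_bs, 1e-3, rng)
relative_error = abs(h_bs_est - h_bs) / abs(h_bs)
```

With $\Delta_{err} = 0$ the estimate equals the real CSI, which corresponds to the perfect channel estimation case used in some of the simulations.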
Notably, in our current systems, although all of the signals passing through the PAs are somewhat distorted owing to the PA nonlinearity, estimation errors on the PA nonlinearity were not considered in this study. Inaccurate estimation of the PA nonlinearity may substantially degrade the transmission performance of the estimation-based regular cancellation approach. However, considering the time-invariant property of all PAs, it does not seem to be a major issue for the LL- and EL-based cancellation approaches. Nevertheless, our future work may concentrate on estimating the PA nonlinearity.
In Figure 5, we compare the BER simulated with no SI and with the proposed, the regular, and the EL-based DC approaches for variable AACs, represented by $C_{ant} C_{ana}$, and a fixed average receive signal-to-noise ratio (SNR). The average receive SNR is defined as the ratio of the average power of the desired signal to the average power of the noise at the receiver antenna and is mainly dominated by the large-scale fading between the interested user and the BS. The channel between the interested user and the BS is assumed to be static to evaluate the effect caused by factor (1), and channel estimation errors for all channels are ignored.

This figure indicates that the above-mentioned DC approaches achieved similar BER performance over the entire range of $C_{ant} C_{ana}$ for a given SNR. The results clarify that factor (1) is not the major reason for the differences among the proposed, the regular, and the EL-based DC approaches. Moreover, there are gaps between the BER performance with no SI and that with the DCs when $C_{ant} C_{ana} > -50$ dB. This phenomenon occurs because various noises generated at the BS side by the PA and modulator are added to the SI signals. Because these noises are random and cannot be removed by the DC approaches, AACs have to be used to limit their power, as confirmed by the second term in the denominator in (31), (36), and (38). Consequently, using stronger AACs with $C_{ant} C_{ana} \leq -50$ dB decreased the effects of the noises and resulted in BER performance similar to that with no SI effect. In fact, numerous authors have reported more than 45 dB of SI suppression capability using antenna cancellation (i.e., $C_{ant} < -45$ dB) [5,13] and more than 30 dB of suppression capability using analog cancellation (i.e., $C_{ana} < -30$ dB) [6,11]. Considering feasibility, in our further simulations, we refer to the results of previous works and conservatively set $C_{ant} = -45$ dB and $C_{ana} = -15$ dB to suppress the effects of these noises.
In Figure 6(a), we present comparisons of RSINR versus SNR to demonstrate how factor (2) affects the mentioned DC approaches. The comparisons were obtained using the proposed, regular, and EL-based DC approaches with variable channel estimation errors, static channels, and 60 dB of AACs, which suppressed the effect of factor (1). The BER performance of the mentioned DC approaches corresponding to Figure 6(a), under the same simulation conditions, is shown in Figure 6(b) to further verify the effect caused by factor (2). Figure 6(a) indicates that the RSINR of the regular and EL-based DCs decreases significantly as the channel estimation error increases from $10^{-5}$ to $10^{-3}$ over the entire range of SNR; in contrast, the RSINR of the proposed approach is not affected by the channel estimation error. One of the major reasons for this result is whether the estimated CSI $\tilde{h}_{BS}(i)$ is directly adopted in the DC approach or not. For the proposed DC, $\tilde{h}_{BS}(i)$ works as a tag of the SI and is used to help record the quantized SI effect into the database in the offline phase or to help search for the target SI value in the database in the online phase. In other words, $\tilde{h}_{BS}(i)$ is not designed to participate in any SI estimation-related computation in the proposal. Naturally, it does not appear in the RSINR expression of the proposal, unlike in (36) and (38). In practice, a channel estimation error of $10^{-3}$ is a typical and acceptable setting. However, the RSINR of the regular and EL-based DCs with a $10^{-3}$ estimation error cannot meet general communication needs, and consequently, their BER performance shown in Figure 6(b) is poor unless the accuracy of the channel estimation process is improved to $10^{-5}$.
Figures 7(a) and 7(b) show RSINR and BER versus SNR simulated with no SI and with the three DCs mentioned above under the effect of factor (3), the database size. Rayleigh fading is considered in the channel between the interested user and the BS to evaluate the effect caused by factor (3), and perfect channel estimation and 60 dB AACs are assumed. Given the limited computational cost and storage, we vary only the resolution of the phase shift $\Delta_{ph}$ to control the database size. Note that since the regular DC approach does not require a database, there is only one curve for the regular approach in the figures. Both Figures 7(a) and 7(b) demonstrate that the transmission performance (RSINR and BER) of the proposed and EL-based DC approaches can be improved by increasing the database size. This is expected because more potential SI values can be stored in a larger database, and thus the probability of finding an estimated SI value that is closer to the real value of the SI becomes higher. Interestingly, for a given database size, the RSINR of the proposed approach gradually approaches a fixed value with increasing SNR and finally results in a floor in the BER. In fact, (31) also indicates that in the high-SNR range, for example, with a large $G_{PA,U}$, the RSINR of the proposed approach does not grow linearly with SNR unless $h_U(b^*) = h_U(i)$. However, its RSINR and BER performance is much better than that of the EL-based DC approach because the latter introduces an unwanted error caused by the model training, and this error is amplified by the BS's PA according to the first term in the denominator in (36). In addition, the estimation-based regular approach performs better than the proposed and EL-based DCs because channel estimation errors, i.e., factor (2), are not considered in these simulations.
Figure 5: Comparisons of BER simulated by no SI, the proposed, the regular, and the EL-based DC approaches with variable $C_{ant} C_{ana}$ and fixed average receive SNR, illustrating factor (1). Channels are assumed to be static, and channel estimation errors are ignored.
Finally, to evaluate the mixed effect of factors (1), (2), and (3), in Figures 8(a) and 8(b) we present comparisons of RSINR and BER simulated by the proposed, the regular, and the EL-based DC approaches over a fading channel with a common channel estimation error $\Delta_{err,BS} = \Delta_{err,U} = 10^{-3}$, a given database size with $\Delta_{ph} = 0.1\pi$, and 60 dB of AACs. Under the mixed effects of factors (1), (2), and (3), Figure 8(a) shows that at an SNR of 30 dB (a desired signal power of −70 dBm compared to a noise power of −100 dBm), the proposed approach with $\Delta_{ph} = 0.1\pi$ improved the RSINR to approximately 24 dB with −94 dBm of residual SI power. Considering 40 dBm of EIRP at the BS, the figure indicates that about 134 dB of SI suppression capability is achieved by the proposed approach with 60 dB of AACs, and Figure 8(b) thereby shows that our proposed approach can enable IBFD operations in the considered wireless communication systems and, consequently, can approximately double the SE.
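The power budget behind the 134 dB figure can be checked with a few lines of arithmetic. The sketch below uses only the powers quoted above. The RSINR line is a simplified power sum of noise and residual SI, which is an assumption of this sketch and not the full expression in (31).

```python
import math


def dbm_to_mw(power_dbm):
    """Convert a power in dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


def mw_to_dbm(power_mw):
    """Convert a power in milliwatts to dBm."""
    return 10.0 * math.log10(power_mw)


signal_dbm = -70.0       # desired signal power at the BS receiver
noise_dbm = -100.0       # noise power
residual_si_dbm = -94.0  # residual SI power after cancellation
eirp_dbm = 40.0          # EIRP at the BS

snr_db = signal_dbm - noise_dbm
suppression_db = eirp_dbm - residual_si_dbm
# Simplified RSINR: desired power over the sum of noise and residual SI.
interference_dbm = mw_to_dbm(dbm_to_mw(noise_dbm) + dbm_to_mw(residual_si_dbm))
rsinr_db = signal_dbm - interference_dbm

print(snr_db, suppression_db, round(rsinr_db, 1))
```

The suppression follows directly as 40 dBm minus −94 dBm, i.e., 134 dB. The simplified power sum gives an RSINR of about 23 dB, which is close to the reported value of approximately 24 dB.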
Conversely, there are considerable RSINR and BER gaps between no SI and the regular or EL-based DC approach. The main reasons may be rough channel estimation and an insufficient number of hidden layers and nodes in the network architecture. In fact, enhanced transmission performance for the regular DC can be achieved through strong AACs with highly accurate and comprehensive estimation operations. Similarly, applying strong AACs with more extensive databases or more nodes and layers in the NN architecture may improve the EL-based DC performance. Therefore, although the performance of the regular and EL-based DCs is inadequate under the current simulation settings, we do not rule out these two potential DC approaches for IBFD systems. Instead, we strongly encourage exploring them, especially the EL-based DC approach, by, for example, using faster learning algorithms, reducing the dimension of the database, and upgrading hardware devices. The proposed LL-based DC approach for IBFD wireless systems can also benefit considerably from technological advances in machine learning-related research fields. Any type of DC approach should be combined with strong AAC techniques; therefore, improving the existing AACs and solving related issues for the above-mentioned three DC approaches should be studied.

Discussion
Although our simulation results for SI cancellation are promising, the proposed LL-based DC approach still needs to be evaluated in more hardware and/or software environments for practical applications, and several issues remain to be addressed.
For example, the proposed approach in the current simulations is compatible with a single-carrier transmission system. For the popular orthogonal frequency division multiplexing (OFDM), which is adopted in most modern communication standards, multiple databases should be independently generated for each subcarrier because of the frequency-selective characteristics of wireless channels. Since creating and searching a large number of databases must be considered, many challenges in learning algorithms and computational capability still need to be overcome before widespread use of the proposal.
Moreover, it seems straightforward to extend our proposal to high-order constellations such as 256 quadrature amplitude modulation (QAM) by fully using the designed amplitude term $A_{BS}$ and phase term $\theta_{BS}$ in the database ($A_{BS}$ is not used in the current QPSK modulation-based simulations). However, more advanced modulation means more combinations of amplitude and phase. Hence, the resulting very large database also poses a major challenge to the learning algorithm and computational capability in this case.
Finally, in the present simulations, we assumed a static channel between the transceiver antennas at the BS side. In a real system, the channel is time-varying. The proposed approach can also work in this situation by activating $h_{BS}$ in the designed database ($h_{BS}$ is not used in the current simulations because of the static channel). However, the large database caused by the increase in dimension is still an unavoidable problem, and it is unclear how the complexity will increase in this case.
Based on the above description, the biggest challenge for the learning-based proposal is how to handle the huge amounts of data generated by systems with OFDM operations, high-order constellations, and complex propagation environments. In the future, one of our major tasks is to solve this problem by introducing more advanced machine learning-related algorithms and technologies. Furthermore, we will focus on conducting multiple evaluations of the proposal in hardware and software environments, for example, using an SDR platform.
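The growth in data volume can be made concrete with a rough count of database entries. The sketch below assumes one independent database per subcarrier and one entry per combination of BS constellation point, quantized user-channel amplitude, quantized user-channel phase, and quantized $h_{BS}$ state. The function name and all numerical values except the QPSK and 256-QAM constellation sizes are hypothetical.

```python
def total_entries(n_subcarriers, n_constellation, n_amplitude, n_phase, n_hbs_states=1):
    """Rough entry count: one grid cell per combination, one database per subcarrier."""
    return n_subcarriers * n_constellation * n_amplitude * n_phase * n_hbs_states


# Single carrier, QPSK, static interantenna channel (hypothetical grid sizes).
baseline = total_entries(1, 4, 30, 20)

# Hypothetical OFDM case: 1024 subcarriers with 256-QAM on the same grid.
ofdm_256qam = total_entries(1024, 256, 30, 20)

print(baseline, ofdm_256qam, ofdm_256qam // baseline)
```

Even with an unchanged channel grid, the count in this example grows by a factor of 65,536, which illustrates why dimension reduction and faster search are needed before the proposal can be applied to such systems.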

Conclusion
In this study, we proposed an LL-based DC approach that suppresses SI and enables IBFD transmissions to improve the SE of current wireless communication systems. An offline phase for database generation and an online phase for data transmission are performed separately. In the offline phase, the output before the 0/1 decision is measured in advance without the desired signal input and is recorded in a database with self-defined FVs. In the online phase, a suitable result is located in the generated database with the help of the learning method and the FV for the same system architecture with the desired signal input; the result is then assigned as the value of the SI cancellation. An estimation-based regular DC approach and an EL-based DC approach are used as baselines to simulate the transmission performance and evaluate the proposed approach. The simulation results indicate that the proposed method could achieve about 134 dB of SI suppression capability and BER performance comparable to that with zero SI, enabling IBFD operations in wireless communication systems and outperforming the aforementioned approaches.

Data Availability
The data including simulation configurations, parameters, and results used to support the findings of this study are included within the article.

Disclosure
Part of the content of this study was previously presented at a conference [27].

Conflicts of Interest
The authors declare that they have no conflicts of interest.