Performance Improvement on Nonorthogonal Multiple Access without CSIT

In this paper, a downlink virtual-channel-optimization nonorthogonal multiple access (VNOMA) scheme without channel state information at the transmitter (CSIT) is proposed. The novel idea is to construct multiple complex virtual channels by jointly adjusting the amplitudes and phases to maximize the minimum Euclidean distance (MED) among the superposed constellation points. The optimal solution is derived in the absence of CSIT. Considering practical communications with finite input constellations in which symbols are uniformly distributed, we resort to the sum constellation constrained capacity (CCC) to evaluate the performance. Under the MED criterion, a maximum likelihood (ML) decoder is expected at the receiver. To decrease the computational cost, we propose a reduced-complexity bitwise ML (RBML) decoder. Experimental results are presented to validate the superiority of our proposed scheme.


Introduction
Nonorthogonal multiple access (NOMA) is a promising technique to meet the demands of massive connections, extreme data rates, and low latency for fifth-generation (5G) communication systems and beyond. Power-domain NOMA assigns different power levels to users according to their individual channel gains [1,2]. Consequently, user scheduling and power allocation (US-PA) are crucial issues for NOMA deployment, assuming channel state information at the transmitter (CSIT) is obtained [3,4]. Without CSIT, power-domain NOMA has no basis for grouping users or allocating power among them. In Internet of Things (IoT) scenarios with massive connections, most devices are power-constrained and have very limited computing capacity [5]. These devices might not be able to handle the instantaneous feedback of the channel condition and the consequent US-PA scheme. In addition, in scenarios with high mobility of transmitters or receivers, such as unmanned aerial vehicle communication (UAVC) and high-speed railway communication (HSRC), the channel state information (CSI) might be unavailable because the radio channel changes rapidly [6,7]. In these cases, conventional NOMA fails to function without CSIT. Therefore, efforts must be devoted to new schemes which are robust with respect to limited knowledge of the channel state.
In this study, we propose a virtual-channel-optimization NOMA (VNOMA) to address the no-CSIT scenario based on our prior work [8]. The idea is to jointly optimize the amplitudes and phases of multiple complex virtual channel vectors by maximizing the minimum Euclidean distance (MED) of the superposed constellation. We derive the optimal solution for quadrature phase-shift keying (QPSK) modulation without the knowledge of CSIT. Considering practical communications with finite input constellations in which symbols are uniformly distributed rather than Gaussian distributed, we resort to the sum constellation constrained capacity (CCC) to evaluate the performance. Furthermore, we propose a low-complexity maximum likelihood (ML) decoder to significantly reduce the decoding cost, assuming the CSI at the receiver (CSIR) is obtained via channel estimation. Numerical results are presented to validate the performance of our proposed scheme over the Rayleigh fading channel.

System Model
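A hybrid downlink NOMA [9] is considered. All served users are divided into different groups, and NOMA is implemented among the users in one group. Each group consists of two user equipments (UEs). The kth user's transmitted signal $s_k$ ($k = 1, 2$), chosen from a finite constellation $S_k$ with equal probability, goes through a corresponding virtual channel $w_k = A_k e^{j\theta_k}$, where $A_k$ and $\theta_k$ are the amplitude and the phase of the virtual channel, respectively. We assume the transmission power is $P$ and $E[|s_k|^2] = 1$; then the transmitted signal at the base station (BS) can be expressed as

$$x = \sqrt{P}\,(w_1 s_1 + w_2 s_2) = \sqrt{P}\,\left(A_1 e^{j\theta_1} s_1 + A_2 e^{j\theta_2} s_2\right). \quad (1)$$

The received signal at UEk is

$$y_k = h_k x + z_k = \sqrt{P}\, h_k e^{j\theta_1}\left(A_1 s_1 + A_2 e^{j\Delta\theta} s_2\right) + z_k, \quad (2)$$

where $\Delta\theta = \theta_2 - \theta_1$, $h_k$ is the Rayleigh fading channel for user $k$, and $z_k \sim \mathcal{CN}(0, \sigma_k^2)$ is the additive circularly symmetric complex Gaussian (CSCG) noise with zero mean and variance $\sigma_k^2$. $A_1$ and $A_2$ are subject to the normalized power constraint $A_1^2 + A_2^2 = 1$. Generally, $z_1$ and $z_2$ have the same distribution, so we assume $\sigma_1^2 = \sigma_2^2$. At UEk, we employ the joint ML decoder. The decoded symbols $(\hat{s}_1, \hat{s}_2)$ can be formulated as

$$(\hat{s}_1, \hat{s}_2) = \arg\min_{s_1 \in S_1,\, s_2 \in S_2} \left| y_k - \sqrt{P}\, h_k e^{j\theta_1}\left(A_1 s_1 + A_2 e^{j\Delta\theta} s_2\right) \right|^2.$$

The decoding performance is determined by the squared Euclidean distance (SED) among the points of the superposed constellation at UEk. The SED of two superposed symbols at UEk with the channel gain $h_k$ is given by

$$D^2(x_1, x_2) = P\,|h_k|^2 \left| A_1 \Delta s_1 + A_2 e^{j\Delta\theta} \Delta s_2 \right|^2,$$

where $\Delta s_1 = s_1 - s_1'$ and $\Delta s_2 = s_2 - s_2'$, with $s_1, s_1' \in S_1$, $s_2, s_2' \in S_2$, and $(s_1, s_2) \neq (s_1', s_2')$; $S_1$ and $S_2$ are the constellation sets adopted by UE1 and UE2, respectively. Since $P|h_k|^2$ only scales $D^2(x_1, x_2)$ and can be normalized once $h_k$ is obtained via channel estimation at UEk, we focus only on the equivalent distance

$$D_{\mathrm{eq}}^2(A, \Delta\theta) = \left| A_1 \Delta s_1 + A_2 e^{j\Delta\theta} \Delta s_2 \right|^2.$$

$D_{\mathrm{eq}}^2(A, \Delta\theta)$ relies only on $A$ and $\Delta\theta$ for given constellation sets, regardless of the channel gains and $\theta_1$; hence, we can jointly optimize $A$ and $\Delta\theta$ at the BS by maximizing the minimum SED between $x_1$ and $x_2$. We have the following problem:

$$(A, \Delta\theta)_{\mathrm{opt}} = \arg\max_{A,\, \Delta\theta}\; \min_{(s_1, s_2) \neq (s_1', s_2')} D_{\mathrm{eq}}^2(A, \Delta\theta) \quad \text{s.t.} \quad A_1^2 + A_2^2 = 1,\; 0 \le \Delta\theta \le \theta_{\max},$$

where $\theta_{\max}$ is the maximum value of $\Delta\theta$.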
At high SNR, the decoding error probability is dominated by the value of the MED ([10], Ch. 5). By maximizing the minimum $D_{\mathrm{eq}}^2(A, \Delta\theta)$, the optimal decoding performance can be guaranteed.
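
As a rough numerical check of the max-min criterion above, the following Python sketch evaluates the minimum equivalent squared distance for two superposed QPSK sets and runs a coarse grid search over the amplitudes and the phase offset. The QPSK orientation, the grid resolution, and all function names are assumptions made for illustration; the amplitudes are parameterized by the power split p = A_1^2 so that the constraint A_1^2 + A_2^2 = 1 holds by construction. It is a brute-force sketch and not the closed-form derivation.

```python
import itertools
import numpy as np

# Unit-energy QPSK at 45, 135, 225, 315 degrees (assumed orientation).
QPSK = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))


def superposed_constellation(a1, a2, dtheta):
    """All points a1*s1 + a2*exp(j*dtheta)*s2 for s1, s2 in QPSK."""
    return np.array([a1 * s1 + a2 * np.exp(1j * dtheta) * s2
                     for s1, s2 in itertools.product(QPSK, QPSK)])


def min_squared_distance(a1, a2, dtheta):
    """Minimum squared distance over all distinct symbol pairs."""
    pts = superposed_constellation(a1, a2, dtheta)
    d2 = np.abs(pts[:, None] - pts[None, :]) ** 2
    off_diagonal = ~np.eye(len(pts), dtype=bool)
    return d2[off_diagonal].min()


def grid_search(n_power=51, n_phase=19):
    """Coarse search over p = a1^2 in [0.5, 1] and dtheta in [0, 90] degrees.

    Returns (best minimum squared distance, best p, best dtheta in degrees).
    """
    best = (-1.0, None, None)
    for p in np.linspace(0.5, 1.0, n_power):
        for deg in np.linspace(0.0, 90.0, n_phase):
            d2 = min_squared_distance(np.sqrt(p), np.sqrt(1.0 - p),
                                      np.deg2rad(deg))
            if d2 > best[0]:
                best = (d2, p, deg)
    return best
```

Calling `grid_search()` with a finer grid trades run time for resolution. Because the search is exhaustive over a grid, it can only bracket the maximizer; it does not replace an analytical solution.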

Performance of the Sum Constellation Constrained Capacity. In practical applications, the two users employ finite input constellations, so we resort to the sum constellation constrained capacity (CCC) [13,14] to compute the overall throughput. The CCC formulations for the two users with the ML decoder are described in (9) and (10), respectively, and the overall sum CCC is given by (11), where $d(s_1', s_2', n_1, n_2, m_1, m_2) = s_1'(n_1) + s_2'(n_2) - s_1'(m_1) - s_2'(m_2)$, $s_1' = A_1 s_1$, $s_2' = A_2 s_2 e^{j\Delta\theta}$, $E(\cdot)$ is the expectation of a random variable, and $N_1$ and $N_2$ are the sizes of the two input constellations $S_1$ and $S_2$, respectively. We assume the two users have the same noise distribution and $\sigma_1^2 = \sigma_2^2 = W N_0$, where $W$ is the bandwidth and $N_0$ is the power spectral density of the noise. $C_{\mathrm{sum}}(A_1, A_2, \Delta\theta)$ denotes the overall sum capacity of the two users.
Wireless Communications and Mobile Computing
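
To make the sum CCC more concrete, the sketch below estimates the constellation constrained capacity of the composite 16-point constellation by Monte Carlo averaging over the noise. It uses the textbook CCC expression for a finite, equiprobable constellation in CSCG noise, built from the pairwise differences d, which is assumed here to have the same structure as the sum CCC; it is not a transcription of (9)-(11). The function names, the SNR definition P|h|^2/sigma^2, and the reading of 0.789 as the power fraction A_1^2 are likewise assumptions.

```python
import itertools
import numpy as np

QPSK = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))  # assumed orientation


def superposed_qpsk(p1, dtheta_deg):
    """Composite points with power fractions p1 and 1 - p1 (unit mean power)."""
    a1, a2 = np.sqrt(p1), np.sqrt(1.0 - p1)
    rot = np.exp(1j * np.deg2rad(dtheta_deg))
    return np.array([a1 * s1 + a2 * rot * s2
                     for s1, s2 in itertools.product(QPSK, QPSK)])


def sum_ccc(points, snr_db, n_noise=2000, seed=0):
    """Monte Carlo estimate of the CCC in bits per channel use.

    C = log2(M) - (1/M) * sum_n E_z[ log2 sum_m exp(-(|d_nm + z|^2 - |z|^2) / sigma2) ]
    with d_nm = x_n - x_m and z ~ CN(0, sigma2), sigma2 = 1 / SNR.
    """
    rng = np.random.default_rng(seed)
    sigma2 = 10.0 ** (-snr_db / 10.0)
    z = np.sqrt(sigma2 / 2.0) * (rng.standard_normal(n_noise)
                                 + 1j * rng.standard_normal(n_noise))
    d = points[:, None] - points[None, :]                  # shape (M, M)
    metric = (np.abs(d[:, :, None] + z) ** 2 - np.abs(z) ** 2) / sigma2
    inner = np.log2(np.exp(-metric).sum(axis=1))           # sum over m
    return np.log2(len(points)) - inner.mean()             # mean over n and z
```

For a fading channel, one would scale the SNR by |h_k|^2 for each channel realization and average the result; that outer loop is omitted here to keep the sketch short.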

Design of the Low-Complexity Decoder.
Since the optimal solution is independent of $\theta_1$, we can set $\theta_1 = 0$ to simplify the implementation at the BS side. Under the maximum-MED criterion, ML decoding is expected to recover the two UEs' signals at the receiver. However, the ML decoder suffers from high computational complexity. When our optimal solution $(A, \Delta\theta)_{\mathrm{opt}} = ([0.789, 0.211], 15^\circ)$ is adopted and $h_k$ is obtained via channel estimation [15], the normalized received signal from (2) is given by

$$\tilde{y}_k = \frac{y_k}{\sqrt{P}\, h_k} = A_1 s_1 + A_2 e^{j\Delta\theta} s_2 + \tilde{z}_k, \qquad \tilde{z}_k = \frac{z_k}{\sqrt{P}\, h_k}.$$

For the received superposed constellation with the optimal solution setting, $x = A_1 s_1 + A_2 e^{j\Delta\theta} s_2$ remains unchanged, and all distances between nearest-neighbour point pairs are equal to $D_{\mathrm{opt}}$. So, the search radius for the ML detector can be limited to $D_{\mathrm{opt}}/4$, i.e., $\|\tilde{y}_k - x\|^2 \le D_{\mathrm{opt}}/4$. If a valid point $x$, i.e., a symbol pair $(s_1, s_2)$, cannot be found within this radius, we resort to the exhaustive search, which is the conventional ML. Furthermore, the conventional ML decoder first separates each UE's symbol and then demodulates the symbol to recover the corresponding bits. Unlike the conventional NOMA in [1,3], the order of the two UEs' symbols in the superposed constellation does not vary with the channel in our scheme; it depends only on $A_1$ and $A_2$. The symbol-to-bit mappings are fixed as well; thus, once the symbol pair $(s_1, s_2)$ is found, we can directly output the two UEs' bits and bypass the symbol demapping step. In this way, we can further reduce the complexity. Specifically, we denote our scheme as the reduced-complexity bitwise ML (RBML) decoder. The proposed decoder decreases the computational cost both by reducing the search region and by bypassing the symbol-to-bit mappings.
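
The two ideas in this subsection, a bounded search with a fall-back to exhaustive ML and a fixed point-to-bits lookup, can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: the Gray labels in `BITS`, the QPSK orientation, and the function names are hypothetical, and the radius is taken as one quarter of the minimum squared distance (half the minimum distance), which is one interpretation of the D_opt/4 bound.

```python
import itertools
import numpy as np

QPSK = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))  # assumed orientation
BITS = [(0, 0), (0, 1), (1, 1), (1, 0)]                     # hypothetical Gray labels


def build_table(a1, a2, dtheta):
    """Fixed lookup: superposed point -> (UE1 bits + UE2 bits)."""
    points, labels = [], []
    indexed = list(enumerate(QPSK))
    for (i1, s1), (i2, s2) in itertools.product(indexed, indexed):
        points.append(a1 * s1 + a2 * np.exp(1j * dtheta) * s2)
        labels.append(BITS[i1] + BITS[i2])
    return np.array(points), labels


def min_sq_distance(points):
    """Minimum squared distance between distinct constellation points."""
    d2 = np.abs(points[:, None] - points[None, :]) ** 2
    return d2[~np.eye(len(points), dtype=bool)].min()


def rbml_decode(y_norm, points, labels, radius_sq):
    """Return (bits, hit). hit is True when the bounded search succeeded.

    The scan stops at the first point within the radius; otherwise the
    best point seen during the full scan (plain ML) is returned.
    """
    best_idx, best_d2 = 0, np.inf
    for idx, x in enumerate(points):
        d2 = abs(y_norm - x) ** 2
        if d2 <= radius_sq:
            return labels[idx], True      # early exit, bits read from the table
        if d2 < best_d2:
            best_idx, best_d2 = idx, d2
    return labels[best_idx], False        # exhaustive ML fall-back
```

When the radius does not exceed half the minimum distance, any point found inside it is also the nearest point, so the early exit returns the same decision as exhaustive ML.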

Experiment Setup and Simulation Results
In order to evaluate the performance of our proposed scheme in the absence of CSIT, we compare the sum CCC of our proposed method according to (11) with that of the power-balanced NOMA in [8] and that of TDMA. $h_1$ and $h_2$ are randomly generated Rayleigh fading channels. We adopt the 16-point Gauss-Hermite quadrature to approximate the expectation in (9) and (10). The values of the ergodic sum CCC are shown in Figure 2. All the numerical results are obtained by averaging over 500 channel realizations. It can be seen that our VNOMA scheme presents a capacity improvement compared with the power-balanced NOMA (termed NOMA-PB) and TDMA. Furthermore, we compare our scheme with the multidimensional constellations for orthogonal transmission in [16]. As we can see from Figure 2, our proposed scheme presents the same performance as the 16-2D constellation. However, a closed-form solution is derived for our scheme, which avoids the time-consuming iterative optimization in [16].
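
For readers unfamiliar with the quadrature step, the following Python sketch shows one common way to approximate an expectation over CSCG noise with a 16-point Gauss-Hermite rule applied separately to the real and imaginary parts. The function name `expect_cscg` and the tensor-product construction are assumptions; how the rule is applied to (9) and (10) is not spelled out here.

```python
import numpy as np


def expect_cscg(f, sigma2, order=16):
    """Approximate E[f(z)] for z ~ CN(0, sigma2).

    With z = sigma * (t + j*s), the density becomes exp(-t^2 - s^2) / pi,
    which matches the Gauss-Hermite weight function in each dimension.
    f must accept a complex NumPy array and return an array of equal shape.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(order)
    t, s = np.meshgrid(nodes, nodes, indexing="ij")
    w2d = np.outer(weights, weights) / np.pi
    return np.sum(w2d * f(np.sqrt(sigma2) * (t + 1j * s)))
```

As a sanity check, `expect_cscg(lambda z: np.abs(z) ** 2, sigma2)` should reproduce the noise variance, since the rule is exact for low-degree polynomials in the real and imaginary parts.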
In Figure 3, we take NCMA and TDMA as our benchmarks as well, since NCMA is also proposed for addressing the scenario without instantaneous CSIT [6]. To compare with NCMA, we employ a $[133, 171]_8$ convolutional encoder with code rate $r = 1/2$ for the two UEs before superposition coding. The signal-to-noise ratio is defined as $\mathrm{SNR} = P/(W N_0)$, assuming $W = 1$ Hz. The power for each user in TDMA is 2P, which is equal to the total power in VNOMA and NCMA. The normalized decoding packet throughputs and the normalized decoding time for NCMA, TDMA, and our proposed scheme are shown in Figures 3 and 4, respectively. In Figure 3, both the ML decoder and the RBML decoder for VNOMA are demonstrated. As can be seen from Figure 3, VNOMA outperforms the NCMA scheme, since our scheme takes advantage of maximizing the MED of the superposed symbols. ML and RBML have the same performance, while RBML shows much lower decoding complexity, as shown in Figure 4. Both NCMA and VNOMA can reach 2 packets per channel and are superior to TDMA even without CSIT. In Figure 4, the normalized decoding time ($T_X/T_{\mathrm{TDMA}}$) for NCMA and for VNOMA with the ML decoder (VNOMA-ML) as well as the RBML decoder (VNOMA-RBML) is presented at 10 dB, 15 dB, and 20 dB in terms of packet number, where $T_X$ and $T_{\mathrm{TDMA}}$ represent the processing time (consisting of NOMA decoding and channel decoding) to decode one packet for $X$ ($X \in \{\mathrm{NCMA}, \mathrm{ML}, \mathrm{RBML}\}$) and TDMA, respectively. Our proposed VNOMA-RBML has the lowest complexity, while NCMA shows the highest computational complexity. This is because NCMA adopts the maximum a posteriori (MAP) algorithm, which needs to calculate the symbols' marginal likelihoods in order to decode the two UEs' signals. The average cost reduction ratio (ACRR) of VNOMA-RBML versus NCMA is $\eta = \frac{1}{N} \sum_{i=1}^{N} \left(T^i_{\mathrm{NCMA}} - T^i_{\mathrm{RBML}}\right) / T^i_{\mathrm{NCMA}} = 0.926$, where $N = 500$ is the number of packets. Meanwhile, the ACRR of RBML versus ML is $\eta = 0.892$.
Our scheme therefore presents much lower decoding complexity without compromising the performance. Since our decoder only reduces the decoding complexity while retaining the performance of the classical ML decoder, it achieves the same throughput as the ML decoder, as shown in Figure 3.
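
The ACRR defined above is a per-packet average of relative savings, which can be computed as in the short Python sketch below. The function name `acrr` and the toy timing values in the usage note are hypothetical and are not measurements from the experiments.

```python
import numpy as np


def acrr(t_reference, t_proposed):
    """Average cost reduction ratio: mean of (T_ref - T_new) / T_ref per packet."""
    t_reference = np.asarray(t_reference, dtype=float)
    t_proposed = np.asarray(t_proposed, dtype=float)
    return float(np.mean((t_reference - t_proposed) / t_reference))
```

For instance, `acrr([10.0, 20.0], [1.0, 2.0])` averages two per-packet savings of 90 percent each. Note that averaging the ratios packet by packet generally differs from taking the ratio of the total times.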

Conclusion
We propose a low-complexity VNOMA scheme to address the scenarios where CSIT is not available or not applicable. An optimal solution setting is derived to improve the performance without the knowledge of CSIT. An RBML detector is proposed to significantly decrease the decoding cost. Numerical simulations demonstrate that our proposed scheme can achieve better performance than the NCMA scheme. Though only QPSK constellation sets are exemplified, our scheme is also suitable for other constellation pairs such as (4, 16 The number of all possible Euclidean distance values is $MN \times (MN - 1)$. It is challenging to obtain the analytical solution for so many values. Fortunately, optimization algorithms such as particle swarm optimization (PSO) [17] can be exploited to search for the optimal solution without CSIT before transmission. Our proposed VNOMA can enable the NOMA technique to be deployed in UAVC and HSRC scenarios, where CSI might be challenging to obtain, as well as in IoT scenarios, where the devices' computing capacity is limited and instantaneous power allocation is not applicable.
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.