Research and Implementation of Rateless Spinal Codes Based Massive MIMO System

The potential performance gains promised by massive multiple-input multiple-output (MIMO) rely heavily on access to accurate channel state information (CSI), which is difficult to obtain in practice when the channel coherence time is short and the number of mobile users is large. To make a system with imperfect CSI perform well, we propose a rateless-codes-aided massive MIMO scheme that aims to approach the maximum achievable rate (MAR) and to improve the achieved rate over that of fixed-rate codes. More explicitly, a recently proposed family of rateless codes, called spinal codes, is applied to massive MIMO systems, where spinal codes approximately achieve the MAR when the encoding block size is sufficiently large. In addition, a multilevel puncturing and dynamic block-size allocation (MPDBA) scheme is proposed, in which the block sizes are determined by each user's MAR to curb the average retransmission delay for successfully decoding the messages, which further enhances the system retransmission efficiency. Multilevel puncturing, which is MAR dependent, narrows the gap between the system MAR and the achieved rate. Theoretical analysis demonstrates that spinal codes with MPDBA can guarantee both the system retransmission efficiency and the achieved rate, and this is verified by numerical simulations. Finally, a simplified but comparable MIMO testbed with 2 transmit antennas and 2 single-antenna users, based on the NI Universal Software Radio Peripheral (USRP) and LabVIEW communication toolkits, is built to demonstrate the effectiveness of our proposal in realistic wireless channels; the testbed can readily be extended to massive MIMO scenarios in the future.


Introduction
Massive multiple-input multiple-output (MIMO), which achieves high spectral efficiency and low power consumption, has been widely regarded as a promising technique for 5G wireless communication systems. However, its benefits rely heavily on the accuracy of the channel state information (CSI) used to perform multiuser precoding. Unfortunately, collecting accurate CSI is costly because of the short channel coherence time and the large number of mobile users. In fact, pilot contamination is a major cause of imperfect CSI, and its effect is more profound in massive MIMO than in classical MIMO [1,2]. Therefore, a critical question is how to improve the throughput of massive MIMO systems under imperfect CSI. Traditional fixed-rate codes, such as LDPC [3] and Turbo codes [4], may suffer significant throughput loss from rate mismatch under inaccurate CSI. Developing a resilient transmission scheme for massive MIMO in the presence of imperfect CSI is therefore of great importance.
In fact, for multiuser cellular systems in which the base station is not equipped with a large number of antennas, rateless codes have been shown to perform well under inaccurate CSI [5,6]. Therefore, a rateless-codes-based transmission scheme for massive MIMO is considered in this paper. When CSI is not available at the transmitter, rateless codes with adaptive code rates perform well; related work on rateless codes, such as spinal codes, strider codes, and raptor codes, has been presented in [7][8][9]. In [7], the authors showed that spinal codes outperform LDPC codes as well as strider and raptor codes in fading channels with inaccurate CSI. Therefore, spinal codes are integrated into the resilient transmission scheme: spinal encoders encode the users' original messages into multiple streams of symbols of unbounded length, which are transmitted continuously over a number of (re)transmissions, called passes, until the senders receive an acknowledgment (ACK) from the receivers indicating successful decoding. Thus, the system benefits from multi-retransmission diversity and achieves robust performance.
Once the encoding block sizes for spinal codes are allowed to be sufficiently large, users gradually approach their maximum achievable rate (MAR) with a maximum-likelihood (ML) decoding algorithm [10].
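The rateless behaviour described above can be illustrated with a minimal Python sketch. It is not the encoder of [7]: SHA-256 stands in for the hash function, Python's `random.Random` stands in for the RNG and mapper, and the names `spine_values`, `encode_pass`, `transmit_until_ack`, and the `decoded_ok` callback are hypothetical. It only shows the structure: a hash chained over fixed-size blocks yields spine values, each spine value seeds an RNG, and every pass draws fresh symbols until an ACK arrives.

```python
import hashlib
import random


def spine_values(message_bits: str, k: int, seed: int = 0) -> list[int]:
    """Chain a hash over k-bit blocks: each spine value depends on all earlier blocks."""
    if k <= 0 or len(message_bits) % k != 0:
        raise ValueError("message length must be a positive multiple of k")
    spine, state = [], seed
    for start in range(0, len(message_bits), k):
        block = message_bits[start:start + k]
        digest = hashlib.sha256(f"{state}:{block}".encode()).digest()
        state = int.from_bytes(digest[:8], "big")  # 64-bit spine value (assumed width)
        spine.append(state)
    return spine


def encode_pass(spine: list[int], pass_index: int, bits_per_symbol: int = 6) -> list[int]:
    """Return one constellation index per spine value for the given pass (0-based)."""
    symbols = []
    for value in spine:
        rng = random.Random(value)          # the spine value seeds the RNG
        for _ in range(pass_index):         # skip outputs consumed by earlier passes
            rng.getrandbits(bits_per_symbol)
        symbols.append(rng.getrandbits(bits_per_symbol))
    return symbols


def transmit_until_ack(message_bits: str, k: int, decoded_ok, max_passes: int = 50) -> int:
    """Send passes until decoded_ok(received_passes) is True; return the pass count."""
    spine = spine_values(message_bits, k)
    received = []
    for pass_index in range(max_passes):
        received.append(encode_pass(spine, pass_index))
        if decoded_ok(received):            # stands in for the receiver's ACK
            return pass_index + 1
    raise RuntimeError("no ACK within max_passes")
```

For example, `transmit_until_ack("10110010", 4, lambda rx: len(rx) >= 3)` models a receiver that needs three passes before it acknowledges. A real decoder would replace the callback.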
For classical MIMO, nonlinear precoding techniques, such as dirty-paper coding (DPC) [11], vector perturbation (VP) [12], and lattice-aided precoding [13], offer better performance. However, as the number of base-station antennas grows in massive MIMO, linear precoders, such as zero forcing (ZF) [14], suffer only a limited throughput loss compared with nonlinear precoders [15]. ZF has low complexity and is more practical for massive MIMO. Therefore, ZF is adopted in our work to mitigate multiuser interference.
However, to guarantee the system retransmission efficiency (measured in bits per symbol per second), reducing the retransmission delay, which is equivalent to reducing the pass number, should be considered before spinal codes are integrated into massive MIMO. Reference [16] studied this problem at the link layer; however, the user-scheduling scheme involved at the link layer brings extra complexity. To that end, an easily implemented physical-layer scheme is proposed in this paper. From [10], it is known that the pass number is proportional to the block size and inversely proportional to the maximum achievable rate (MAR). Therefore, if we initialize the encoders for all users with a static block size large enough to approach the MAR, the users with lower MARs, located at the cell edge or trapped in a deep fade, will significantly enlarge the average pass number for decoding, which degrades the system retransmission efficiency. For this reason, we develop a multilevel puncturing and dynamic block-size allocation (MPDBA) scheme, in which the MARs are obtained as a priori knowledge to determine the block sizes dynamically, reducing the pass number and hence the retransmission delay. A different puncturing level is applied for different MARs, which is proved to guarantee the system retransmission efficiency and achieved rate. The numerical simulation results show that spinal codes with MPDBA make massive MIMO with imperfect CSI work reliably with efficient retransmission.
Moreover, to move spinal-codes-based massive MIMO from theory toward practice, we also build a system with 2 transmit antennas and 2 single-antenna users, where the NI USRP and LabVIEW communication toolkits serve as the hardware and software platforms, respectively. In this demo system, the retransmission efficiency of our proposal is verified in fading environments. The demo can also be extended to massive MIMO scenarios by scaling up the hardware.
The rest of this paper is organized as follows. Section 2 presents the system model. The efficient transmission scheme for the system with spinal codes is proposed in Section 3. Section 4 shows simulation results that illustrate the benefits of the proposed scheme, and Section 5 presents details of the USRP-based spinal codes MIMO demo system. Finally, conclusions are drawn in Section 6.

System Model
Throughout this paper, uppercase boldface letters are used to denote matrices. $(\cdot)^H$ and $(\cdot)^T$ represent the Hermitian transpose and transpose, respectively. $\mathbb{E}[\cdot]$ denotes the expectation operator. $\mathrm{Tr}(\cdot)$ stands for the trace, $[\cdot]$ is the round-to-integer operator, and $\lceil\cdot\rceil$ is the round-up-to-integer operator.
A $K \times M$ downlink massive MIMO system is shown in Figure 1, where the single-cell system has a base station with $M$ antennas and $K$ single-antenna users, $M \gg K$. For user $j$, $j = 1, 2, \ldots, K$, $m_j$ denotes a message to be delivered. The spinal encoder [7], illustrated in Figure 2, divides $m_j$, denoted as message $i$, $i = 1, 2, \ldots$, into blocks of $k_j(i)$ bits each; a hash function uses each block to generate a sequence of spine values; the modulated symbols are then produced by a mapper and a pseudorandom number generator (RNG) function, which uses the spine values as its seeds. After encoding, ZF maps the modulated symbols to the precoded symbols, which are transmitted over fading channels. If decoding succeeds, the receiver sends an ACK to the corresponding sender through the feedback channel; otherwise, the sender generates more redundant symbols to transmit in the following passes until it receives the ACK. Let $L_j(i) \geq 1$ be the pass number needed for user $j$ to successfully decode message $i$. At the $l$th pass, for $l = 1, 2, \ldots, L_j(i)$, all users' modulated symbols are mapped to a $K \times T$ matrix of symbols $\mathbf{S}_l = (\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_T)$. Each column $\mathbf{s}_t = (s_{1,t}, s_{2,t}, \ldots, s_{K,t})^T$, for $t = 1, 2, \ldots, T$, follows a Gaussian distribution, and $\mathrm{Tr}(\mathbf{S}_l \mathbf{S}_l^H) = P$, where $P$ is the total transmit power.

Assuming that CSI is prior knowledge obtained by channel estimation at the transmitter, and that the downlink MIMO channel remains ergodic and stationary during $T$ symbol periods, $\mathbf{S}_l$ is then precoded into the transmit signal $\mathbf{X}_l \in \mathbb{C}^{M \times T}$ by

$$\mathbf{X}_l = \beta_l \mathbf{W}_l \mathbf{S}_l, \tag{1}$$

where $\mathbf{W}_l = \hat{\mathbf{H}}^H (\hat{\mathbf{H}} \hat{\mathbf{H}}^H)^{-1}$ denotes the ZF precoding matrix for the $l$th pass, $\hat{\mathbf{H}} \in \mathbb{C}^{K \times M}$ denotes the estimated channel matrix, and $\beta_l = \sqrt{P / \mathrm{Tr}(\mathbf{W}_l \mathbf{W}_l^H)}$ is the power control factor used to normalize $\mathbf{W}_l$ such that $\mathrm{Tr}(\mathbf{X}_l \mathbf{X}_l^H) = P$. $\mathbf{X}_l$ is then transmitted through a MIMO fading channel, and the received signal matrix $\mathbf{Y}_l \in \mathbb{C}^{K \times T}$ is given by

$$\mathbf{Y}_l = \mathbf{H}_l \mathbf{X}_l + \mathbf{N}_l, \tag{2}$$

where $\mathbf{H}_l \in \mathbb{C}^{K \times M}$ denotes the MIMO channel matrix, which combines the complex small-scale fading coefficient matrix $\mathbf{G}_l \in \mathbb{C}^{K \times M}$ and the large-scale fading coefficient matrix $\mathbf{D}_l \in \mathbb{R}^{K \times K}$. $\mathbf{G}_l$ is assumed to have independent and identically distributed (i.i.d.) complex Gaussian entries with zero mean and unit variance, and the large-scale fading coefficients, denoted as $\mathbf{D}_l = \mathrm{diag}(d_1, d_2, \ldots, d_K)$, are the same for different antennas but user dependent. $\mathbf{N}_l \in \mathbb{C}^{K \times T}$ denotes the noise matrix with i.i.d. complex Gaussian entries with zero mean and unit variance, so the average transmit SNR is determined by the total transmit power $P$.
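The ZF precoding step can be checked numerically with a short NumPy sketch. It assumes a 4-user, 64-antenna system (the size used in the simulation section), a randomly drawn channel in place of a measured one, and perfect CSI, so the estimate equals the true channel. The function name `zf_precode` and the value `P=10.0` are illustrative choices, not taken from the paper.

```python
import numpy as np


def zf_precode(H_hat, S, P):
    """Zero-forcing precoding: X = beta * W @ S with W = H^H (H H^H)^-1."""
    W = H_hat.conj().T @ np.linalg.inv(H_hat @ H_hat.conj().T)
    beta = np.sqrt(P / np.trace(W @ W.conj().T).real)  # power control factor
    return beta * W @ S, W, beta


rng = np.random.default_rng(0)
K, M, T = 4, 64, 16  # users, base-station antennas, symbol periods (assumed sizes)

# i.i.d. complex Gaussian channel and symbols with unit variance per entry.
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
S = (rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))) / np.sqrt(2)

X, W, beta = zf_precode(H, S, P=10.0)

# With perfect CSI the effective channel H @ W is the identity, so each user
# receives only its own scaled symbols (noise omitted here).
Y_noiseless = H @ X
```

Because `H @ W` is the identity, `Y_noiseless` equals `beta * S`: multiuser interference is removed entirely, at the cost of the scaling factor `beta`.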
If perfect CSI is available at the transmitter, then $\hat{\mathbf{H}} = \mathbf{H}_l$. However, imperfect CSI arises in any practical estimation scheme. According to the channel estimation model described in [14], $\hat{\mathbf{H}}$ is given by (3), where $\Delta\mathbf{H}_l \in \mathbb{C}^{K \times M}$ denotes the channel estimation error matrix, whose entries are i.i.d. complex Gaussian random variables with zero mean and unit variance, uncorrelated with those of $\mathbf{H}_l$, and $\tau$ is a nonzero parameter that measures the quality of channel estimation and is chosen appropriately depending on the channel dynamics. Since the focus of this paper is the performance of spinal codes, $\tau$ is assumed to be known at the transmitter. Based on (3), the ZF precoding matrix is expressed as (4). Substituting (4) into (2), the $t$th symbol vector at the $l$th pass is given by (5), where $\tau\,\Delta\mathbf{H}_l \hat{\mathbf{H}}^H (\hat{\mathbf{H}} \hat{\mathbf{H}}^H)^{-1} \mathbf{s}_t$ is the additional interference term introduced by the channel estimation error.
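The effect of the estimation error on ZF precoding can be explored with the following sketch. It assumes a simple additive error model, in which the estimate equals the true channel plus the error matrix scaled by the quality parameter; this is an illustrative assumption and may differ from the exact model of [14]. The names `zf_matrix`, `effective_channel`, and `leakage_power` are hypothetical.

```python
import numpy as np


def zf_matrix(H_hat):
    """ZF precoding matrix computed from the (possibly imperfect) estimate."""
    return H_hat.conj().T @ np.linalg.inv(H_hat @ H_hat.conj().T)


def effective_channel(H, dH, tau):
    """True channel times the precoder built from the assumed estimate H + tau * dH."""
    H_hat = H + tau * dH  # assumed additive estimation-error model
    return H @ zf_matrix(H_hat)


def leakage_power(H, dH, tau):
    """Total off-diagonal power of the effective channel (multiuser interference)."""
    E = effective_channel(H, dH, tau)
    off_diagonal = E - np.diag(np.diag(E))
    return float(np.sum(np.abs(off_diagonal) ** 2))


rng = np.random.default_rng(0)
K, M = 4, 64
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
dH = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

leak_perfect = leakage_power(H, dH, tau=0.0)
leak_imperfect = leakage_power(H, dH, tau=0.1)
```

Under this assumed model the effective channel equals the identity minus `tau * dH @ zf_matrix(H + tau * dH)`, so the residual interference vanishes when `tau` is zero and grows with the estimation error.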
Once $l = L_j(i)$, the spinal decoder for user $j$ has collected enough mutual information to retrieve message $i$ successfully [7]. Since each pass carries one symbol per $k_j(i)$-bit block, the user's current data rate (measured in bits per symbol) is then

$$R_j(i) = \frac{k_j(i)}{L_j(i)}. \tag{6}$$

Based on (6), the average achieved rate of spinal-codes-based massive MIMO is given by (7). From (5), the average signal power of the desired symbols $s_{j,t}$ in $\mathbf{S}_l$ is given by (8). For the received noise, the covariance can be shown to be as in (9), where $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_K)$ contains the singular values of $\hat{\mathbf{H}} \hat{\mathbf{H}}^H$ and is approximated by (10).

Proof. Assuming that $M \gg K$ in a massive MIMO system, let the singular value decomposition (SVD) of the matrix $\hat{\mathbf{H}} \hat{\mathbf{H}}^H$ be $\hat{\mathbf{H}} \hat{\mathbf{H}}^H = \mathbf{U} \boldsymbol{\Lambda} \mathbf{V}^H$, where $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices and $\boldsymbol{\Lambda}$ is a diagonal matrix, from which the approximation in (10) follows.

Based on (8) and (9), the average MAR for user $j$, denoted $R_j^{\text{upper}}$, is given by (11). The upper bound of the ergodic achievable rate for massive MIMO with ZF under imperfect CSI, denoted $R^{\text{upper}}$, follows accordingly.

Proof of Lemma 1. From [10], $L_j(i)$ can be expressed as (12). Substituting (12) into (6) rewrites the corresponding data rate in terms of $k_j(i)$ and $R_j^{\text{upper}}$. This lemma indicates that when the block size $k_j(i) = k_{\max}$, where $k_{\max}$ denotes the sufficiently large block size in this paper, the gap between $R_j(i)$ and $R_j^{\text{upper}}$ can be made arbitrarily small, such that each user approaches $R_j^{\text{upper}}$ with $R_j(i)$, and spinal-codes-based massive MIMO approaches $R^{\text{upper}}$ as well.
When small large-scale fading coefficients $d_j$ deteriorate $R_j^{\text{upper}}$ in (11), then, based on (12), setting $k_j(i) = k_{\max}$ enlarges $L_j(i)$ significantly. If each pass occupies a constant time (measured in seconds), a larger $L_j(i)$ makes the average retransmission delay spent on decoding message $i$ costly, and the average system retransmission efficiency, denoted $\eta$, is limited as well. We use $\eta_j(i) = R_j(i)/L_j(i)$ to denote the current retransmission efficiency for each user. A greedy approach is an effective way to achieve a good system retransmission efficiency across different SNRs. That is, we try to make $\eta_j(i)$ as high as possible for each message $i$, which in turn yields a satisfactory value of $\eta$. When $R_j(i)$ is ideal, decreasing $L_j(i)$ is an effective way to enhance $\eta_j(i)$. Therefore, based on (12), $k_j(i)$ can be determined dynamically from $R_j^{\text{upper}}$ to reduce $L_j(i)$.
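The trade-off between approaching the MAR and keeping the pass number small can be made concrete with a toy model. The sketch below assumes that the pass number grows like the block size divided by the MAR plus a constant overhead. This is a simplified stand-in for the expression from [10], and the overhead of one pass is an arbitrary illustrative value. The achieved rate is taken as block size over pass number, and the per-message retransmission efficiency as achieved rate over pass number, as in the text.

```python
import math


def pass_number(block_size: int, mar: float, overhead: int = 1) -> int:
    """Assumed model: passes needed ~ ceil(block_size / MAR) + constant overhead."""
    return math.ceil(block_size / mar) + overhead


def achieved_rate(block_size: int, mar: float, overhead: int = 1) -> float:
    """Bits per symbol: one symbol per block per pass, block_size bits per block."""
    return block_size / pass_number(block_size, mar, overhead)


def retransmission_efficiency(block_size: int, mar: float, overhead: int = 1) -> float:
    """Per-message efficiency: achieved rate divided by the pass number."""
    return achieved_rate(block_size, mar, overhead) / pass_number(block_size, mar, overhead)


if __name__ == "__main__":
    mar = 2.0  # bits per symbol, illustrative value
    for block_size in (4, 20, 200):
        print(
            block_size,
            pass_number(block_size, mar),
            round(achieved_rate(block_size, mar), 3),
            round(retransmission_efficiency(block_size, mar), 3),
        )
```

In this model, enlarging the block size pushes the achieved rate toward the MAR while the efficiency drops, and a user with a low MAR needs many more passes for the same block size. Both effects motivate choosing the block size per user.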

Multilevel Puncturing and Dynamic Block-Size Allocation Scheme
In this section, a multilevel puncturing and dynamic block-size allocation scheme is proposed. For each user, the dynamic block-size allocation problem is formulated as (17), where $k_j^*(i)$ is the desired block size for message $i$. However, we cannot obtain the solution of (17) by solving (12) directly, because (12) includes an unknown constant term $O(1)$. Instead, we use the cumulative distribution function (CDF) of the pass numbers $L_j(i)$ to determine $k_j^*(i)$. Figure 3 presents the CDF of the pass numbers for different block sizes and SNRs, from which we observe that, under different SNRs, smaller block sizes allow the system to decode successfully with a smaller pass number and with high probability. Therefore, we choose $k_j^*(i)$ accordingly for each spinal encoder when encoding message $i$. When $k_j(i) = k_j^*(i)$, the gap between the MAR and the achieved rate follows from the proof of Lemma 1,

which indicates that the dynamic block-size scheme causes an unavoidable achieved-rate loss at higher $R_j^{\text{upper}}$. To reduce this loss as much as possible, a multilevel puncturing method is implemented for spinal codes.
Assume that message $i$ with $n$ bits is encoded by the spinal encoder with the desired block size $k_j(i) = k_j^*(i)$, yielding $n/k_j(i)$ coded symbols per pass. If the $n/k_j(i)$ symbols transmitted in the first pass, that is, $l = 1$, cannot be used to retrieve the message successfully, then, from the second pass onward, that is, $l > 1$, each group of $n/k_j(i)$ symbols is transmitted within $l_{\text{sub},j}(i)$ subpasses, with $n/(k_j(i)\, l_{\text{sub},j}(i))$ symbols transmitted in each subpass.
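When decoding succeeds after the first pass and $L_j(i) - 1$ subpasses, the $n$ bits are retrieved from $n/k_j(i) + (L_j(i) - 1)\,\bigl(n/(k_j(i)\, l_{\text{sub},j}(i))\bigr)$ symbols, and the user's current rate in (6) can be rewritten as

$$R_j(i) = \frac{n}{\dfrac{n}{k_j(i)} + \bigl(L_j(i) - 1\bigr)\dfrac{n}{k_j(i)\, l_{\text{sub},j}(i)}} = \frac{k_j(i)\, l_{\text{sub},j}(i)}{l_{\text{sub},j}(i) + L_j(i) - 1},$$

where $l_{\text{sub},j}(i)$ is a positive integer determined according to the practical case. Based on (27) and (28), there exist constants such that the minimum achieved data rate and the maximum pass number can be bounded.

This theorem indicates that, for the transmission scheme in massive MIMO with MPDBA-based spinal codes, a proper $l_{\text{sub},j}(i)$ can be chosen for each $R_j^{\text{upper}}$ (i.e., a lower $l_{\text{sub},j}(i)$ for a lower $R_j^{\text{upper}}$ and a higher $l_{\text{sub},j}(i)$ for a higher $R_j^{\text{upper}}$) such that the gap between the achieved rate and the MAR is limited and good retransmission efficiency is guaranteed under different $l_{\text{sub},j}(i)$.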
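The symbol count under puncturing can be verified with a few lines of Python. The sketch below follows the counting described above: one full pass, then further transmissions that each carry a fraction of a pass. The function names are hypothetical, the argument `transmissions` counts the first full pass plus all subpasses, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def punctured_symbol_count(n: int, k: int, l_sub: int, transmissions: int) -> int:
    """Symbols sent after one full pass plus (transmissions - 1) subpasses."""
    if transmissions < 1:
        raise ValueError("at least the first full pass must be sent")
    if n % (k * l_sub) != 0:
        raise ValueError("n must be divisible by k * l_sub in this sketch")
    full_pass = n // k             # symbols in the first, unpunctured pass
    subpass = full_pass // l_sub   # symbols in each later subpass
    return full_pass + (transmissions - 1) * subpass


def punctured_rate(n: int, k: int, l_sub: int, transmissions: int) -> Fraction:
    """Achieved rate in bits per symbol when decoding succeeds at this point."""
    return Fraction(n, punctured_symbol_count(n, k, l_sub, transmissions))


if __name__ == "__main__":
    n, k = 240, 4
    for l_sub in (1, 2, 4):
        print(l_sub, punctured_rate(n, k, l_sub, transmissions=3))
```

With `l_sub = 1` the result reduces to the unpunctured rate (block size over pass number). Larger values of `l_sub` give finer rate steps, which is why the achieved rate can track a high MAR more closely.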

Simulation Results and Analysis
In this section, numerical simulation results are presented to verify the performance of the MPDBA-based spinal codes scheme for a 4 × 64 MIMO system with $k_{\max} = 20$.
Figure 4 compares the average achieved rates of massive MIMO based on spinal codes and on LDPC codes. The LDPC codes are taken from 802.16e and decoded with a powerful decoder. We observe that spinal codes outperform LDPC codes across all SNRs. Moreover, the simple decoder structure of spinal codes avoids the demapping complexity caused by the different constellation mappings used with LDPC codes.
In Figure 5, we compare the average achieved rates of spinal-codes-based massive MIMO with $k_j(i) = k_{\max}$ and $k_j(i) = k_j^*(i)$, respectively. Massive MIMO approaches its MAR gradually as the block size is enlarged, which verifies the conclusion of Lemma 1. We also observe that the achieved-rate performance of spinal codes with $k_j(i) = k_j^*(i)$ follows the conclusions of Theorem 2; that is, a larger $R_j^{\text{upper}}$ causes a bigger achieved-rate loss, especially for lower $l_{\text{sub},j}(i)$.
Figure 6 shows that enlarging $l_{\text{sub},j}(i)$ reduces the gap between the MARs and the rates achieved by spinal codes with MPDBA. For a lower $R_j^{\text{upper}}$, a smaller $l_{\text{sub},j}(i)$ already achieves a satisfactory rate. For a higher $R_j^{\text{upper}}$, a larger $l_{\text{sub},j}(i)$ is necessary for spinal codes to approach the MAR.
Figure 7 shows that applying MPDBA to spinal encoders enhances the system retransmission efficiency. Several suitable values of $l_{\text{sub},j}(i)$ achieve good retransmission efficiency for MPDBA-based spinal codes, which also indicates that, once the system achieves better retransmission efficiency, more suitable values of $l_{\text{sub},j}(i)$ are available to obtain the better achieved-rate performance illustrated in Figure 6. The numerical results in Figures 6 and 7 verify Theorem 2; that is, spinal codes with MPDBA can guarantee the system reliability as well as the retransmission efficiency.

Simplified Spinal Codes Based MIMO Demo System with NI USRP 2920
This section presents details on moving spinal-codes-based massive MIMO from theory toward practice. Work on implementing a massive MIMO prototyping system has been presented in [17], where large-scale antenna arrays are used to verify the significant gains in data rates. A rapid prototyping system architecture is proposed in [18], where the proposed system is highly scalable in the number of antennas. To verify the performance of spinal-codes-based MIMO with imperfect CSI, equipping large-scale antenna arrays is unnecessary and also too expensive. Therefore, we first consider a system with 2 transmit antennas and 2 single-antenna users as a simplified and comparable case of massive MIMO. We use the NI Universal Software Radio Peripheral (USRP; the USRP 2920 is the basic module) and LabVIEW communication toolkits to build the testbed, which is used to verify the benefits presented in our theoretical analysis. Owing to the decoupled design of software and hardware, the simplified MIMO system can be deployed easily and expanded to a massive MIMO system with few extra operations.

Simplified MIMO System Architecture.
The MIMO system architecture is shown in Figure 8. In our demo system, dedicated MIMO cables are used to synchronize the clock sources of the 2 transmitters and of the 2 receivers, respectively [19]. The radios are then connected to one host computer through an Ethernet switch. Because a single host computer is used, the CSI can be obtained perfectly from the uplink to perform precoding, and additional CSI errors can also be introduced deliberately.
We consider a time-division duplexing (TDD) system, where the two users send orthogonal pilots to the base station in the first time slot, and the pilots received at the base station are used to estimate the uplink channel matrix. The base station then calculates the precoding matrix from the estimated CSI (in general imperfect, as discussed above) and precodes the symbols produced by the parallel spinal encoders, as Figure 1 shows. The precoded symbols are transmitted through the downlink channel to the users in the second time slot. Each user tries to retrieve the original message from the received symbols and sends an ACK to the base station if the message is decoded successfully. Figure 9 illustrates the user interface on the host computer. It has three main parts: parameter configuration (on the left), retrieved-message display (in the middle), and numerical results display (on the right).
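The first TDD slot can be sketched in NumPy as follows. The pilot matrix, the noise level, and the function names are illustrative assumptions; the testbed itself runs in LabVIEW, and its exact estimator is not described here. The sketch uses a least-squares estimate from orthogonal pilots, then relies on ideal reciprocity (no hardware mismatch) to build the downlink ZF precoder.

```python
import numpy as np


def ls_channel_estimate(received_pilots, pilots):
    """Least-squares estimate of H from received_pilots = H @ pilots + noise."""
    gram = pilots @ pilots.conj().T
    return received_pilots @ pilots.conj().T @ np.linalg.inv(gram)


def zf_precoder(H_dl):
    """ZF precoder for a downlink channel with one row per user."""
    return H_dl.conj().T @ np.linalg.inv(H_dl @ H_dl.conj().T)


rng = np.random.default_rng(1)
num_bs_antennas, num_users = 2, 2

# One pilot sequence per user (rows); the rows are mutually orthogonal.
pilots = np.array([[1, 1], [1, -1]], dtype=complex)

# Uplink channel: rows are base-station antennas, columns are users.
H_ul = (rng.standard_normal((num_bs_antennas, num_users))
        + 1j * rng.standard_normal((num_bs_antennas, num_users))) / np.sqrt(2)
noise = 0.01 * (rng.standard_normal((num_bs_antennas, 2))
                + 1j * rng.standard_normal((num_bs_antennas, 2)))

Y_ul = H_ul @ pilots + noise                 # pilots received in the first slot
H_ul_hat = ls_channel_estimate(Y_ul, pilots)

H_dl_hat = H_ul_hat.T                        # ideal reciprocity assumption
W = zf_precoder(H_dl_hat)                    # used to precode in the second slot
```

Without noise the estimate is exact; with noise, the residual error plays the role of the imperfect CSI discussed in the system model.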
The decoupled design of software and hardware makes it easy to configure each USRP as a transmitter or a receiver, which lets each device operate correctly in its assigned role. The USRP accepts a synchronization signal from an external clock source, so more than 2 properly functioning USRPs can be synchronized to build a system with a large number of transmit antennas and receivers. Therefore, both the functional design and the devices support scaling from a 2 × 2 MIMO system to a massive MIMO system.

Radio Frequency Calibration.
In our TDD implementation, shown in Figure 8, the uplink and downlink channels are assumed to be reciprocal. However, real channels are not reciprocal because of differences between the transmitter and receiver hardware. An internal calibration method [20] is used in our work: we first choose a reference radio; then, if we know the calibration coefficient between each of two radios and the reference radio, we can derive the direct calibration coefficient between those two radios. With these calibration coefficients, the real channels can be derived.
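The idea of deriving pairwise calibration coefficients through a reference radio can be illustrated with a toy model. The sketch below is not the procedure of [20]. It assumes that each measured channel is the product of a transmit-chain gain, a reciprocal over-the-air gain, and a receive-chain gain, and that the calibration coefficient of a pair is the ratio of its two directional channels. All names and gain values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
num_radios = 4


def complex_normal(shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# Hypothetical hardware gains (close to 1) and a reciprocal propagation matrix.
tx_gain = 1.0 + 0.2 * complex_normal(num_radios)
rx_gain = 1.0 + 0.2 * complex_normal(num_radios)
air = complex_normal((num_radios, num_radios))
air = (air + air.T) / 2  # symmetric: over-the-air propagation is reciprocal


def measured_channel(a, b):
    """Channel observed from radio a to radio b, including both hardware chains."""
    return tx_gain[a] * air[a, b] * rx_gain[b]


def calibration_coefficient(a, b):
    """Ratio of the a-to-b channel to the b-to-a channel (direct measurement)."""
    return measured_channel(a, b) / measured_channel(b, a)


reference = 0
to_reference = {i: calibration_coefficient(reference, i) for i in range(1, num_radios)}


def derived_coefficient(i, j):
    """Coefficient between radios i and j obtained only via the reference radio."""
    return to_reference[j] / to_reference[i]
```

In this model `derived_coefficient(1, 2)` matches the directly measured `calibration_coefficient(1, 2)`, so multiplying the uplink measurement `measured_channel(2, 1)` by it recovers the downlink channel `measured_channel(1, 2)` without measuring that pair directly.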

Demo System's Signal Frame Structure Design.
Figure 10 gives a simple frame structure design based on TDD mode for the 2 × 2 demo system. In the uplink scheduling slot, a preamble is used to locate the start of the data and to synchronize the system frequency; the orthogonal pilots are transmitted through the uplink sounding channel (UL-SCH) to obtain the CSI. In the downlink scheduling slot, the users' messages are precoded and transmitted through the downlink traffic channel (DL-TCH). UGI and DGI are the guard intervals in the uplink and downlink channels, respectively.

Signal Detection Implementation Test Results.
Because field-test conditions are complicated, we first test the performance of the above demo system in fading channels using the channel simulator in the LabVIEW software environment. The uplink and downlink signals are actually transmitted over the air. However, because the transmit and receive antennas are very close to each other, the signals are additionally passed through simulated fading channels in the LabVIEW software environment. The detailed parameters of the fading channels are presented in Table 1. Since the fading channel remains stationary, we use the actually achieved rate in place of the MAR to obtain the dynamic block size. Table 2 compares the number of retransmissions under the proposed MPDBA and under the fixed block-size scheme. The table shows that MPDBA clearly reduces the number of retransmissions at different SNRs compared with the fixed block-size scheme. This also helps verify the retransmission efficiency improvements of the proposed spinal-codes-based MIMO scheme in fading channels.

Conclusions
Applying spinal codes to massive MIMO with imperfect CSI brings clear benefits. To integrate spinal codes into the system, an efficient transmission scheme for massive MIMO with spinal codes is proposed, in which the maximum achievable rate (MAR) is used to determine the block size dynamically, reducing the retransmission delay spent on decoding the messages, and a multilevel puncturing scheme is applied for different MARs to limit the gap between the MAR and the achieved rate. Theoretical analysis is provided to show that spinal codes based on multilevel puncturing and dynamic block-size allocation (MPDBA) can guarantee the system achieved rate as well as the retransmission efficiency. Numerical simulation results illustrate that spinal codes with MPDBA improve the achieved rate over that of fixed-rate codes and enhance the system retransmission efficiency with limited achieved-rate degradation. Therefore, the proposed method makes a spinal-codes-based massive MIMO system reliable and practical. Finally, we implement a simplified MIMO testbed on NI's USRP platform to verify the performance improvements of the proposal; the testbed supports our theoretical analysis and can be easily extended to massive MIMO scenarios.

Figure 3: CDF curves of the pass number needed for successful decoding under different SNRs, $k_{\max} > R_j^{\text{upper}}$, $\tau = 0.1$.

Figure 6: The effects of $l_{\text{sub},j}(i)$ on the average rates achieved by spinal-codes-based massive MIMO.