Pilot Contamination Mitigation via a Novel Time-Shift Pilot Scheme in Large-Scale Multicell Multiuser MIMO Systems

We propose a novel time-shift pilot scheme to mitigate pilot contamination in large-scale multicell multiuser MIMO (LS-MIMO) systems. In the proposed scheme, the length of the uplink training pilot sequence is equal to the number of cells; that is, the same pilot sequence is used within a cell, while the pilot sequences of different cells are mutually orthogonal. Moreover, users within a cell transmit the same pilot sequence in a time-shift manner during the channel estimation stage, and in this way the channel state information of all user terminals can be estimated without contamination. The asymptotic channel orthogonality of the LS-MIMO system is studied, with which the mutual interference among cells caused by data and pilot sequences can be cancelled by the successive interference cancellation (SIC) method. We explore the superiority of the proposed scheme in the channel coefficient estimation, uplink data detection, and downlink data transmission steps. Theoretical analysis and simulation results demonstrate that the proposed time-shift pilot design alleviates the pilot contamination problem and significantly improves the performance of the considered system compared with the popular orthogonal pilots.


Introduction
The large-scale multicell multiuser MIMO (LS-MIMO) system is a novel and promising network architecture, which provides breakthrough data rates by using a large excess of service antennas over user terminals (UTs). Indeed, adding more antennas is always helpful for increasing throughput, reducing radiated power, offering uniformly great service everywhere in the cell, and simplifying signal processing [1][2][3][4][5].
For the LS-MIMO system, channel state information (CSI) estimation is a challenging issue for achieving the multiantenna gain. Uplink training is generally used to obtain CSI in time-division duplex (TDD) systems owing to channel reciprocity. However, the resource required for orthogonal pilot sequences increases dramatically as the number of users grows. Besides, as the number of base station (BS) antennas tends to infinity, additive Gaussian noise and uncorrelated interference vanish, and the only remaining impairment is the correlated intercell interference caused by users in other cells that have the same pilot sequences [5]. Under the limitation of a finite pilot sequence length, it is inevitable to reuse the same set of orthogonal pilot sequences in adjacent cells, which causes pilot contamination and quickly saturates the system's performance. Thus, pilot contamination is a bottleneck that limits the performance advantages LS-MIMO can offer [6,7].
Recent studies have made an effort to address this problem [8][9][10][11][12]. Although they tried to alleviate the pilot contamination between multiple cells, they still used orthogonal pilot sequences within a single cell, which means a large pilot resource consumption, especially when the coherence interval is short and the number of users is large in LS-MIMO systems [13].
Inspired by the idea in [13], in this paper, we propose a novel time-shift pilot scheme to mitigate pilot contamination while taking the pilot resource consumption in the LS-MIMO system into account. In [13], the pilot sequence is transmitted in a time-shift manner to save resources. Here, we extend this technique to the LS-MIMO system to deal with the pilot contamination problem. The main aspect of this extension is the proposal and performance analysis of a pilot scheme that alleviates pilot contamination. In the proposed pilot scheme, the number of mutually orthogonal pilot sequences is equal to the number of cells, and users within a cell transmit the same pilot sequence in a time-shift manner. In this way, all BSs can estimate the channel coefficients at the same time. Furthermore, the interference caused by data and pilot sequences can be cancelled by exploiting the asymptotic channel orthogonality combined with the SIC method in LS-MIMO systems [13].
We explore the superiority of the proposed scheme in channel coefficient estimation, uplink data detection, and downlink data transmission. Simulation results demonstrate that the proposed pilot scheme can alleviate pilot contamination and achieve a high data rate. Moreover, the achievable rate grows with the number of BS antennas under the proposed pilot scheme, while it saturates rapidly under the conventional scheme.
This paper is organized as follows. In Section 2, we describe the system model and the conventional pilot sequence scheme. In Section 3, we present the proposed pilot scheme, including channel coefficient estimation and data detection in the uplink and downlink data transmission stages. In Section 4, we analyze the performance of the proposed method in terms of the achievable rate. In Section 5, we present numerical simulation results and show that the proposed pilot scheme can mitigate pilot contamination and achieve a much better performance than the conventional pilot scheme.

System Model
We consider a cellular system composed of  cells, each consisting of  served single-antenna users and one BS with  antennas. Let   ,   , and   be the pilot sequence SNR, the uplink data SNR, and the downlink data SNR, respectively. Denote the channel coefficient from the th user in the th cell to the th cell as g  = √  h  ,  = 1, 2, . . ., ,  = 1, 2, . . ., , where the small-scale fading vector h  ∼ CN(0, I  ) is statistically independent across users and   models the large-scale fading, which is assumed to be constant and known a priori.
We assume a block fading model, where the small-scale fading vector remains constant during a coherence block of  symbols and is independent in different coherence blocks. In TDD systems, the channel coefficients are the same for both uplink and downlink data transmission in a coherence interval.
The conventional transmission system with orthogonal pilot sequences is shown in Figure 1. We analyze a single coherence block, as the remaining coherence blocks can be analyzed in the same way. Each coherence interval is assumed to be organized in three phases:  symbol periods for uplink data transmission,  symbol periods for downlink data transmission, and  symbol periods for uplink pilot sequence transmission. Therefore, the length of one coherence interval is  =  +  + .
In the conventional transmission system, orthogonal pilot sequences are assigned to the users of a cell to prevent pilot contamination within the cell, but they cannot prevent pilot contamination between cells. Moreover, the length of the pilot sequence is  ( > ), which is too large when  is large in the LS-MIMO system. In such a case, a large share of the resources is consumed by transmitting pilot sequences.
Now, we give some notational definitions.
 represents the index of the coherence interval.
y   [] ∈ C × denotes the data received by the th BS when the th UTs of all cells transmit pilot sequences.
z   [] ∈ C × denotes the additive noise received by the th BS when the th UTs of all cells transmit pilot sequences.
∈ 1 ×  denotes the pilot sequence used in the th cell.
q   [] ∈ C 1× denotes the data transmitted by the th user in the th cell to the th cell when the th UTs of all cells transmit pilot sequences.
q  [] ∈ C 1× denotes the data transmitted by the th user in the th cell to the th cell when all channel coefficients have been estimated.
g  [] denotes the channel coefficient between the th user in the th cell and the BS of the th cell.
s  [] ∈ C 1× denotes the data transmitted by the th BS to the th user in the same cell.
2 , denotes the variance of the channel coefficient between the th user in the th cell and the BS of the th cell.

Time-Shift Orthogonal Pilot Scheme
To alleviate the problem of pilot contamination and the waste of resources, a novel time-shift orthogonal pilot scheme is proposed in this section. It reduces the length of the pilot sequence for each UT and preserves the system performance by exploiting the asymptotic channel orthogonality that holds when the number of BS antennas is large. The proposed scheme is shown in Figure 2. The length of the pilot sequence is equal to the number of cells and, within a cell, the UTs transmit the same pilot sequence in a time-shift manner. We assume that all cells transmit all kinds of data synchronously.
In the first coherence interval, the first UTs of each cell transmit the pilot sequence while the other UTs remain silent, so that each BS can estimate the channel coefficients of the first UT of its own cell, g  ,  = 1, 2, . . ., ,  = 1, 2, . . ., , without contamination. At the same time, the cross channel coefficients g  , ,  = 1, 2, . . ., ,  ̸ = ,  = 1, 2, . . ., , can also be estimated, which can be used later for interference cancellation. When the second UTs of each cell transmit their pilot sequences, the first UTs transmit uplink data and the other UTs still remain silent. In short, when the th UTs of each cell transmit pilot sequences, the UTs whose index is less than  transmit uplink data and the UTs with a larger index remain silent. Besides, the interference from other users can be cancelled by using the SIC method with the estimated channel coefficients, which will be described later.
The estimated channel coefficients within one cell and between cells are then used in the subsequent uplink data reception, downlink data transmission, and interference cancellation steps. After the first coherence interval, all UTs transmit uplink data during the CSI estimation stage, except the ones that are currently transmitting the pilot sequence to update their channel coefficients.
It is clear that, for all UTs, the length of the pilot sequence depends on the cell number , and pilot sequences in different cells are mutually orthogonal to guarantee that there is no pilot contamination among cells. Within a cell, UTs transmit the same pilot sequence in a time-shift manner so that the channel coefficients can be estimated one by one without contamination.
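The transmission pattern described above can be sketched as a small scheduling routine. This is only an illustration of the roles within one cell: the function name `pilot_schedule`, the role labels, and the 0-based indexing are assumptions made for the example and are not taken from the paper.

```python
from typing import Dict, List


def pilot_schedule(num_users: int, first_interval: bool) -> List[Dict[int, str]]:
    """Return the role of every UT of one cell in each pilot slot.

    Slot k (0-based) is the period in which UT k sends the cell's pilot.
    Each slot lasts one pilot length, i.e. as many symbols as there are cells.
    """
    schedule = []
    for k in range(num_users):
        roles = {}
        for user in range(num_users):
            if user == k:
                roles[user] = "pilot"
            elif user < k or not first_interval:
                # Channel already estimated: this UT sends uplink data.
                roles[user] = "data"
            else:
                # First interval only: channel not yet estimated, stay silent.
                roles[user] = "mute"
        schedule.append(roles)
    return schedule


if __name__ == "__main__":
    for slot, roles in enumerate(pilot_schedule(num_users=4, first_interval=True)):
        print(slot, roles)
```

In the first interval the schedule is triangular (silent, then pilot, then data), whereas in later intervals every UT that is not sending the pilot sends data.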

Uplink Data Transmission in the First Interval.
When the first UTs in all cells transmit pilot sequences in the first coherence interval, the  ×  vector received by the th  = 1, 2, . . ., , BS is where z  ∈  ×  is the additive noise, the entries of z  are i.i.d. CN(0, 1) random variables, and all gains are scaled accordingly. The th BS estimates the channel coefficients by using the minimum mean squared error (MMSE) estimator as where ,  = 1, 2, . . ., . According to MMSE estimation, , and g1 ∼ CN(0,  2 1,1 I  ) are the estimation errors which are independent of ĝ1 [1], where ). Once each BS obtains the channel coefficients of the first UTs to its cell, the first UTs transmit the uplink data, while the next UTs transmit pilot sequences.
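As a hedged numerical illustration of this estimation step, the following sketch uses the textbook MMSE estimator for mutually orthogonal pilots in unit-variance noise. The DFT pilot matrix, the symbol names (`rho_p`, `beta`, `tau`), and the parameter values are assumptions for the example; the exact expressions of the paper are not reproduced here.

```python
import numpy as np


def dft_pilots(num_cells: int) -> np.ndarray:
    """Rows are mutually orthogonal pilots of length num_cells (row energy = num_cells)."""
    n = np.arange(num_cells)
    return np.exp(-2j * np.pi * np.outer(n, n) / num_cells)


def mmse_estimates(Y, pilots, beta, rho_p):
    """MMSE channel estimates for the pilot-sending UT of every cell.

    Y      : (M, L) pilot block received at one BS
    pilots : (L, L) one pilot per row
    beta   : (L,) large-scale fading from each cell's UT to this BS
    rho_p  : pilot SNR (noise variance normalized to 1)
    """
    tau = pilots.shape[1]
    despread = Y @ pilots.conj().T           # column l: sqrt(rho_p)*tau*g_l + noise
    gain = np.sqrt(rho_p) * beta / (1.0 + rho_p * tau * beta)
    return despread * gain                   # (M, L), column l estimates g_l


def simulate(M=4000, L=3, rho_p=1.0, beta=(1.0, 0.5, 0.5), seed=0):
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
    G = cn(M, L) * np.sqrt(beta)             # column l: channel from cell l's UT
    pilots = dft_pilots(L)
    Y = np.sqrt(rho_p) * G @ pilots + cn(M, L)
    G_hat = mmse_estimates(Y, pilots, beta, rho_p)
    empirical_mse = np.mean(np.abs(G - G_hat) ** 2, axis=0)
    theoretical_mse = beta / (1.0 + rho_p * L * beta)
    return empirical_mse, theoretical_mse


if __name__ == "__main__":
    print(simulate())
```

Because the pilots of different cells are orthogonal, despreading separates the own-cell and cross-cell channels, and the per-entry error variance under these assumptions is `beta / (1 + rho_p * tau * beta)`.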
We describe the data received at the th BS when the th UTs ( > 1) transmit pilot sequences as Then the estimated channel coefficients are applied to the detection of the first UTs' uplink data as where ‖ ⋅ ‖ denotes the 2-norm.
According to the properties of MMSE estimation, the estimated channel coefficient ĝ1 [1] is independent of g1 [1], g  [1], and z   .We divide both the numerators and the denominators in (4b) by  and we get (4c) by applying Lemma 1 which describes the limit results of random vectors [14].

Here, $\xrightarrow{\text{a.s.}}$ denotes almost sure convergence.
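The asymptotic orthogonality that this limit result relies on can be checked numerically. The sketch below assumes the usual law-of-large-numbers statement for independent CN(0, σ²) vectors (normalized self inner product tends to σ², normalized cross inner product tends to 0); the function name and parameters are illustrative only.

```python
import numpy as np


def normalized_inner_products(M, sigma2=1.0, seed=0):
    """Return (x^H x / M, |x^H y| / M) for independent CN(0, sigma2) vectors of length M."""
    rng = np.random.default_rng(seed)
    scale = np.sqrt(sigma2 / 2)
    x = scale * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    y = scale * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    self_term = np.vdot(x, x).real / M       # tends to sigma2 as M grows
    cross_term = abs(np.vdot(x, y)) / M      # tends to 0 as M grows
    return self_term, cross_term


if __name__ == "__main__":
    for M in (10, 1000, 100000):
        print(M, normalized_inner_products(M))
```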
When all channel coefficients have been estimated, the data received at the th BS is Then we can use a method similar to that in (4a), (4b), and (4c) to get the first UTs' uplink data It is clear that, with the proposed pilot scheme, we get the first UTs' data of all cells without contamination. As for the other UTs in the first coherence interval, we combine the SIC method with the MMSE method to conduct the channel coefficient estimation. Next, the channel coefficient estimation of the th UTs is taken as an example.

International Journal of Antennas and Propagation
We assume that the data q [1] estimated by the way in (6) is accurate and can replace the real q  [1]. Then, the interference caused by the other UTs' data when the th UTs transmit pilot sequences can be removed by using q [1] and ĝ [1] ( <  ≤ ), which are obtained before the present period [13]. The data received by the th BS after processing is where  = 1, 2, . . ., ,  = 2, 3, . . ., , and o   [1] denotes the interference consisting of the estimation error and noise terms. Obviously, by changing , we can estimate the channel coefficients of the th UTs in different cells to the th cell. Based on the assumption that q   [1] is an independent Gaussian sequence and g [1] ∼ CN(0,  2 ,1 I  ), we get Note that o   [1] is independent of g  [1], and then we get the MMSE estimation of g  [1] as In the same way, the channel coefficient can be decomposed as g  [1] = ĝ [1] + g [1]. From the properties of MMSE estimation, g [1] ∼ CN(0,  2 ,1 I  ), where  2 ,1 =   −  2 ,1 and Next, we use the estimated channel coefficients to detect the th UTs' uplink data when the th ( > ) UTs transmit pilot sequences. As the estimated channel coefficient ĝ [1] is independent of g  [1], similarly to the steps in (4a), (4b), and (4c), at the th BS, the detected data of the th UT in the th cell is Similar results can be obtained when the UTs of all cells transmit uplink data. From the above equations, each BS can get all UTs' CSI and detect their uplink data in the first coherence interval.
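The cancellation step can be sketched as follows, under the same idealization as in the text (the previously detected data and channel estimates are treated as accurate). The function `cancel_data_interference`, the all-ones pilot, and the array shapes are assumptions for the illustration, not the paper's exact processing.

```python
import numpy as np


def cancel_data_interference(Y, G_hat_prev, Q_hat, rho_u):
    """Subtract the reconstructed uplink data of already-estimated UTs.

    Y          : (M, tau) block received while the current UTs send pilots
    G_hat_prev : (M, J) estimated channels of the J UTs that send data
    Q_hat      : (J, tau) their detected data symbols
    rho_u      : uplink data SNR
    """
    return Y - np.sqrt(rho_u) * G_hat_prev @ Q_hat


def demo(M=64, tau=3, J=2, rho_p=1.0, rho_u=1.0, seed=0):
    rng = np.random.default_rng(seed)
    cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
    # One pilot-sending UT with an all-ones pilot (illustrative choice).
    pilot_part = np.sqrt(rho_p) * cn(M, 1) @ np.ones((1, tau))
    G_prev, Q = cn(M, J), cn(J, tau)         # data-sending UTs and their symbols
    Z = cn(M, tau)                           # unit-variance noise
    Y = pilot_part + np.sqrt(rho_u) * G_prev @ Q + Z
    # Idealized case: perfect channel estimates and perfectly detected data.
    cleaned = cancel_data_interference(Y, G_prev, Q, rho_u)
    return cleaned, pilot_part + Z


if __name__ == "__main__":
    cleaned, expected = demo()
    print(np.allclose(cleaned, expected))
```

With imperfect estimates, the residual left after the subtraction plays the role of the error-plus-noise term in the processed signal, which is then passed to the MMSE estimator.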
Besides, the estimated channel coefficients can be used for the following downlink data precoding and interference cancellation.
In the next part, we will study the downlink data transmission with the proposed pilot scheme.

Downlink Data Transmission in the First Interval.
With the estimated channel information, each BS calculates the beamforming vector to its th UT in the normalized version that The data received by the th UT in the th cell is where k  [1] is the unit variance additive white Gaussian noise in the downlink data transmission. Note that the beamforming vector ĝ [1] is independent of g [1] and g  [1] ((, ) ̸ = (, )), and we rewrite the norm of the estimated channel information as ‖ĝ  [1] Then the downlink data estimation at the th UT in the th cell is where we divide both the numerators and denominators in (16b) by , and when  → ∞, according to (15), the only term left is the desired signal. In later intervals, the downlink data transmission can be analyzed in a similar way.
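A minimal sketch of why only the desired signal survives with normalized beamforming is given below. It assumes perfect CSI and a single interfered UT for simplicity, and the function name and parameters are hypothetical; it illustrates the scaling with the antenna number rather than the paper's exact downlink expressions.

```python
import numpy as np


def downlink_leakage_ratio(M, beta_own=1.0, beta_other=0.5, seed=0):
    """Ratio of leaked power to desired power for a normalized matched-filter beam."""
    rng = np.random.default_rng(seed)
    cn = lambda n: (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    g_own = np.sqrt(beta_own) * cn(M)        # channel to the served UT
    g_other = np.sqrt(beta_other) * cn(M)    # channel to a UT that should not be disturbed
    w = g_own / np.linalg.norm(g_own)        # normalized beamformer (perfect CSI assumed)
    desired = abs(np.vdot(w, g_own)) ** 2    # equals ||g_own||^2, grows linearly with M
    leakage = abs(np.vdot(w, g_other)) ** 2  # stays bounded on average as M grows
    return leakage / desired


if __name__ == "__main__":
    for M in (8, 128, 10000):
        print(M, downlink_leakage_ratio(M))
```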

Data Transmission in Later Intervals.
Coherence intervals after the first one have the same transmission pattern during the channel coefficient estimation stage. When the th UTs of all cells transmit pilot sequences, all the other UTs transmit uplink data, which is different from the channel coefficient estimation in the first coherence interval. However, after the channel coefficient estimation stage, the uplink and downlink data transmission can be analyzed similarly to the first coherence interval. Without loss of generality, the channel coefficient estimation of the th UTs is taken as an example, and the data received at the th BS in the th coherence interval can be expressed as Similarly, following the steps of (7) and (8), we can get the channel coefficients of the th UTs of each cell to the th cell by using the pilot sequences corresponding to different cells. Generally, we choose   as an example, and the estimated channel coefficients are Again, the channel coefficient can be decomposed as g  [1] = ĝ [1]+ g [1]. From the properties of MMSE estimation, ĝ [] ∼ CN(0, Equations (17) to (19) can be applied to all coherence intervals except the first one. In this way, all BSs can obtain the UTs' channel coefficients of all cells without pilot contamination, which is important for the subsequent uplink and downlink data transmission.
In the next section, we investigate the achievable rate that can be attained using our proposed scheme and compare the results to that obtained by the conventional scheme.

Performance Evaluation
With the proposed pilot scheme, both BSs and UTs can detect the desired signals during every symbol time without contamination. Besides, as  grows large, the length of the pilot sequence in the proposed pilot scheme does not increase, which saves more resources for data transmission compared with the conventional pilot scheme. However, estimation errors are inevitable because the time-shift pilot design is combined with the SIC method, which limits the performance of the proposed pilot scheme to a certain extent.
In this section, the transmission is divided into four parts (1, 2, 3, and 4), as depicted in Figure 2, to simplify the performance evaluation. In the LS-MIMO system, for both the uplink and downlink data transmission, the achievable rate of the th user in the th cell is defined as where   is the signal-to-interference-plus-noise ratio (SINR).
Next, the performance analysis will be mainly focused on the SINR calculation during each part to investigate the potential benefits that our proposed scheme can provide.
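For reference, the mapping from SINR to rate can be sketched as below. This assumes the standard Shannon form log2(1 + SINR) in bit/s/Hz without any pre-log factor; the paper's definition (20) may include additional scaling that is not reproduced here, and the helper names are illustrative.

```python
import math


def db_to_linear(value_db: float) -> float:
    """Convert a power ratio from dB to linear scale."""
    return 10.0 ** (value_db / 10.0)


def achievable_rate(sinr: float) -> float:
    """Rate in bit/s/Hz for a linear-scale SINR, assuming the form log2(1 + SINR)."""
    return math.log2(1.0 + sinr)


if __name__ == "__main__":
    for sinr_db in (-10, 0, 10, 20):
        print(sinr_db, achievable_rate(db_to_linear(sinr_db)))
```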

Uplink Data Transmission in the First Part.
For the first part, assume that the th UTs are of interest. The BS applies Maximal Ratio Combining (MRC) by multiplying the received data with the estimated channel coefficient, and based on (3), during a symbol time, we get The first term is the desired signal and the others are interference and noise terms. According to Khintchine's law of large numbers and some basic manipulations [15], we can have the approximation of (22b) for a large , and the power of the desired signal in (21) is where [⋅] is the expectation operation. (22c) is obtained by using Lemma 2 [13].
Lemma 2. Let x, y ∈  × 1 be two mutually independent vectors whose elements are distributed as CN(0,  2 ), and we get The power of the interference plus noise is According to (20), the achievable uplink rate is given by (25).

Uplink Data Transmission in the Second Part.
After the channel coefficient estimation stage, all UTs transmit uplink data, and this part exists in every coherence interval. Without loss of generality, we take the th coherence interval as an example. The data of the th UT received at the th BS can be represented as The power of the desired signal is The power of the interference plus noise is Then, the achievable uplink rate is given by (29).

Uplink Data Transmission in the Third Part.
In this part, channel coefficient estimation is contaminated by all the other UTs' uplink data. When the th UTs transmit pilot sequences, the data detection of the th ( > ) UTs should use the channel coefficients estimated in the previous coherence interval. For example, the data of the th UT received at the th BS can be represented as where  1 = (,  < ; −1,  > ). The power of the desired signal is The power of the interference plus noise is The achievable uplink data rate is given by (33).

Downlink Data Transmission in the Fourth Part.
After the uplink data transmission stage, BSs transmit data to UTs with the estimated channel coefficients. According to (13) and (14), the data received by the th UT of the th cell can be rewritten as Applying analytical procedures similar to those for the uplink, it is not hard to get the UT's downlink achievable rate given by (35).

Conventional Transmission as a Benchmark.
For data transmission with the conventional pilot scheme, as shown in Figure 1, the transmit pattern is the same during every coherence interval. Hence, the uplink data of the th UTs received at the th BS is [12] Then the power of the received signal is ( For conventional downlink data transmission, consider the pilot contamination and the data received by the th UT in the th cell is Similar analytical procedures are applied and the achievable rate can be represented as For further comprehension, we divide both the numerators and denominators in (40) by , and when  → ∞, we get which is the same as the result in [13].
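The saturation caused by pilot reuse can be illustrated numerically. The sketch below assumes the widely used limiting form in which the signal-to-interference ratio converges to the squared own-cell large-scale fading divided by the sum of squared cross-cell large-scale fadings; whether this matches the paper's limit term by term cannot be verified from the text, and all names and values are illustrative.

```python
import numpy as np


def limiting_sir(beta_own, beta_cross):
    """Assumed infinite-antenna SIR limit under pilot reuse."""
    return beta_own ** 2 / np.sum(np.asarray(beta_cross, dtype=float) ** 2)


def empirical_sir(M, beta_own=1.0, beta_cross=(0.5, 0.5), rho_p=10.0, seed=0):
    """MRC signal-to-interference ratio with a pilot-contaminated channel estimate."""
    rng = np.random.default_rng(seed)
    beta = np.concatenate(([beta_own], beta_cross))
    cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
    G = cn(M, beta.size) * np.sqrt(beta)     # column 0: own UT, others: UTs reusing the pilot
    # Despread pilot observation: own channel plus contaminating channels plus noise.
    g_hat = G.sum(axis=1) + cn(M) / np.sqrt(rho_p)
    gains = np.abs(g_hat.conj() @ G) ** 2    # MRC output power per UT
    return gains[0] / gains[1:].sum()


if __name__ == "__main__":
    print("limit:", limiting_sir(1.0, (0.5, 0.5)))
    for M in (16, 128, 1024, 100000):
        print(M, empirical_sir(M))
```

Under these assumptions the ratio stops improving once the antenna number is large, which is the saturation behavior attributed to the conventional scheme.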
Comparing (25), (29), and (33) with (38), and (35) with (40), the achievable rate in the conventional pilot scheme has large denominators due to pilot contamination, which is not the case in the proposed pilot scheme. Therefore, according to the above theoretical analysis, the proposed pilot scheme can achieve a better tradeoff between pilot resource consumption and CSI estimation accuracy, and it outperforms the conventional one in many scenarios. In the next section, we will show the superiority of the proposed pilot scheme by some numerical results.

Numerical Results
In this section, we present numerical results for the proposed pilot scheme. The achievable rates of uplink and downlink data transmission are summed for performance comparison, and the average achievable rate of each cell is used as the performance indicator. Besides, the overall power of the conventional pilot scheme and that of the proposed pilot scheme are set to be the same.
The scenarios simulated here consist of a cellular system with  users uniformly distributed in each cell and the cell number is set to 3. Without loss of generality, assume that large-scale fading coefficients from the BS to the UTs in the same cell are 1 and each BS is equipped with  = 128 antennas.For simplicity, we set   =   =   .
First, we calculate the average achievable rate of each cell and compare our proposed pilot scheme with the conventional one under different coherence intervals Nc when SNR varies from −40 dB to 60 dB.In this simulation, the cross large-scale fading coefficients which are the coefficients from the BS to the UTs in other cells are all set to 0.5 (  = 0.5).The result is shown in Figure 3.
The simulation result in Figure 3 shows the superiority of the proposed pilot scheme in mitigating pilot contamination. The average achievable rate of the proposed scheme with different Nc surpasses that of the conventional one when the SNR varies from −8 dB to 60 dB and still grows with the SNR, while that of the conventional pilot design tends to saturate at a low SNR. Besides, a larger Nc brings more CSI estimation error, which leads to a smaller average achievable rate. However, Nc does not change the outcome of the comparison.
Then we set Nc = 3 and compare the performance of the two pilot schemes under different cross large-scale fading coefficients   .
Figure 4 shows the average achievable rate for   from 0.4 to 0.7 under varying SNR conditions. From (9) and (10), a large   brings more CSI estimation error for the proposed pilot scheme, which has a negative effect on the average achievable rate, while the conventional scheme suffers from pilot contamination and achieves a much lower average achievable rate than the proposed one.
In Figure 5, we investigate the influence of the antenna number on the average achievable rate. It can be observed that the proposed pilot scheme clearly outperforms the conventional one over the whole range of antenna numbers. It is also evident that the average achievable rate of the proposed pilot scheme grows with the antenna number, while that of the conventional one does not change. This further supports the superiority of the proposed pilot scheme.

Conclusion
This paper proposes a novel time-shift pilot scheme for TDD large-scale multicell multiuser MIMO systems, in which UTs within a cell transmit the same pilot sequence in a time-shift manner while the pilot sequences assigned to different cells are mutually orthogonal. By using this pilot scheme, we can effectively alleviate pilot contamination. Simulation results show that the proposed scheme outperforms the conventional one under several different scenarios and that it is particularly promising when the antenna number is large. However, a large coherence interval Nc and large cross large-scale fading coefficients   bring more CSI estimation errors, which reduces the overall achievable rate. Our future work will focus on this problem to develop a method with higher performance.