A Multistage Decision-Feedback Receiver Design for LTE Uplink in Mobile Time-Variant Environments

Single-carrier frequency division multiple access (SC-FDMA) has recently become the preferred uplink transmission scheme in long-term evolution (LTE) systems. Similar to orthogonal frequency division multiple access (OFDMA), SC-FDMA is highly sensitive to frequency offsets caused by oscillator inaccuracies and Doppler spread, which lead to intercarrier interference (ICI). This work proposes a multistage decision-feedback structure to mitigate the ICI effect and enhance system performance in time-variant environments. Based on the block-type pilot arrangement of the LTE uplink type 1 frame structure, the time-domain least squares (TDLS) method and a polynomial-based curve-fitting algorithm are employed for channel estimation. Instead of a conventional equalizer, this work uses a group frequency-domain equalizer (GFDE) to reduce computational complexity. Furthermore, this work utilizes a dual iterative structure of group parallel interference cancellation (GPIC) and frequency-domain group parallel interference cancellation (FPIC) to mitigate the ICI effect. Finally, to optimize system performance, this work applies a novel error-correction scheme. Simulation results demonstrate that the bit error rate (BER) performance is markedly superior to that of the conventional full-size receiver based on minimum mean square error (MMSE). This structure performs well and is a flexible choice in mobile environments using the SC-FDMA scheme.


Introduction
In recent years, cellular communication services have expanded from traditional voice traffic to data transmission, and the data rate and bandwidth efficiency of the wireless-network physical layer have improved. The international mobile telecommunications-advanced (IMT-Advanced) standard, which is promoted by the international telecommunication union (ITU), specifies 4th-generation (4G) mobile communication. The highest data rate for 4G uplink and downlink is 50 Mbps and 100 Mbps, and the highest bandwidth efficiency of uplink and downlink is 6.75 bits/s/Hz and 15 bits/s/Hz [1], respectively. Notably, in IEEE 802.16 Worldwide Interoperability for Microwave Access (WiMAX), orthogonal frequency division multiple access (OFDMA) has become the physical layer standard [2] on the uplink and downlink to meet bandwidth efficiency requirements. However, OFDMA has the disadvantage of a high peak-to-average-power ratio (PAPR), which reduces the power efficiency [3] of a power amplifier (PA) and is not suitable for mobile users with limited power. Therefore, the future 4G standard of the 3rd-generation partnership project long-term evolution (3GPP-LTE) system will employ single-carrier frequency division multiple access (SC-FDMA) [4][5][6], a transmission scheme with relatively low PAPR, as the uplink physical layer standard [7][8][9] to increase power efficiency.
The SC-FDMA system can be regarded as a discrete Fourier transform (DFT) precoded OFDM system, which is similar to the traditional OFDM system. The performance of the SC-FDMA system is degraded by the Doppler effect in mobile time-variant environments. This effect destroys the orthogonality of subcarriers, leading to intercarrier interference (ICI). Moreover, the pilot arrangement of the LTE uplink is block type, differing from the comb type of the conventional OFDM system, so channel estimation methods need to be adapted.
International Journal of Antennas and Propagation
Some channel estimation methods with the block-type pilot arrangement have been developed. In [10], Karakaya proposed a Kalman-filter-based approach to mitigate ICI under high Doppler spread scenarios by tracking variation in channel taps and utilized an interpolation algorithm based on polynomial fitting for channel estimation. However, the ICI effect in highly mobile environments degrades system performance. Notably, [10] did not focus on ICI cancellation techniques in the frequency domain. In [11], Wang proposed a single-user and multiuser channel estimation scheme. The channel taps of each user can be estimated by the orthogonal features of the designed constant amplitude zero autocorrelation (CAZAC) sequence. However, [11] assumed a non-time-varying channel scenario; thus, this method cannot be applied to high-speed mobile environments. In [12], Zheng proposed an interpolation-based channel estimator in the frequency domain and proved that interpolation of the least squares (LS) method and that of the minimum mean square error (MMSE) method are equivalent. In [13], Zhang proposed a frequency-domain decision-feedback equalizer and applied the Lagrange multiplier method to replace general inverse-matrix operations. Compared with the traditional decision-feedback equalizer (DFE), the equalizer in [13] is less complex.
This work focuses on channel estimation and ICI cancellation techniques for receiver design. First, time-domain least squares (TDLS) channel estimation is applied for the block-type pilot arrangement. The curve-fitting estimator is applied to interpolate missing channel information between pilot symbols. Next, the low-complexity group frequency-domain equalizer (GFDE) is applied. Furthermore, in considering the Doppler effect of mobile time-variant environments, this work applies dual iterative interference cancellation, consisting of group parallel interference cancellation (GPIC) and frequency-domain group parallel interference cancellation (FPIC), to reduce the ICI effect. Finally, this work proposes a novel error-correction scheme to optimize system performance. Simulations demonstrate that the proposed receiver design performs well and is a flexible choice in mobile environments using the SC-FDMA scheme.
The remainder of this paper is organized as follows. Section 2 introduces the SC-FDMA model and time-variant channel model. Section 3 divides the receiver design into five parts. Section 3.1 presents the TDLS channel estimation scheme and the polynomial curve-fitting-based channel interpolation method. Section 3.2 presents the low-complexity GFDE. Sections 3.3 and 3.4 present the dual ICI cancellation by GPIC and FPIC. Error correction, which consists of time-domain least squares and group maximum likelihood, is described in Section 3.5. Performance simulation results for the proposed system are given in Section 4. Section 5 offers conclusions.

System Model
Figure 1 shows the structure of the SC-FDMA transceiver. Constellation symbols of the pth user are grouped into a data block d^(p) of N symbols. Then, d^(p) is transformed to the frequency domain using the N-point fast Fourier transform (FFT), and each frequency-domain sample is mapped onto the mth subcarrier s^(p)(m), where N(k) denotes the resource allocation sets of the localized and distributed subcarrier mapping of the pth user shown in Figure 2, and Φ is a set of indices whose elements are Φ = {0, 1, ..., M}. The resource-allocated data block s^(p) is then acquired.
Following subcarrier mapping, s^(p) is transformed into the time domain using the M-point inverse fast Fourier transform (IFFT). In this work, we assume only one user exists, that is, p = 1.
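The modulation chain described above (N-point FFT, subcarrier mapping, M-point IFFT) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the unitary FFT scaling, the starting subcarrier of the localized allocation, and the even spacing of the distributed allocation are all assumptions made for the example.

```python
import numpy as np

def sc_fdma_modulate(d, M, mapping="localized", cp_len=0):
    """DFT-precode N symbols, map them onto M subcarriers, return the time-domain block."""
    d = np.asarray(d, dtype=complex)
    N = d.size
    D = np.fft.fft(d) / np.sqrt(N)            # N-point FFT (unitary scaling is an assumption)
    s = np.zeros(M, dtype=complex)
    if mapping == "localized":
        idx = np.arange(N)                    # N contiguous subcarriers starting at 0 (assumed)
    else:
        idx = np.arange(N) * (M // N)         # distributed: evenly spaced subcarriers (assumed)
    s[idx] = D                                # subcarrier mapping, unused subcarriers stay zero
    t = np.fft.ifft(s) * np.sqrt(M)           # M-point IFFT
    if cp_len:
        t = np.concatenate([t[-cp_len:], t])  # cyclic prefix: copy of the block tail
    return t, idx

def sc_fdma_demodulate(t, idx, cp_len=0):
    """Inverse chain over an ideal channel: drop CP, M-point FFT, demap, N-point IFFT."""
    M = t.size - cp_len
    s = np.fft.fft(t[cp_len:]) / np.sqrt(M)
    return np.fft.ifft(s[idx]) * np.sqrt(idx.size)
```

With N = 72 occupied subcarriers and M = 128, as assumed later in the paper, `sc_fdma_modulate(d, 128, cp_len=16)` returns a block of 144 samples whose first 16 samples repeat its last 16.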
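Then, the received signal over the mobile fading channel can be expressed as

$$y(m) = \sum_{l=0}^{L-1} h_l(m)\, t(m-l) + w(m), \quad -C \le m \le M-1, \tag{4}$$

where $t$ is the transmitted signal with cyclic prefix (CP) insertion, $C$ is the CP length, $h_l(m)$ is the time-variant channel response of the $l$th path at discrete sampling time $m$, $L$ is the maximum delay spread, and $w(m)$ is additive white Gaussian noise (AWGN) distributed as $N(0, \sigma_w^2)$. After removing the CP at the receiver, the received signal can be written as

$$\mathbf{y} = \mathbf{h}\,\mathbf{t} + \mathbf{w}, \tag{5}$$

where $\mathbf{w}$ is the $M \times 1$ noise vector, $\mathbf{t}$ is the $M \times 1$ transmitted signal, and $\mathbf{h}$ is an $M \times M$ circulant matrix whose entries follow from (4):

$$[\mathbf{h}]_{m,\,(m-l) \bmod M} = h_l(m), \quad l = 0, 1, \ldots, L-1, \tag{6}$$

with all other entries equal to zero.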

The Proposed Multistage Decision Feedback Receiver
To cancel the ICI effect caused by mobile environments, the proposed multistage decision-feedback receiver design is analyzed. Figure 3 shows the block diagram of the multistage decision-feedback receiver. First, the channel response is estimated by the TDLS [14]. The estimated channel response is then used by the multistage decision-feedback design, which provides low complexity, ICI cancellation, and improved data correction. The details of these procedures are described and analyzed as follows.

The Time-Domain Least Squares Algorithm.
This section describes the channel estimation scheme for estimating the channel response between pilot symbols of two consecutive slots within a frame. In the LTE uplink type 1 frame structure with extended CP shown in Figure 4, a reference signal is allocated to the fourth of the six symbols in each slot. The reference signal is the Zadoff-Chu sequence with low autocorrelation sidelobes [15]. The pilot arrangement in the LTE uplink is a block-type arrangement [9], meaning that pilot signals are inserted into all subcarriers of the frequency domain.
In the following, TDLS estimation is analyzed. Figure 5 shows the block diagram of TDLS.
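For a time-invariant channel, the received pilot signal can be expressed as

$$\mathbf{y}_P = \mathbf{P}_{\mathrm{circ}}\,\mathbf{h} + \mathbf{w}, \tag{7}$$

where $\mathbf{P}_{\mathrm{circ}}$ is an $M \times L$ circular matrix formed by the pilot symbol $\mathbf{p} = [P(0), P(1), \ldots, P(M-1)]^T$ and $\mathbf{h}$ is the $L \times 1$ channel impulse response vector.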
To estimate the approximate time-variant channel, the received pilot signal is rewritten in terms of the time-variant response of each path. To increase the effectiveness of the channel response estimate, $\mathbf{y}_P$ and $\mathbf{P}_{\mathrm{circ}}$ are divided into groups of $J$ consecutive samples:

$$\mathbf{y}_K = [\,y_P(K),\; y_P(K+1),\; \ldots,\; y_P(K+J-1)\,]^T, \tag{9}$$

$$\mathbf{P}_K = \mathbf{P}_{\mathrm{circ}}(K : K+J-1,\; 1 : L), \tag{10}$$

where $y_P(K)$ is the $K$th element of the received pilot vector $\mathbf{y}_P$ and $K = 0, 1, \ldots, M-J-1$ denotes the group number. $\mathbf{P}_{\mathrm{circ}}(K : K+J-1,\; 1 : L)$ represents the submatrix from the $K$th row to the $(K+J-1)$th row and the 1st column to the $L$th column of matrix $\mathbf{P}_{\mathrm{circ}}$. Next, the average channel response $L \times 1$ vector $\mathbf{h}_{\mathrm{ave}}(K)$ of group $K$ can be obtained as the least squares solution

$$\mathbf{h}_{\mathrm{ave}}(K) = \left(\mathbf{P}_K^H \mathbf{P}_K\right)^{-1} \mathbf{P}_K^H\, \mathbf{y}_K. \tag{11}$$

Rearranging the average channel response vectors above, we define $\mathbf{h}_{\mathrm{ave}} = [\mathbf{h}_{\mathrm{ave}}(0), \mathbf{h}_{\mathrm{ave}}(1), \ldots, \mathbf{h}_{\mathrm{ave}}(M-J-2)]$ as the $L \times (M-J-1)$ channel response matrix within the pilot symbol.

After obtaining the channel response within the pilot symbol, this work utilizes a polynomial curve-fitting scheme based on the linear model estimator [10] to interpolate the missing channel information between pilot symbols in Figure 4. In the linear model [16] of the fading channel profile, $\Xi_l = [\,h_l(m_{i,0}), \ldots, h_l(m_{i,M-J-1}),\; h_l(m_{j,0}), \ldots, h_l(m_{j,M-J-1})\,]^T$ denotes the $l$th path response of $\mathbf{h}_{\mathrm{ave}}$ under two consecutive pilot symbols of the $(i, j)$ time slots, with size $2(M-J-1) \times 1$; $h_l(m_{i,a})$ is the channel response of the $l$th path derived from the TDLS; and $m_{i,a}$ is the $a$th time instant of the $i$th time slot, where $a = 0, \ldots, M-J-1$. The relation between time slot $(i, j)$ and $m_{i,a}$ is shown in Figure 6. The unknown of the model is the vector of polynomial coefficients of the $l$th path, which is obtained by the least squares solution. The channel response $h^{l}_{\mathrm{CF}}(m)$ of polynomial-based curve fitting between two consecutive pilot symbols is then approximated by evaluating the fitted polynomial at time instant $m$. The MSE performance of curve-fitting channel estimation is defined with respect to $h_l(m)$, the ideal channel response of the $m$th time instant of the $l$th path. After estimating the channel impulse response, this work reconstructs the frequency-domain channel response matrix $\mathbf{H}$ [17].

Moreover, the MSE performance of curve fitting depends on the polynomial order $v$ and time interval $J$. In particular, the size of $J$ directly affects the performance of curve fitting.
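As a rough illustration of the windowed TDLS step, the sketch below solves one least-squares problem for each window of J consecutive received pilot samples, pairing them with the matching J rows of the circulant pilot matrix. The function names are hypothetical, the pilot is passed in as a generic time-domain sequence (a Zadoff-Chu sequence would be used in practice), and the number of windows is taken as M − J − 1 to match the size of the channel response matrix stated in the text.

```python
import numpy as np

def pilot_matrix(p, L):
    """M x L circulant pilot matrix: column l is the pilot cyclically delayed by l samples."""
    return np.column_stack([np.roll(p, l) for l in range(L)])

def tdls_windowed(y_p, p, L, J):
    """Least-squares channel estimate for each window of J received pilot samples."""
    M = p.size
    P_circ = pilot_matrix(p, L)
    n_groups = M - J - 1                       # number of groups, as stated in the text
    h_ave = np.empty((L, n_groups), dtype=complex)
    for K in range(n_groups):
        P_K = P_circ[K:K + J, :]               # rows K .. K+J-1, all L columns
        y_K = y_p[K:K + J]                     # matching received samples
        h_ave[:, K] = np.linalg.lstsq(P_K, y_K, rcond=None)[0]
    return h_ave                               # L x (M-J-1): one tap vector per window
```

Each column of the result is the average channel seen inside one window, so a short window tracks time variation closely but averages out less noise, which is the tradeoff in J that the text goes on to discuss.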
For two extreme examples, Figures 6 and 7 show the real part of the mobile channel responses with J = 4 and J = 100, respectively. In the case of J = 4 and M = 128, M − J − 1 = 123 estimated channel responses are obtained by TDLS, and the variation between the TDLS channel and the real channel is larger, but the bias between the curve-fitting channel and the real channel is smaller. In contrast, in the case of J = 100 and M = 128, M − J − 1 = 27 estimated channel responses are obtained by TDLS, and the variation between the TDLS channel and the real channel is smaller; however, the bias between the curve-fitting channel and the real channel is larger. As these examples show, the tradeoff between the size of J and curve-fitting accuracy should be considered. The choice of J that optimizes channel estimation performance is discussed with the simulation results.
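The curve-fitting step can be sketched as an ordinary least-squares polynomial fit of one path's tap values over time, using the TDLS estimates from two consecutive pilot symbols and evaluating the fitted polynomial at the data-symbol instants in between. The function name, the Vandermonde formulation, and the time normalization (added only for numerical conditioning) are assumptions of this example rather than details taken from the paper.

```python
import numpy as np

def fit_path(m_known, xi, order, m_query):
    """LS polynomial fit of one path's complex tap over time, evaluated at m_query."""
    m_known = np.asarray(m_known, dtype=float)
    m_query = np.asarray(m_query, dtype=float)
    scale = float(np.max(np.abs(m_known))) or 1.0          # normalize time for conditioning
    T = np.vander(m_known / scale, order + 1, increasing=True).astype(complex)
    theta = np.linalg.lstsq(T, np.asarray(xi, dtype=complex), rcond=None)[0]
    T_query = np.vander(m_query / scale, order + 1, increasing=True)
    return T_query @ theta                                   # interpolated tap values
```

Here `m_known` stacks the time instants of the two pilot symbols, `xi` holds the corresponding TDLS tap estimates of one path, and `order` plays the role of the polynomial order v.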

Group Frequency Domain Equalizer.
This section describes the GFDE of the first stage. In a time-variant environment, H is a sparse frequency-domain channel matrix whose energy is concentrated on the diagonal elements; the farther an element is from the diagonal, the smaller its gain. Therefore, this work applies the GFDE to reduce the ICI effect and computational complexity. Figure 8 shows the block diagram of the GFDE. First, the frequency-domain signal vector x of size N × 1 is derived from the subcarrier demapping of signal vector u. Then, x is divided into G = N/D groups, as shown in Figure 8. Each group x_g has size D × 1, and its data detection z_g can be obtained roughly by the group minimum mean square error (GMMSE) criterion, where I_D is a D × D identity matrix and ρ is the noise variance of w_g. In the first stage, the data detection z_g of the GFDE suffers from the ICI effect of the marginal elements of the group block of H_N.
In the following second stage, the ICI effect of group x g is mitigated.
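A minimal sketch of the group equalization idea follows, assuming the GMMSE weight of each group takes the standard MMSE form (H_g^H H_g + ρ I_D)^{-1} H_g^H, where H_g is the D × D diagonal block of H_N for group g. That form and the function name are assumptions made for this example; only I_D, ρ, D, and the grouping come from the text.

```python
import numpy as np

def gfde(x, H_N, D, rho):
    """Group MMSE equalization using only the D x D diagonal blocks of H_N."""
    N = x.size
    if N % D:
        raise ValueError("N must be a multiple of the group size D")
    z = np.empty(N, dtype=complex)
    for g in range(N // D):
        sl = slice(g * D, (g + 1) * D)
        H_g = H_N[sl, sl]                                   # diagonal block of group g
        A = H_g.conj().T @ H_g + rho * np.eye(D)
        z[sl] = np.linalg.solve(A, H_g.conj().T @ x[sl])    # (H^H H + rho I)^-1 H^H x_g
    return z
```

The design point is that G = N/D small D × D systems are solved instead of one N × N system; the price is that entries of H_N outside the diagonal blocks are ignored, which is the marginal ICI addressed in the next stage.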

The Group Parallel Interference Cancellation of Frequency-Domain Soft-Decision Feedback.
In the previous section, the GFDE is applied to reduce computational complexity and equalize the channel effect. However, the GFDE suffers performance degradation because some channel information is lost. For example, with D = 2 in (20), the GFDE equalizes only the channel effect inside the dotted circle. Owing to the structure of the sparse frequency-domain channel matrix H_N, the marginal ICI close to the diagonal, inside the solid circle, still has large energy and affects system performance. To mitigate the marginal ICI effect, the GPIC of frequency-domain soft-decision feedback is applied in this section. Figure 9 shows the block diagram of GPIC.
The GPIC is an iterative decision-feedback structure, as shown in Figure 9. In the first loop (n = 1), let the first-stage output z = [z(0), ..., z(N − 1)]^T of size N × 1 be the initial decision; then, signal x_g is derived by the marginal ICI cancellation of (21), where z(gD − 1) and z(gD + D) are the (gD − 1)th and (gD + D)th elements of z, respectively. Next, z_g is derived by the GFDE of (22). When the loop index is 2 or larger (n ≥ 2), let the refined z be the decision data and redo (21) and (22) to refine signal z. After iterative processing, the decision data are derived by (23).

The Group Parallel Interference Cancellation of Time-Domain Hard-Decision Feedback.
The parallel interference cancellation (PIC) [18] can be applied to improve data accuracy and make the initial decision more reliable.
In this section of the third stage, this work utilizes the FPIC of time-domain hard-decision feedback to mitigate the ICI effect further. Figure 10 shows the block diagram of FPIC.
In the first loop (n = 1) of the third stage, the ICI term ICI_i of the ith subcarrier signal is reconstructed from the decisions d_1(j), as in (24). Next, the ICI term ICI_i is subtracted from the subcarrier demapping signal x to acquire signal x_i, as in (25), which is expressed through the ith composite channel vector and w_N(i), the AWGN with respect to x_i. The output z(i) of the MMSE equalizer is given by (26), and the decision output d_2 is derived by (27). Next, this work applies the iterative process when the loop index is 2 or larger (n ≥ 2): let d_2 replace d_1 and reapply (24)-(27) to improve decision reliability.
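One plausible reading of this stage is sketched below; it is an interpretation, not the paper's exact equations. It assumes the composite channel is C = H_N F, with F the unitary N-point DFT matrix, so that column c_i of C carries time-domain symbol d(i); that ICI_i is the contribution of all other decided symbols; and that the MMSE output for symbol i is c_i^H x_i / (||c_i||^2 + ρ). The function names and the QPSK alphabet ±1 ± j follow the text's example but are otherwise hypothetical.

```python
import numpy as np

def qpsk_decide(z):
    """Hard decision onto the QPSK points (+-1 +-1j)."""
    return np.where(z.real >= 0, 1.0, -1.0) + 1j * np.where(z.imag >= 0, 1.0, -1.0)

def fpic(x, H_N, d1, rho, n_loops=2):
    """Parallel interference cancellation driven by time-domain hard decisions."""
    N = x.size
    F = np.fft.fft(np.eye(N)) / np.sqrt(N)     # unitary DFT matrix
    C = H_N @ F                                # composite channel; column i acts on d(i)
    energy = np.sum(np.abs(C) ** 2, axis=0)    # ||c_i||^2 for every symbol
    d = np.asarray(d1, dtype=complex)
    for _ in range(n_loops):
        residual = x - C @ d                   # x minus the contributions of all decisions
        # Adding symbol i's own term back gives x_i = x - ICI_i, so
        # c_i^H x_i = (C^H residual)_i + ||c_i||^2 d(i); then apply the per-symbol combiner.
        z = (C.conj().T @ residual + energy * d) / (energy + rho)
        d = qpsk_decide(z)                     # refined decisions feed the next loop
    return d
```

Because the fed-back symbols take fixed values, the ICI reconstruction is a matrix-vector product with known ±1 ± j entries, which is consistent with the reduced multiplication count the paper attributes to this stage.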

Error Correction Consisting of Time-Domain Least Squares and Group Maximum Likelihood.
In this section, this work proposes a novel error-correction scheme for the fourth stage. Figure 11 shows the block diagram of error correction.
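After obtaining decision data $\mathbf{d}_2$, let $\mathbf{d}_2$ undergo the SC-FDMA modulation of (1)-(3) to construct the transmitted signal $\mathbf{D} = [D(0), D(1), \ldots, D(M-1)]^T$. Equation (10) can be rewritten as

$$\mathbf{D}_K = \mathbf{D}_{\mathrm{circ}}(K : K+J-1,\; 1 : L), \tag{28}$$

where $\mathbf{D}_{\mathrm{circ}}$ is the circular matrix formed by the signal $\mathbf{D}$. Then, the reconstructed channel $\mathbf{h}_{\mathrm{Data}}(K)$ can be expressed from (11) as

$$\mathbf{h}_{\mathrm{Data}}(K) = \left(\mathbf{D}_K^H \mathbf{D}_K\right)^{-1} \mathbf{D}_K^H\, \mathbf{y}_K, \tag{29}$$

where $\mathbf{y}_K$ here collects the $J$ received data samples of group $K$. Rearranging the reconstructed channel response vectors above, we define $\mathbf{h}_{\mathrm{Data}} = [\mathbf{h}_{\mathrm{Data}}(0), \mathbf{h}_{\mathrm{Data}}(1), \ldots, \mathbf{h}_{\mathrm{Data}}(M-J-2)]$ as the $L \times (M-J-1)$ channel response matrix of each path within the data blocks. We assume the FFT size is 128 and the number of occupied subcarriers of the desired user is 72. Figure 12 shows the reconstructed real-part channel impulse response.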
A burst error is clearly visible in the reconstructed channel h_Data when the decision data d_2 contain errors. To find the subcarrier set Ω_i of the decision error according to the burst error of the channel, this work builds a lookup table using a histogram-based approach. The subcarrier set Ω_i of decision error is derived by dividing the occupied subcarriers of the desired user into 12 blocks; each block has the same size, P = 6. Table 1 shows the lookup table of the burst error location between the channel and decision data.

Additionally, this error-correction scheme is only suitable for a high signal-to-noise ratio (SNR), that is, E_b/N_0 ≥ 25 dB, because the performance of the TDLS is sensitive to the SNR. The following presents the details of the proposed error-correction scheme.
(1) Let $\mathbf{d}_2$ undergo SC-FDMA modulation and use the received signal to reconstruct the channel impulse response $h^{l}_{\mathrm{Data}}$ by the TDLS, where $h^{l}_{\mathrm{Data}}$ denotes the $l$th path of the reconstructed channel $\mathbf{h}_{\mathrm{Data}}$ between two consecutive pilot symbols, and $l = 0, 1, \ldots, L-1$.
(2) Compare $h^{l}_{\mathrm{Data}}$ with $h^{l}_{\mathrm{CF}}$ of curve fitting, then find the range of the error burst from the channel difference $\Delta h(\mathrm{index})$, where index denotes the location of the error burst and $\varepsilon = 10\%$ is the threshold on the percentage of channel difference. If no channel difference satisfies the condition $\Delta h \ge \varepsilon$, skip error correction. Otherwise, identify the position of the maximum error burst:

$$\mathrm{index}_{\max} = \arg\max_{\mathrm{index}}\, \Delta h(\mathrm{index}). \tag{31}$$

(3) Find the subcarrier set $\Omega_i$ of decision error with respect to $\mathrm{index}_{\max}$ using Table 1.
(4) In this step, a localized maximum-likelihood (ML) search is applied to correct the decision data of subcarrier set $\Omega_i$. In the localized ML procedure, $\Psi_{\mathrm{ML}}$ is the set of all possible transmit symbol vectors and consists of $Q^P$ elements, $P$ is the subcarrier size of $\Omega_i$, $Q$ is the number of constellation points, and $h^{(l)}_{\mathrm{Data},\, d_3(\Omega_k)}$ denotes the reconstructed channel response, which is constructed by the TDLS with respect to candidates $d_3(\Omega_k)$.
After applying the above error-correction scheme, let $\mathbf{d}_3$ replace $\mathbf{d}_2$ and redo steps 1-4. The iterative structure can be used to improve data accuracy.
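Step (2) of the procedure above can be sketched as follows. The exact definition of the channel difference Δh is not recoverable from the text, so this example assumes a relative magnitude difference between the reconstructed tap and the curve-fitted tap of one path; the function name is hypothetical. The lookup of Table 1 and the localized ML search are not reproduced here.

```python
import numpy as np

def locate_error_burst(h_data, h_cf, eps=0.10):
    """Index of the largest relative channel mismatch, or None if none reaches eps."""
    h_data = np.asarray(h_data, dtype=complex)
    h_cf = np.asarray(h_cf, dtype=complex)
    # Relative difference between reconstructed and curve-fitted taps (assumed form of delta-h).
    delta = np.abs(h_data - h_cf) / np.maximum(np.abs(h_cf), 1e-12)
    burst = np.flatnonzero(delta >= eps)       # candidate error-burst locations
    if burst.size == 0:
        return None                            # no mismatch above eps: skip error correction
    return int(burst[np.argmax(delta[burst])]) # position of the maximum error burst
```

For scale, with QPSK (Q = 4) and the block size P = 6 stated earlier, the localized ML search in step (4) would examine Q^P = 4096 candidate vectors for each corrected block.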

Simulation Results
Based on the previous analyses, simulations are performed to assess the performance of the proposed multistage decision-feedback equalization schemes for LTE uplink systems. Table 2 lists the parameter settings of the simulation environment.
First, this work discusses the MSE performance of the time-varying channel estimation of the TDLS. The MSE performance depends on the polynomial order v and time interval J. Figure 13 shows the MSE performance of TDLS for different polynomial orders v. The MSE clearly varies with the polynomial order v and the E_b/N_0 range. As a rule of thumb drawn from the simulation result in Figure 13, the adaptive order v is selected according to the E_b/N_0 range, as expressed in (33). Based on the adaptive order v in (33), Figure 14 shows the MSE performance of TDLS for different sizes of J. When J ≤ 12, MSE performance improves as time interval J increases. However, when J is too large (J ≥ 100), the group number K is not sufficient to support statistical curve fitting; thus, MSE performance degrades. According to the simulation result in Figure 14, J = 12 is selected as the optimum time interval size.
The second simulation demonstrates the effect of block size D of the GFDE. As shown in Figure 15, the performance of the GFDE improves as block size D increases. At BER = 2 × 10^−4, the gain losses of the GFDE are about 2 dB and 1 dB for D = 4 and D = 8, respectively, as compared to the full-size MMSE equalizer. To reduce computational complexity, the suitable block size of the GFDE is D = 4.
Figure 16 shows the BER performance of cascades of multistage equalizers. BER performance clearly improves as the number of stages increases. The performance of the proposed GFDE (D = 4) and GPIC can approach that of the full-size MMSE equalizer below E_b/N_0 = 20 dB. To enhance performance, the proposed FPIC and error-correction schemes are employed. The performance of the overall receiver design (D = 18) is improved by about 1-2 dB, as compared with that of the full-size MMSE equalizer in Figure 16.
Furthermore, the computational complexity of the proposed receiver and the full-size MMSE receiver is compared as follows. In the first stage, the GFDE scheme of (18)-(19), the number of complex multiplications is about O(D^3 + 2D^2). In the second stage, the GPIC scheme of (21)-(23), the number of complex multiplications is about O(D^3 + 2D^2 + 2D + N log_2 N). In the third stage, the FPIC scheme of (24)-(27) takes advantage of d_1(j) having fixed values (i.e., QPSK symbols: ±1 ± j). Therefore, the complex multiplications for ICI reconstruction can be reduced, and the number of complex multiplications of FPIC is about O(N^2 + 3N + N log_2 N). The fourth stage, the error-correction scheme, includes the SC-FDMA modulation of (3), the channel reconstruction of (29), and the ML search of (32); its number of complex multiplications is about O(M log_2 M + (M − J − 1)(L^3 + L^2 + LJ) + 3(M − J − 1)L). In comparison, the number of complex multiplications of the full-size MMSE is about O(M^3). For example, considering the case (GFDE with D = 18, GPIC, FPIC, and error correction) in Figure 16 with N = 72, M = 128, D = 18, L = 4, and J = 12, the number of complex multiplications of the proposed receiver is about 242,360. For the full-size MMSE receiver with M = 128 in Figure 16, the number of complex multiplications is about 2,097,152. Clearly, the proposed multistage receiver has lower computational complexity than the full-size MMSE receiver.
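The per-stage expressions can be evaluated directly for the quoted parameters, as in the short sketch below. It evaluates each expression exactly once; the text does not state how many groups and iterations are counted in the total for the proposed receiver, so this example only confirms the full-size MMSE figure and shows the size of each single-stage term, without attempting to reproduce the quoted total. The variable names are illustrative.

```python
from math import log2

N, M, D, L, J = 72, 128, 18, 4, 12   # parameters quoted in the text

# Each expression is evaluated once (one group, one iteration); this is an assumption.
gfde_mults = D**3 + 2 * D**2
gpic_mults = D**3 + 2 * D**2 + 2 * D + N * log2(N)
fpic_mults = N**2 + 3 * N + N * log2(N)
err_corr_mults = (M * log2(M)
                  + (M - J - 1) * (L**3 + L**2 + L * J)
                  + 3 * (M - J - 1) * L)
full_mmse_mults = M**3               # full-size MMSE: one M x M system

for name, value in [("GFDE", gfde_mults), ("GPIC", gpic_mults),
                    ("FPIC", fpic_mults), ("error correction", err_corr_mults),
                    ("full-size MMSE", full_mmse_mults)]:
    print(f"{name}: {value:,.0f}")
```

The last line printed is the full-size MMSE count, 128^3 = 2,097,152, which matches the figure given in the text.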

Conclusions
This work proposes a multistage decision-feedback receiver design for LTE uplink systems. First, TDLS channel estimation is applied for the block-type pilot arrangement; then the curve-fitting estimator is applied to interpolate missing channel information between pilot symbols. The low-complexity channel equalizer GFDE is employed. Furthermore, in considering the Doppler effect of mobile time-variant environments, this work employs dual iterative interference cancellation by GPIC and FPIC to mitigate the ICI effect. Finally, the novel error-correction scheme combining TDLS and group maximum likelihood is applied to optimize system performance. Simulation results demonstrate that the proposed channel estimation scheme performs well when a suitable time interval J and polynomial order v are chosen. Owing to the properties of each stage, the proposed receiver design markedly improves BER performance. This multistage design is more flexible than the traditional structure and is feasible for mobile time-variant environments.

Figure 3: The block diagram of multistage decision-feedback receiver.

Figure 6: Real-part mobile channel profile with size of J = 4.

Figure 7: Real-part mobile channel profile with size of J = 100.

Figure 13: The MSE performance of TDLS in different cases of size v.

Table 1: Lookup table of burst error location.