Estimation of Cyclic Shift with Delayed Correlation and Matched Filtering in Time Domain Cyclic-SLM for PAPR Reduction

Time domain cyclic-selective mapping (TDC-SLM) reduces the peak-to-average power ratio (PAPR) in OFDM systems, but the amounts of the cyclic shifts are required to recover the transmitted signal at the receiver. One of the critical issues of the SLM scheme is the transmission of the side information (SI), which reduces the throughput of wireless OFDM systems. The proposed scheme uses delayed correlation and matched filtering (DC-MF) to estimate the amounts of the cyclic shifts at the receiver. In the proposed scheme, the DC-MF is placed after the frequency domain equalization (FDE) to improve the accuracy of cyclic shift estimation. The accuracy rate of the proposed scheme reaches 100% at $E_b/N_0$ = 5 dB, and the bit error rate (BER) improves by 0.2 dB as compared with the conventional TDC-SLM. The BER performance of the proposed scheme is also better than that of the conventional TDC-SLM even when a nonlinear high power amplifier is assumed.


Introduction
Orthogonal frequency division multiplexing (OFDM) is a multicarrier modulation scheme that provides reliable high-speed data transmission because of its high spectral efficiency and its robustness against multipath fading channels. One of the significant problems of the OFDM signal is its high peak-to-average power ratio (PAPR), which requires a wide linear range in a power amplifier (PA). The high PAPR may drive the power amplifier into the saturation region, create interference among subcarriers, and corrupt the spectrum of the signal [1][2][3]. Several PAPR reduction schemes are available, for example, coding, filtering and clipping, and phase manipulation (selective mapping and partial transmit sequences).
Selective mapping (SLM) is one of the popular PAPR reduction schemes without signal distortion. SLM is a probabilistic scheme in which signal candidates (SCs) are generated by multiplying the original signal sequence by phase sequences. The SC with the lowest PAPR is chosen for transmission. In SLM, side information (SI) is needed at the receiver side to recover the transmitted signal. The SI is usually transmitted as a set of bits for every OFDM symbol, and channel coding is required to protect it from a harsh channel. This reduces the throughput of wireless OFDM systems. Moreover, SLM has large computational complexity because it requires several inverse discrete Fourier transform (IDFT) operations, which restricts its implementation. Lower-complexity SLM schemes have also been proposed to solve this problem [4][5][6][7][8][9][10][11][12][13][14][15].
Many schemes have been proposed to exclude the SI [9][10][11][12][13][14][15][16]. The scheme in [9] relies on the difference between the average energies of the extended and nonextended symbols to recover the SI at the receiver. As a consequence, higher order modulation symbols would influence the accuracy of SI detection. The scheme in [10] realizes semiblind SI detection in the SLM. However, this scheme requires embedding the SI in the transmit symbols. In the scheme presented in [11], time domain cyclic-SLM with delayed correlation (DC) is applied to reduce the PAPR and to estimate the amount of a cyclic shift at the receiver without SI transmission. Nevertheless, there is a tradeoff between the amount of PAPR reduction and the BER. The method in [12] has been proposed to further reduce the PAPR of the abovementioned scheme. It uses matched filtering (MF) with a Barker sequence to estimate the amounts of the cyclic shifts. One cause of the estimation error is the multipath components, which distort the outputs of the DC-MF.
In this paper, time domain cyclic-SLM (TDC-SLM) without SI transmission is proposed. The proposed scheme places the DC-MF after the frequency domain equalization (FDE) to remove multipath components in the received signal. At the transmitter side, a transmit signal is generated by the summation of an original signal and signals with cyclic shifts. At the receiver side, the amounts of the cyclic shifts are detected by using the DC-MF. The intervals between the cyclic shifts are designed so that the receiver can distinguish the cyclic shifts from multipath delays with the use of the MF. However, multipath components still deteriorate the accuracy rate of cyclic shift estimation since they generate additional peaks at the outputs of the DC-MF. By placing the DC-MF after the FDE, the proposed scheme improves the accuracy rate and reduces the bit error rate (BER) as compared with the conventional TDC-SLM with the DC-MF in [12].
The rest of this paper is organized as follows. Section 1 contains the introduction. Section 2 explains system models including the OFDM symbol structure, time domain cyclic-selective mapping, channel estimation and frequency domain equalization, and the proposed cyclic shift estimation scheme. In Section 3 the performance results of the proposed scheme are presented and finally Section 4 concludes this paper.

System Model
The OFDM signal in the time domain is given as

$$x[n] = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X[k] \exp\left(j\frac{2\pi nk}{N}\right), \quad n = 0, 1, \ldots, N-1, \tag{1}$$

where $n$ is the time index, $X[k]$ is the data symbol on the $k$th subcarrier, $k$ denotes the subcarrier index, and $N$ is the number of the subcarriers. The OFDM signal can also be defined as a vector

$$\mathbf{x} = \left[x[0], x[1], \ldots, x[N-1]\right]^T. \tag{2}$$

In order to mitigate the intersymbol interference, a guard interval (GI) is needed. The GI can be obtained by copying the last part of the OFDM signal and adding it to the beginning of the signal:

$$\mathbf{x}_{GI} = \left[x[N-N_{GI}], \ldots, x[N-1], x[0], x[1], \ldots, x[N-1]\right]^T, \tag{3}$$

where $N_{GI}$ is the length of the GI.

Time Domain Cyclic-Selective Mapping
The TDC-SLM scheme is shown in Figure 1. The cyclically shifted signal in the TDC-SLM is given as

$$x[n, \Delta_m] = x[(n - \Delta_m) \bmod N], \quad n = -N_{GI}, \ldots, N-1, \tag{4}$$

where $x[n]$ is the OFDM signal in the time domain at the time index of $n$, $N_{GI}$ is the GI length, $x[n, \Delta_m]$ is the SC that is generated by cyclically shifting the OFDM signal by $\Delta_m$, $\Delta_m$ is the amount of the cyclic shift for the $m$th SC, and $\Delta_m \in \{lD\}$, where $l$ is an integer [11,12]. The resolution of the cyclic shifts, $D$, has to be large enough for accurate estimation of the cyclic shifts in a receiver. The transmitter combines the SCs with the original signal in the time domain as follows:

$$s[n] = x_{GI}[n] + \sum_{m=1}^{M} c_m\, x[n, \Delta_m], \tag{5}$$

where $s[n]$ is the transmit signal, $M$ is the number of branches, $c_m$ is the $m$th coefficient in the phase sequence, and $x_{GI}[n]$ is the original signal with the GI. The same set of $\{\Delta_m\}$ is applied over multiple symbols since the corresponding outputs of the DC-MF are averaged to improve the accuracy of cyclic shift estimation. Thus, the set of $\{\Delta_m\}$ is selected so that the maximum PAPR over the symbols for averaging is minimized. Here, the PAPR is calculated for each OFDM symbol period.
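The transmitter-side steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`idft`, `add_gi`, `cyclic_shift`, `tdc_slm`, `papr_db`, `select_shift`) are hypothetical, the IDFT uses a unitary scaling, the shifts of the branches are assumed to be Δ, Δ + D, and Δ + 2D, the length-3 Barker sequence (+1, +1, −1) is assumed as the phase sequence, and the toy sizes (64 subcarriers, GI of 16 samples) are chosen only to keep the pure-Python IDFT fast.

```python
import cmath
import math
import random

def idft(X):
    """Unitary inverse DFT: x[n] = (1/sqrt(N)) * sum_k X[k] * exp(j*2*pi*n*k/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * n * k / N) for k in range(N)) / math.sqrt(N)
            for n in range(N)]

def add_gi(x, n_gi):
    """Copy the last n_gi samples of x to the front as the guard interval."""
    return x[-n_gi:] + x

def cyclic_shift(x, delta):
    """Signal candidate: x[(n - delta) mod N]."""
    N = len(x)
    return [x[(n - delta) % N] for n in range(N)]

def tdc_slm(x, delta, D, phases, n_gi):
    """Original signal with GI plus one shifted copy per phase coefficient.

    Assumption: branch m uses the shift delta + m*D (m = 0, 1, ...).
    """
    tx = add_gi(x, n_gi)
    for m, c in enumerate(phases):
        sc = add_gi(cyclic_shift(x, delta + m * D), n_gi)
        tx = [a + c * b for a, b in zip(tx, sc)]
    return tx

def papr_db(s):
    """Peak-to-average power ratio of one symbol, in dB."""
    power = [abs(v) ** 2 for v in s]
    return 10 * math.log10(max(power) / (sum(power) / len(power)))

def select_shift(symbols, candidates, D, phases, n_gi):
    """First-branch shift that minimises the maximum PAPR over the given symbols."""
    def worst_papr(delta):
        return max(papr_db(tdc_slm(x, delta, D, phases, n_gi)) for x in symbols)
    return min(candidates, key=worst_papr)

# Toy example: two QPSK-modulated OFDM symbols share one set of shifts.
random.seed(1)
N, N_GI, D = 64, 16, 4
BARKER3 = [1, 1, -1]
QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
symbols = [idft([random.choice(QPSK) for _ in range(N)]) for _ in range(2)]
candidates = list(range(16, 33, D))
best = select_shift(symbols, candidates, D, BARKER3, N_GI)
tx = tdc_slm(symbols[0], best, D, BARKER3, N_GI)
print("selected shift:", best, "PAPR of first symbol [dB]:", round(papr_db(tx), 2))
```

Because every shifted copy carries its own GI, the first `N_GI` samples of `tx` equal its last `N_GI` samples, so the superposition still behaves like a cyclic-prefixed symbol at the receiver.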

Channel Estimation and Frequency Domain Equalization
Let $p[n]$ and $r_p[n]$ denote the $n$th transmitted and received preamble signals in the time domain, respectively. At the receiver, the preamble signal on the $k$th subcarrier is demodulated by taking a discrete Fourier transform (DFT) as

$$R_p[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} r_p[n] \exp\left(-j\frac{2\pi nk}{N}\right), \tag{7}$$

where $N$ is the size of the DFT. The estimate of the channel frequency response on the $k$th subcarrier in the frequency domain, $\hat{H}[k]$, is given as

$$\hat{H}[k] = \frac{R_p[k]}{P[k]}, \tag{8}$$

where $P[k]$ is the transmitted preamble symbol on the $k$th subcarrier. Because of the TDC-SLM in the time domain, the channel frequency response needs to be modified during the data period. The superposition of the data sequence works like an artificial multipath on the channel response. To calculate the channel frequency response in the data period, the estimated channel response is converted to the impulse response in the delay domain as follows:

$$\hat{h}[n] = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \hat{H}[k] \exp\left(j\frac{2\pi nk}{N}\right). \tag{9}$$

The weight of the MMSE-FDE on the $k$th subcarrier is

$$G[k] = \frac{\hat{H}^*[k]}{|\hat{H}[k]|^2 + \sigma^2}, \tag{10}$$

where $\hat{H}^*[k]$ denotes the conjugate of the channel frequency response that is obtained from (8) and $\sigma^2$ is the variance of the noise estimated in the receiver. The demodulated signal on the $k$th subcarrier is then

$$\hat{S}[k] = G[k] R[k], \tag{11}$$

where $R[k]$ denotes the signal on the $k$th subcarrier at the receiver.
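The channel estimation, the MMSE weighting, and the modification of the channel response for the data period can be illustrated with a short, noiseless Python sketch. It is only a sketch under assumptions that are not taken from the paper: the helper names (`ls_estimate`, `mmse_weights`, `shift_and_sum`) are hypothetical, the DFT pair uses a unitary scaling, the channel is a toy two-path response, and the shifts (5, 9, 13) and phase coefficients (+1, +1, −1) are arbitrary illustrative values.

```python
import cmath
import math

def dft(x):
    """Unitary DFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N)) / math.sqrt(N)
            for k in range(N)]

def idft(X):
    """Unitary inverse DFT."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * n * k / N) for k in range(N)) / math.sqrt(N)
            for n in range(N)]

def ls_estimate(R_p, P):
    """Per-subcarrier channel estimate: received preamble / known preamble."""
    return [r / p for r, p in zip(R_p, P)]

def mmse_weights(H_hat, noise_var):
    """MMSE-FDE weight per subcarrier: conj(H) / (|H|^2 + noise variance)."""
    return [h.conjugate() / (abs(h) ** 2 + noise_var) for h in H_hat]

def shift_and_sum(h_hat, shifts, phases):
    """Add cyclically shifted, phase-weighted copies to the impulse response."""
    N = len(h_hat)
    return [h_hat[n] + sum(c * h_hat[(n - d) % N] for c, d in zip(phases, shifts))
            for n in range(N)]

# Toy example: two-path channel, no noise, 16 subcarriers.
N = 16
H = [1 + 0.5 * cmath.exp(-2j * math.pi * k / N) for k in range(N)]  # true response
P = [(-1) ** k for k in range(N)]                                   # known preamble symbols
R_p = [h * p for h, p in zip(H, P)]                                 # received preamble
H_hat = ls_estimate(R_p, P)

S = [cmath.exp(1j * math.pi * (2 * k + 1) / 4) for k in range(N)]   # unit-modulus data
R = [h * s for h, s in zip(H, S)]                                   # received data
S_hat = [g * r for g, r in zip(mmse_weights(H_hat, 0.0), R)]        # equalized data

# Channel response for the data period: shift the impulse response, sum, transform back.
shifts, phases = [5, 9, 13], [1, 1, -1]
H_data = dft(shift_and_sum(idft(H_hat), shifts, phases))
```

By the DFT shift theorem, `H_data[k]` equals `H_hat[k]` multiplied by 1 plus the sum of the phase coefficients times exp(−j2πkΔ/N), which is why the superposed copies act like an artificial multipath.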

Cyclic Shift Estimation Scheme.
The DC-MF is applied to the signal in the time domain after the MMSE-FDE to estimate the amounts of the cyclic shifts at the receiver.
The received signal in the time domain after the MMSE-FDE can be written as

$$y[n] = \sum_{k=0}^{N-1} W_{n,k}\, \frac{\hat{H}^*[k]}{|\hat{H}[k]|^2 + (E_s/N_0)^{-1}} \left(H[k]S[k] + Z[k]\right), \tag{12}$$

where $H[k]$ is the channel frequency response, $\hat{H}[k]$ is its estimate, $S[k]$ is the signal component, and $Z[k]$ is the Gaussian noise on the $k$th subcarrier. Furthermore, $y[n]$ is the $n$th received signal, $E_s/N_0$ is the signal-to-noise ratio per sample, $W_{n,k} = \exp[(j2\pi/N)nk]/\sqrt{N}$, $N$ is the size of the DFT, and $(\cdot)^*$ denotes the conjugate. The DC-MF process consists of the DC and the MF as shown in Figure 2. At the transmitter side, the TDC-SLM generates several SCs by applying cyclic shifts to the original signal after the inverse DFT (IDFT) and generates the transmit signal through the summation of the original signal and the SCs. The DC-MF is used to estimate the amounts of the cyclic shifts since they are required to recover the transmit signal.
Basically, the DC process multiplies the received signal in the time domain by the conjugate of the GI sequence and sums the products. The largest peak appears when the last part of the OFDM symbol is multiplied by the conjugate of the GI. The output of the DC is fed into the MF to estimate the set of the cyclic shifts, $\{\Delta_m\}$, by detecting the second largest peak output. The DC-MF processes with 3 branches are shown in Figure 3.
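The idea of correlating the signal with a delayed, conjugated copy of itself and then matched filtering the correlation outputs with the phase sequence can be sketched as follows. This is a simplified illustration rather than the scheme of the paper: the correlation is taken circularly over the whole symbol instead of over the GI sequence, there is no noise or multipath, the maximum is searched only over a candidate range (which excludes the largest peak at zero lag), and Zadoff-Chu sequences are used as test symbols so that the outcome is deterministic. All names (`averaged_dc`, `estimate_first_shift`, `zadoff_chu`, `superpose`) and parameter values are hypothetical.

```python
import cmath
import math

def averaged_dc(symbols, lags):
    """Correlation of each symbol with its delayed conjugate, averaged over the symbols."""
    out = {}
    for tau in lags:
        acc = 0
        for y in symbols:
            N = len(y)
            acc += sum(y[n] * y[(n - tau) % N].conjugate() for n in range(N))
        out[tau] = acc / len(symbols)
    return out

def estimate_first_shift(symbols, candidates, D, phases):
    """Matched filter over lags spaced D apart; returns the best first-branch shift."""
    lags = sorted({d + m * D for d in candidates for m in range(len(phases))})
    dc = averaged_dc(symbols, lags)
    def mf(delta):
        # Weight the correlation at delta + m*D by the conjugate of the m-th coefficient.
        return sum((complex(c).conjugate() * dc[delta + m * D]).real
                   for m, c in enumerate(phases))
    return max(candidates, key=mf)

def zadoff_chu(root, N):
    """Constant-amplitude test sequence with zero circular autocorrelation (even N, odd root)."""
    return [cmath.exp(-1j * math.pi * root * n * n / N) for n in range(N)]

def superpose(x, delta, D, phases):
    """Original symbol plus phase-weighted copies shifted by delta, delta + D, ..."""
    N = len(x)
    return [x[n] + sum(c * x[(n - delta - m * D) % N] for m, c in enumerate(phases))
            for n in range(N)]

N, D, TRUE_SHIFT = 64, 4, 20
BARKER3 = [1, 1, -1]
symbols = [superpose(zadoff_chu(u, N), TRUE_SHIFT, D, BARKER3) for u in (1, 3)]
estimate = estimate_first_shift(symbols, range(12, 33), D, BARKER3)
print("estimated first-branch shift:", estimate)
```

With a Barker sequence as the phase sequence, a candidate that is offset from the true shift by a multiple of `D` only collects the low sidelobes of the sequence autocorrelation, which is what lets the matched filter single out the true shift.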
In the delayed branch of the DC, the received signal is delayed and conjugated, and the output of this branch consists of a signal component and a noise component. After the summation, the outputs of the DC are averaged over the symbols used in the averaging process. The first peak is caused by the GI and is found by maximizing the averaged DC output. In addition, the DC produces the correlation between the GI sequence and the received signal with the delay of $\hat{\Delta}_m$ (Equation (16)), where $\hat{\Delta}_m$ is the candidate for the amount of the cyclic shift on the $m$th branch. This correlation gives the $m$th peak output of the DC.

The outputs of the DC are then passed to the MF to estimate the amounts of the cyclic shifts. Figure 4 shows the structure of the MF. In order to reduce the number of the combinations of $\{\Delta_m\}$, the amounts of the cyclic shifts are selected at intervals of $D$ samples as $\{\Delta, \Delta + D, \ldots, \Delta + (M-1)D\}$. The MF has a delay line in which all the delays are set to $(D-1)$. The cyclic shift of the first branch, $\Delta$, is estimated by maximizing the output of the MF, and $\Delta_{\max}$ denotes the estimated amount of the cyclic shift for the first branch. The channel impulse response in (9) is shifted by the estimated cyclic shifts and summed together with the original impulse response. The received signal in the frequency domain after channel compensation is then obtained by using the resulting channel response on the $k$th subcarrier in the data period.

Table 1 shows the simulation parameters of the proposed scheme, which are adopted from LTE parameters. The number of data subcarriers is 128 and each subcarrier is modulated with QPSK. The DFT size is 256 and the length of the GI is set to 64 samples. The number of symbols for averaging is set to 1, 2, 4, or 8, and the range of the cyclic shift ($\Delta_m$) is limited from 60 to 124. A block interleaver with a size of 16 × 8 is applied.
The number of branches is 3 and a Barker sequence with a length of 3 is used as the phase sequence. A uniform delay profile with 6 paths is assumed as the channel model in the computer simulation. As a nonlinear high power amplifier (HPA), Rapp's solid state power amplifier (SSPA) model with a knee factor of 3 is assumed. The input back-off (IBO) for the HPA is set to 0, 2, and 4 dB.
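For reference, the commonly used form of Rapp's SSPA model can be written compactly in Python. The sketch below assumes the usual AM/AM characteristic with no phase distortion, and it assumes that the IBO is defined as the ratio of the saturation power to the mean input power; the paper does not spell out either definition here, and the function names and sample values are hypothetical.

```python
import math

def rapp_sspa(x, a_sat, p=3.0):
    """Rapp SSPA model: the amplitude is compressed smoothly toward a_sat, the phase is kept.

    p is the knee (smoothness) factor; a larger p gives a sharper transition to saturation.
    """
    return x / (1.0 + (abs(x) / a_sat) ** (2 * p)) ** (1.0 / (2 * p))

def saturation_amplitude(samples, ibo_db):
    """Saturation amplitude for a given input back-off.

    Assumption: IBO [dB] = 10*log10(saturation power / mean input power).
    """
    mean_power = sum(abs(v) ** 2 for v in samples) / len(samples)
    return math.sqrt(mean_power * 10 ** (ibo_db / 10))

# Illustrative samples only; a real evaluation would pass the TDC-SLM transmit signal.
samples = [0.1 + 0.2j, -1.0 + 0.5j, 2.0 - 1.5j, 0.3 - 0.3j]
for ibo_db in (0, 2, 4):
    a_sat = saturation_amplitude(samples, ibo_db)
    out = [rapp_sspa(v, a_sat) for v in samples]
    print(ibo_db, round(a_sat, 3), round(max(abs(v) for v in out), 3))
```

A larger IBO moves the saturation level further above the mean input power, so fewer peaks are compressed; this is the mechanism behind the dependence of the results on the IBO.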

PAPR Reduction.
The PAPR performance curves are evaluated in terms of complementary cumulative distribution functions (CCDFs). The TDC-SLM assumes $M$ = 3 branches with the cyclic shift resolution of $D$ = 4, and the number of the symbols for averaging is selected from 1, 2, 4, or 8. From Figure 5, it can be seen that, in comparison with the original signal, the amounts of PAPR reduction in the TDC-SLM for 1, 2, 4, or 8 symbols for averaging are 2.8 dB, 2.7 dB, 2.6 dB, and 2.5 dB, respectively, at the CCDF of $10^{-4}$. When the number of symbols for averaging is 1, each OFDM symbol has a different set of the cyclic shifts and the best PAPR reduction is achieved. On the other hand, the smallest amount of PAPR reduction is obtained when 8 OFDM symbols have the same set of the cyclic shifts in order to average the corresponding DC outputs. Nevertheless, the difference in the amount of the PAPR reduction between 1 and 8 symbols for averaging is only 0.3 dB at a CCDF of $10^{-4}$.
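An empirical CCDF of the PAPR, as used for this kind of comparison, can be computed as sketched below. The function names and the three toy symbols are hypothetical and serve only to show the calculation; resolving a CCDF level of $10^{-4}$ in practice requires far more than $10^{4}$ simulated symbols.

```python
import math

def papr_db(s):
    """Peak-to-average power ratio of one symbol, in dB."""
    power = [abs(v) ** 2 for v in s]
    return 10 * math.log10(max(power) / (sum(power) / len(power)))

def ccdf(values, thresholds):
    """Empirical CCDF: fraction of the values that exceed each threshold."""
    n = len(values)
    return [sum(1 for v in values if v > t) / n for t in thresholds]

# Three toy "symbols": constant envelope, a single impulse, and a mixed case.
toy_symbols = ([1, 1, 1, 1], [1, 0, 0, 0], [2, 1, 1, 0])
paprs = [papr_db(s) for s in toy_symbols]
thresholds = [3.0, 5.0, 7.0]
for t, prob in zip(thresholds, ccdf(paprs, thresholds)):
    print("Pr(PAPR > %.1f dB) = %.3f" % (t, prob))
```

The amount of PAPR reduction quoted at a given CCDF level is then the horizontal distance, in dB, between the curve of the original signal and the curve of the TDC-SLM signal at that level.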

Accuracy Rate and BER Performance.
The accuracy rate is the ratio of correct estimates of the amounts of the cyclic shifts at the receiver side. The accuracy rate of the cyclic shift estimation is shown in Figure 6. Here, the uniform delay profile channel is assumed. In Figure 6, the accuracy rate of the proposed scheme for 1, 2, 4, or 8 symbols for averaging is around 57.04%, 87.80%, 98.28%, and 100%, respectively, at $E_b/N_0$ = 5 dB. The multipath channel affects the accuracy rate of the cyclic shift estimation. It is observed that the accuracy rate of the proposed scheme with 8 symbols for averaging is the highest. With 8 symbols for averaging, the accuracy rates of the proposed scheme and the conventional TDC-SLM reach 100% at $E_b/N_0$ = 5 dB and $E_b/N_0$ = 7 dB, respectively. The accuracy rate performance is thus 2 dB better than that of the conventional TDC-SLM scheme.
The accuracy rate affects the BER performance. The BER performance of the proposed scheme and the conventional TDC-SLM scheme with 8 symbols for averaging is presented in Figure 7 for the uniform delay profile channel. The difference between the BER with perfect estimation and that of the proposed scheme with 8 symbols for averaging is 0.6 dB, and the proposed scheme is 0.2 dB better than the conventional TDC-SLM scheme.
The BER and the accuracy rate with the HPA are also evaluated. Figure 8 presents the comparison of the accuracy rates between the proposed scheme and the conventional TDC-SLM. The number of the symbols for averaging is set to 8. The accuracy rate of the proposed scheme reaches 100% at $E_b/N_0$ = 10 dB, 11 dB, and 13 dB for IBO of 4 dB, 2 dB, and 0 dB, respectively. On the other hand, the accuracy rate of the conventional TDC-SLM reaches 100% at $E_b/N_0$ = 11 dB, 12 dB, and 14 dB for IBO of 4 dB, 2 dB, and 0 dB, respectively. This shows that the accuracy rate improves when the DC-MF is placed after the MMSE-FDE even when the nonlinearity of the HPA is assumed. The accuracy rate performance is 1 dB better than that of the conventional TDC-SLM scheme. The BERs for the different values of the IBO are also evaluated as depicted in Figure 9. In the conventional TDC-SLM, the required $E_b/N_0$ values at a BER of $10^{-3}$ are 13 dB, 13.3 dB, and 13.8 dB for IBO of 4 dB, 2 dB, and 0 dB, respectively. On the other hand, with the proposed scheme, they are 12.8 dB, 13.1 dB, and 13.3 dB for IBO of 4 dB, 2 dB, and 0 dB, respectively. It is observed that the BER improves as the IBO increases. The BER differences between the conventional TDC-SLM and the proposed scheme are 0.4 dB, 0.3 dB, and 0.4 dB at a BER of $10^{-3}$ for IBO of 4 dB, 2 dB, and 0 dB, respectively. The proposed scheme is shown to improve the accuracy rate and the BER performance as they approach the values with perfect estimation.

Conclusions
In this paper, an SI detection scheme for the TDC-SLM has been proposed. The DC-MF is placed after the MMSE-FDE to remove the effect of the multipath channel. The amount of PAPR reduction with the TDC-SLM is around 2.5 dB as compared with the original signal for 8 symbols for averaging when the resolution of the cyclic shift is $D$ = 4 samples. The accuracy rate of the proposed scheme reaches 100% at $E_b/N_0$ = 5 dB, and the BER difference with 8 symbols for averaging is around 0.6 dB as compared to that with the perfect estimation of the SI. The BER is 0.2 dB better than that of the conventional TDC-SLM scheme. Under the nonlinearity of the HPA, the proposed scheme still improves the BER performance by around 0.4 dB at a BER of $10^{-3}$ for IBO of 4 dB, 2 dB, and 0 dB.