A Multistandard Frequency Offset Synchronization Scheme for 802.11n, 802.16d, LTE, and DVB-T/H Systems

Carrier frequency offset (CFO) synchronization is a crucial issue in the implementation of orthogonal frequency division multiplexing (OFDM) systems. Since current technology tends to implement different standards in the same wireless device, a common frequency synchronization structure is desirable. Choosing the most suitable scheme requires knowledge of the physical frame and of the performance and cost requirements of the system. This paper analyzes the performance and FPGA resource requirements of several data-aided (DA) and decision-directed (DD) schemes for four wireless standards: 802.11n, 802.16d, LTE, and DVB-T/H. Performance results of the different methods are shown as BER plots, and their resource requirements are evaluated in terms of the number of computations and operators needed for each scheme. As a result, a common architecture for the four standards is proposed. It improves the overall performance of the best of the schemes when the four standards are considered, while reducing the required resources by 50%.


Introduction
OFDM has been the focus of a wide variety of studies in wireless communication systems because of its high transmission capability and its robustness to the effects of frequency-selective multipath channels. Several existing and upcoming standards, among them WiFi 802.11n [1], WiMAX 802.16d [2], LTE [3], and DVB-T/H [4,5], are based on the OFDM concept. It is expected that several of them will coexist and, in many cases, operate concurrently on the same wireless terminal. This opens the way for receiver/transmitter algorithm designs in which the basic algorithm structure is shared between the different OFDM-based standards, allowing for both efficient implementations and efficient use of resources on a common baseband processing platform. Several approaches to multistandard solutions can be found in the literature [6][7][8][9], but none of them deals with the synchronization problem in detail.
It is well known that OFDM systems are more sensitive to an offset in the carrier frequency than single-carrier schemes at the same bit rate. This CFO causes loss of orthogonality of the multiplexed signals, creating intercarrier interference (ICI) and introducing a constant increment in the phase of the samples.
Frequency synchronization is often performed in two phases: acquisition and tracking. At the start of the sequence, the acquisition stage is used to perform a first estimation of the CFO of the signal [10][11][12][13][14]. In a circuit-switched system the acquisition phase can be fairly long, since it only represents a small percentage of the total transmitted sequence. Some systems, such as LTE, DVB, and cellular systems, are circuit-switched. In packet-switched systems, such as 802.16d and 802.11n, the acquisition phase is more important since the transmission sequences are short. The most common approach in such systems is to use a preamble for acquisition. As will be shown, the acquisition stage is a well-defined task that can be easily adapted to all the standards being considered. Therefore, the paper focuses especially on the tracking stage.
After acquisition, the problem of tracking has to be solved. Since acquisition is never performed perfectly and conditions are not static in a real system, there still remains a residual CFO that needs to be corrected. The tracking stage can be non-data-aided [15], when no extra information is included in the transmitted data (as in DD methods), or data-aided [12,16], when periodically transmitted training symbols and/or known pilot subcarriers are used.
In this paper, different frequency synchronization schemes are evaluated for the addressed standards with an explicit aim to reuse as much of the algorithm structure as possible when switching between standards, because of the limited resources available in the target architecture. Therefore, algorithm and architectural design are approached together from the beginning of the design flow. In this study, FPGAs have been selected as the target architecture for these systems because of their support for reconfigurability, parallelism, and increased performance over software-based (e.g., DSP) solutions.
The main contributions of this paper are as follows: (1) detailed performance analysis of CFO synchronization schemes (mainstream and alternative) for four current wireless communications standards, (2) comparative evaluation of their computational requirements, (3) proposal of feasible architectures for multistandard devices.
The paper is structured as follows. The OFDM signal and the different standard frames are introduced in Sections 2 and 3. The acquisition and the different tracking schemes are presented in Sections 4 and 5. BER results for the different standards are given in Section 6. Implementation issues are considered in Section 7. Finally, Section 8 concludes the paper.

The OFDM Signal
The baseband scheme of a digitally implemented OFDM transmission system with CFO correction enabled is provided in Figure 1. Considering an OFDM system, the data source emits symbols ($d_i$) which belong to a BPSK, QPSK, 16-QAM, or 64-QAM constellation and are assumed to be equiprobable and statistically independent. The sequence $d_i$ is serial-to-parallel converted into blocks of $N$ symbols ($d_{k,l}$ denotes the $k$th symbol of the $l$th block, where $k = 0, \ldots, N-1$ and $l = -\infty, \ldots, +\infty$). These blocks are generated with period $T_s = T + T_g$ ($T$: useful period, $T_g$: guard interval). After the inverse FFT (IFFT) is applied to each block with period $T_s$, a cyclic prefix (CP) is inserted by prefixing the resulting $N$ samples ($s_{n,l}$, $n = 0, \ldots, N-1$) with a replica of the last $N_g$ samples. Thus, each block is made of $N_s = N + N_g$ samples called an "OFDM symbol".
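The construction of one OFDM symbol described above can be sketched in a few lines of Python. This is only an illustration: the function names, the naive inverse DFT used in place of an IFFT, and the toy sizes ($N = 8$, $N_g = 2$) are assumptions made here, not part of the evaluated system.

```python
import cmath

def idft(block):
    """Naive inverse DFT of one block of N frequency-domain symbols."""
    n_sub = len(block)
    return [
        sum(d * cmath.exp(2j * cmath.pi * k * n / n_sub) for k, d in enumerate(block)) / n_sub
        for n in range(n_sub)
    ]

def ofdm_symbol(block, n_guard):
    """Return N_s = N + N_g samples: cyclic prefix followed by the N IDFT samples.

    n_guard must be at least 1 in this sketch.
    """
    samples = idft(block)
    return samples[-n_guard:] + samples  # prefix = replica of the last N_g samples

# Toy example (assumed sizes): N = 8 QPSK symbols, N_g = 2
qpsk = [complex(a, b) for a, b in [(1, 1), (-1, 1), (1, -1), (-1, -1)] * 2]
symbol = ofdm_symbol(qpsk, n_guard=2)
```

The first $N_g$ samples of `symbol` are identical to its last $N_g$ samples, which is the redundancy later exploited by CP-based acquisition.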
Since the carrier frequency difference between the transmitter and the receiver, $\Delta f$, can be modeled as a time-variant phase offset, $e^{j2\pi\Delta f t}$, the received OFDM signal can be represented as
$$r(t) = \left[s(t) * h(t,\tau)\right] e^{j2\pi\Delta f t} + w(t), \qquad (1)$$
where $w(t)$ is the additive white Gaussian noise (AWGN), $s(t)$ is the transmitted baseband OFDM signal, $h(t,\tau)$ is the channel impulse response with $\tau$ being the delay spread, and "$*$" denotes linear convolution.
Assuming that $r(t)$ is sampled at the transmit interval $T$ with perfect timing, the samples blocked for the $l$th FFT are denoted $r_{n,l}$ (2). The resulting samples from the FFT obtained in (2) are [17]
$$c_{k,l} = e^{j\pi((N-1)/N)\varepsilon}\, e^{j2\pi((lN_s+N_g)/N)\varepsilon}\, \frac{\sin(\pi\varepsilon)}{N\sin(\pi\varepsilon/N)}\, H_{k,l}\, d_{k,l} + \mathrm{ICI}_{k,l} + W_{k,l}, \qquad (3)$$
where $\varepsilon = \Delta f T$ is the CFO normalized with respect to the subcarrier spacing. Likewise, $H_{k,l}$ is the channel coefficient on the $k$th subcarrier with the assumption that the channel is stationary during at least one symbol, $\mathrm{ICI}_{k,l}$ is the intercarrier interference noise due to loss of orthogonality, and $W_{k,l}$ is a zero-mean stationary complex process. The first term is the data value $d_{k,l}$ modified by the channel transfer function, experiencing an amplitude reduction and phase shift due to the frequency offset.
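The amplitude reduction and phase shift of the useful term can be checked numerically. The sketch below assumes the standard closed form $e^{j\pi\varepsilon(N-1)/N}\sin(\pi\varepsilon)/(N\sin(\pi\varepsilon/N))$ for the gain on the wanted subcarrier and compares it with a direct summation of the per-sample CFO rotation. It leaves out the symbol-index factor, the channel, and the noise, and the function names are illustrative.

```python
import cmath
import math

def cfo_useful_gain(eps, n_fft):
    """Closed form: e^{j*pi*eps*(N-1)/N} * sin(pi*eps) / (N * sin(pi*eps/N))."""
    if eps == 0:
        return 1 + 0j
    amplitude = math.sin(math.pi * eps) / (n_fft * math.sin(math.pi * eps / n_fft))
    return cmath.exp(1j * math.pi * eps * (n_fft - 1) / n_fft) * amplitude

def cfo_useful_gain_direct(eps, n_fft):
    """Same quantity obtained by averaging the rotation e^{j*2*pi*eps*n/N} over n."""
    return sum(cmath.exp(2j * math.pi * eps * n / n_fft) for n in range(n_fft)) / n_fft

closed = cfo_useful_gain(0.1, 64)
direct = cfo_useful_gain_direct(0.1, 64)
```

For small normalized offsets the magnitude stays close to one, so the dominant effect on the useful term is the phase rotation, which is what the tracking schemes discussed later aim to remove.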

The Standard Frames
The [...] frequency subcarriers are null. The shape of the WiMAX OFDM signal in the frequency domain is shown in Figure 3.
LTE is a project belonging to the Third Generation Partnership Project (3GPP) to improve the Universal Mobile Telecommunications System (UMTS) and to cope with future communications requirements. LTE uses OFDM in the downlink, which results in high spectral efficiency. It is also designed to be flexible in the channel allocation. In contrast to packet-oriented networks, LTE does not include a preamble to facilitate timing and frequency synchronization. Instead, pilot subcarriers are embedded in the frame as shown in Figure 4. In the normal mode, pilot subcarriers are transmitted every six subcarriers during the first and fifth OFDM symbols of each slot. This paper deals exclusively with the Frequency Division Duplex (FDD) mode defined in the standard.
Systems using DVB standards focus on digital television and data services. Even though the DVB-T standard is prepared for mobile reception, there are some factors that have to be considered when the end device is running under limited power constraints. This was the major motivation to develop a new broadcast standard aimed at handheld devices. This standard is denoted as DVB-H. It contains two major additions to the DVB-T standard, namely, time slicing and a new mode of operation called 4K. However, the physical frame has the same structure as in DVB-T. Therefore, similar synchronization schemes can be used for both standards. DVB-H specifies three possible OFDM modes (2K, 4K, and 8K). As with LTE, DVB-T/H does not include a preamble for timing and frequency synchronization purposes. It defines dedicated synchronization subcarriers embedded into the OFDM data stream: continual (periodicity in the time domain) and scattered pilot subcarriers (periodicity in the frequency domain). Both continual and scattered pilots are transmitted at a boosted power level and their position can be observed in Figure 5.
In order to choose a suitable frequency synchronization scheme, special attention must be paid to the reference OFDM symbols and pilot subcarriers. In 802.11n and 802.16d there is a preamble at the beginning of the frame, whereas in LTE and DVB-T/H there is no preamble. Therefore, the correlation properties introduced by the CP should be used in the acquisition stage for these two standards. Continual pilot subcarriers are defined in 802.11n, 802.16d, and DVB-T/H, but LTE only includes pilot subcarriers at some specific OFDM symbols. Thus, data-aided tracking would perform better in 802.11n, 802.16d, and DVB-T/H than in LTE if the pilots are used for tracking purposes. From these observations, it seems that using a decision-directed algorithm in the tracking stage would lead to a more homogeneous approach in a multistandard system.

CFO Acquisition Schemes
Most of the solutions for acquisition use the aid of pilot symbols, which are assumed to be known at the receiver. An alternative technique is to use the redundant information included in the CP [10]. Furthermore, CFO acquisition can be divided into two steps as explained in [11,12]. In the first step, the fractional part of the CFO is estimated and corrected, allowing for the integer part of the CFO to be estimated and corrected in the second step. The 802.11n and 802.16d standards include a preamble at the beginning of the frame. This preamble has an OFDM symbol with a repeated pattern in the time domain. The Moose algorithm [13] can be used to perform the fractional acquisition stage by using this symbol. Let there be $L$ complex samples in each half of the training symbol, and let the correlation of the two halves be
$$P = \sum_{m=0}^{L-1} r_m^{*}\, r_{m+L}. \qquad (4)$$
Considering the LTF symbol, for example, where the first half is identical to the second one (in time order) except for a phase shift caused by the carrier frequency offset, the normalized frequency offset estimate is obtained from
$$\phi = \mathrm{angle}(P). \qquad (5)$$
Subcarrier spacing for 802.11n is 312.5 kHz. Assuming a 25 ppm local oscillator and a carrier frequency of 2.4 GHz, the signal can experience a CFO of less than ±0.6 times the subcarrier spacing. Thus, the integer estimation of the CFO can be avoided. Similar calculations and conclusions can be obtained for 802.16d.
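A minimal Python sketch of this correlation-based fractional estimator follows. The scaling from the phase to a normalized offset, $\hat\varepsilon = \phi N / (2\pi L)$, is an assumption derived here from a per-sample rotation of $2\pi\varepsilon/N$. The training sequence and the function name are likewise illustrative.

```python
import cmath
import math

def fractional_cfo(samples, half_len, n_fft):
    """Estimate the normalized CFO from two identical blocks spaced half_len samples apart."""
    # Correlation of each sample with its repetition half_len samples later
    p = sum(samples[m].conjugate() * samples[m + half_len] for m in range(half_len))
    phi = cmath.phase(p)  # phase accumulated over half_len samples
    # Assumed scaling: a CFO of eps subcarrier spacings rotates by 2*pi*eps/N per sample
    return phi * n_fft / (2 * math.pi * half_len)

# Noise-free toy example: N = 64, two identical halves of L = 32 samples
n_fft, half_len, true_eps = 64, 32, 0.3
half = [cmath.exp(2j * math.pi * (m * m) / half_len) for m in range(half_len)]
training = half + half
received = [s * cmath.exp(2j * math.pi * true_eps * n / n_fft) for n, s in enumerate(training)]
estimate = fractional_cfo(received, half_len, n_fft)
```

With this scaling the unambiguous range is $|\varepsilon| < N/(2L)$, which is why a CP-based estimate with $L = N$ only resolves the fractional part and a separate integer stage is needed when larger offsets are possible.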
LTE does not include a preamble in its frame, so a blind method should be used to accomplish CFO acquisition. Subcarrier spacing in LTE systems is 15 kHz; thus, the normalized CFO can be higher than one. According to [12], first the fractional part of the CFO can be estimated by using the CP allocated in the OFDM symbol as shown in (4) and (5), where $r_m$ and $r_{m+L}$ are now the cyclic prefix and its copy, and $L = N$. After that, integer estimation can be performed in the frequency domain by using a modification of the algorithm described in [12], given in (6) and (7), where $cp_{l,k}$ are the received pilot subcarriers inserted in the $l$th OFDM symbol, $p_{l,k}$ are the known values of the pilot subcarriers, and $I$ is determined from $[-n_{max}, n_{max}]$. Due to the LTE pilot subcarrier structure, $n_{max} = 5$. By using the known values of the pilot subcarriers in (7), the integer part of the CFO can be calculated using only the first OFDM symbol ($l = 1$).
The DVB-T/H frame does not include a preamble and, like LTE, it has pilot subcarriers in the first OFDM symbol, so a similar approach to LTE acquisition can be used. The main difference between integer estimation in LTE and DVB-T/H is the length of the cyclic prefix and the number of pilot subcarriers, which can vary depending on the transmission mode, thus increasing or decreasing the CFO estimation performance and its computational complexity.
It can be concluded that the same algorithm, (4) and (5), can be applied in the four standards for fractional CFO acquisition by using the CP or the available preamble, whereas a similar method, (6) and (7), can be used for integer acquisition in LTE and DVB-T/H, where it is needed. Since algorithm reuse can be accomplished easily in the acquisition stage, the rest of the paper will focus on the tracking stage.
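To illustrate the idea of a pilot-based integer search over $[-n_{max}, n_{max}]$, the following Python sketch correlates the received subcarriers at shifted pilot positions with the known pilot values and keeps the shift with the largest magnitude. It is a generic search written under assumptions made here and should not be read as the exact metric of (6) and (7). The pilot pattern, pilot values, and function name are hypothetical.

```python
def integer_cfo(rx_subcarriers, pilot_idx, pilot_vals, n_max):
    """Return the subcarrier shift in [-n_max, n_max] that best aligns the known pilots."""
    n_fft = len(rx_subcarriers)
    best_shift, best_metric = 0, -1.0
    for shift in range(-n_max, n_max + 1):
        # Correlate the known pilots with the subcarriers found `shift` positions away
        acc = sum(rx_subcarriers[(k + shift) % n_fft] * p.conjugate()
                  for k, p in zip(pilot_idx, pilot_vals))
        if abs(acc) > best_metric:
            best_shift, best_metric = shift, abs(acc)
    return best_shift

# Toy symbol (assumed layout): 64 subcarriers, one pilot every six, data left at zero
n_fft = 64
pilot_idx = list(range(0, 60, 6))
pilot_vals = [1 + 0j, 1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j,
              -1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j, -1 + 0j]
tx = [0j] * n_fft
for k, p in zip(pilot_idx, pilot_vals):
    tx[k] = p
true_shift = 3
rx = [tx[(k - true_shift) % n_fft] for k in range(n_fft)]  # integer CFO = cyclic shift
detected = integer_cfo(rx, pilot_idx, pilot_vals, n_max=5)
```

With pilots every six subcarriers, shifts that differ by six land on pilot positions again, which is consistent with limiting the search range to $n_{max} = 5$.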

CFO Tracking Schemes
After acquisition, there still remains a small variation in the residual CFO. If that variation is not tracked and corrected, constellation points will fall in a different quadrant after a number of OFDM symbols, thus significantly degrading the system performance. For example, a residual CFO = 0.02 introduces a subcarrier rotation of 22° after three OFDM symbols for a DVB 2K mode with CP = 64 and QPSK constellation. Thus, accuracy and speed of convergence are important when implementing the CFO tracking closed loop. Although this residual CFO also introduces ICI, it can be considered negligible in most cases, depending on the conditions and specifications. Therefore, the tracking effort should be aimed at correcting CFO rotation.
It should be mentioned that for DVB-T/H, channel estimation and equalization could be performed during the whole data transmission by using the continual pilot subcarriers. This equalization would also partially correct the residual CFO rotation. However, even for this standard a residual CFO tracking scheme is highly recommended [12]. The CFO tracking scheme will be more critical for packet-switched systems, such as 802.16d and 802.11n, where channel estimation is performed only at the beginning of the frame by using the preamble.
The so-called decision-directed methods (non-data-aided methods) compare the received data subcarriers with sliced versions (as fed from the demapper) to give a larger number of estimates. The Decision-Directed Time-Frequency Loop (DD-TFL) proposed in [15] for CFO tracking in the 802.11g standard is based on two feedback loops in the time and the frequency domain, and it uses all the data subcarriers to perform the estimations. Adaptations of this scheme for the 802.16d standard are found in [16], where the Decision-Directed Frequency Loop (DD-FL) and Data-Aided Frequency Loop (DA-FL) schemes are presented. DD-FL avoids the use of the time loop and uses fewer subcarriers per symbol to perform the tracking stage. In DA-FL, the pilot subcarriers inserted in the data stream are used instead of the data subcarriers to perform the CFO estimations. DA-FL and DD-FL aim at reducing the CFO tracking computational complexity with almost no performance penalty. Other CFO tracking methods can be found in the literature, such as the classical scheme presented in [12]. However, this DA tracking scheme requires pilot subcarriers in two consecutive OFDM symbols, and this condition is not met by LTE. Therefore, this method is not considered in this work.
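The rotation quoted in this example can be reproduced with a short calculation, assuming that the common phase advances by $2\pi\varepsilon N_s/N$ per OFDM symbol, as the symbol-index term in the FFT output suggests. The 45° threshold used for QPSK below is an added illustration (the angular distance from a constellation point to the nearest decision boundary), not a figure taken from the evaluation.

```python
import math

def rotation_per_symbol_deg(eps, n_fft, n_guard):
    """Phase advance per OFDM symbol, in degrees, for a normalized residual CFO eps."""
    return 360.0 * eps * (n_fft + n_guard) / n_fft

# DVB 2K mode: N = 2048, CP = 64, residual CFO = 0.02
per_symbol = rotation_per_symbol_deg(0.02, 2048, 64)
after_three = 3 * per_symbol                              # about 22 degrees
symbols_to_boundary = math.ceil(45.0 / per_symbol)        # QPSK decision boundary
```

Under these assumptions an uncorrected offset of this size would push QPSK points across a decision boundary within seven OFDM symbols, which gives a sense of how quickly the loop has to converge.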
DA-FL, DD-FL, and DD-TFL can be adapted to other standard frames.The 802.11n, 802.16d, and DVB-T/H frames include pilot subcarriers in every OFDM symbol, whereas LTE includes pilot subcarriers in some specific symbols.Therefore, DA-FL performance is expected to worsen for this standard.
The DA-FL scheme [16] uses pilot subcarriers inserted in the OFDM data symbols.Its structure is represented in Figure 6.
The sequence $c_{k,l}$ after the FFT at the receiver is modified at every subcarrier as given in (8). The corrected data symbols $c_{k,l}$ may then be demapped to a bit stream. In the phase error detector (PED), the received subcarrier pilots, $p_{k,l}$, are used for extracting the error increment $E_{k,l}$ according to one of the algorithms proposed in [18]. In particular, the algorithm selected here to extract the error increment is given in (9), where $p_{k,l}$ are the known values of the pilot subcarriers and sgn() is the sign function. After error extraction, the error increment $E_{k,l}$ is attenuated and enters the filter directly.
Then, the estimated phase error $\Psi_{k,l}$ is applied to the post-FFT data symbol $c_{k,l}$. Therefore, the CFO correction is updated as many times as there are pilot subcarriers inserted in the OFDM symbol. Since this scheme performs the correction in the frequency domain, it corrects the phase rotation and not the ICI introduced by the CFO. An important point is that, by using the algorithm described in (9)-(11), no complex multiplications are needed for the estimation. This is an important improvement over classical tracking schemes such as that in [12].
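The structure of such a pilot-driven loop can be sketched as follows. Since the expressions (8)-(11) are not reproduced here, the sketch relies on assumptions: the correction is taken as a multiplication by $e^{-j\Psi}$, the detector is a common sign-based form, $E = \mathrm{Im}(c)\,\mathrm{sgn}(\mathrm{Re}\,p) - \mathrm{Re}(c)\,\mathrm{sgn}(\mathrm{Im}\,p)$, and the filter is a plain first-order accumulator with attenuation `alpha`. None of these is claimed to be the exact algorithm of [16] or [18].

```python
import cmath

def sgn(x):
    """Sign function returning -1, 0, or +1."""
    return (x > 0) - (x < 0)

def phase_error(corrected, reference):
    """Assumed sign-based detector: sign flips and real additions, no complex product."""
    return corrected.imag * sgn(reference.real) - corrected.real * sgn(reference.imag)

def da_fl_track(received_pilots, known_pilots, alpha, psi=0.0):
    """First-order loop: the phase estimate psi is updated once per pilot subcarrier."""
    for c, p in zip(received_pilots, known_pilots):
        corrected = c * cmath.exp(-1j * psi)      # frequency-domain correction
        psi += alpha * phase_error(corrected, p)  # attenuated error enters the filter
    return psi

# Toy example: BPSK pilots rotated by a constant residual phase of 0.3 rad
known = [1 + 0j, -1 + 0j] * 100
received = [p * cmath.exp(0.3j) for p in known]
psi_hat = da_fl_track(received, known, alpha=0.1)
```

In this sketch only the correction itself uses a complex multiplication; the error extraction and the filter update work on real quantities, in line with the remark above about avoiding complex multiplications in the estimation.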
The structure of the DD-FL scheme [16] is represented in Figure 7. This scheme also uses the error extraction algorithm described by (9). The equations are adapted to a decision-directed scheme by substituting the received pilot subcarriers with the received data samples, and the known values of the pilot subcarriers ($p_{k,l}$) with the samples at the output of the decisor.
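A decision-directed variant of the loop can be sketched in the same spirit. The Python code below assumes a unit-energy QPSK constellation, a hard-decision slicer as the decisor, the sign-based detector $E = \mathrm{Im}(c)\,\mathrm{sgn}(\mathrm{Re}\,\hat d) - \mathrm{Re}(c)\,\mathrm{sgn}(\mathrm{Im}\,\hat d)$, and a first-order accumulator. These choices are illustrative assumptions rather than the exact equations of the scheme.

```python
import cmath
import math

def sgn(x):
    """Sign function returning -1, 0, or +1."""
    return (x > 0) - (x < 0)

def qpsk_decision(sample):
    """Hard decision (decisor) for a unit-energy QPSK constellation."""
    scale = 1 / math.sqrt(2)
    return complex(scale if sample.real >= 0 else -scale,
                   scale if sample.imag >= 0 else -scale)

def dd_fl_track(received_data, alpha, psi=0.0):
    """Same loop as the data-aided case, with decisions replacing the known pilots."""
    for c in received_data:
        corrected = c * cmath.exp(-1j * psi)
        decided = qpsk_decision(corrected)
        error = corrected.imag * sgn(decided.real) - corrected.real * sgn(decided.imag)
        psi += alpha * error
    return psi

# Toy example: noise-free QPSK data rotated by 0.2 rad (well inside the 45 degree limit)
constellation = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
sent = constellation * 50
received = [s * cmath.exp(0.2j) for s in sent]
psi_hat = dd_fl_track(received, alpha=0.1)
```

The loop only converges to the true rotation while the decisions are correct. Noise or a large residual rotation that causes decision errors feeds wrong references into the detector, which is the usual weakness of decision-directed schemes at low SNR.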
The DD-TFL scheme [15] is composed of two tracking loops, as can be observed in Figure 8. The frequency loop uses the information provided by the output of the decisor to build the tracking system. In the time loop, the error $E_{k,l}$ estimated by the decision-directed phase error detector (DD-PED) is fed to the time branch and is averaged before entering the filter. As a result, the pre-FFT sample $r_{n,l}$ is rotated as
$$r_{n,l}\, e^{-j(n+N_g+lN_s)\Psi}. \qquad (12)$$
This time branch is able to correct the ICI introduced by the residual CFO; thus, a better performance is expected when compared to DD-FL.
These tracking schemes can be used on the four standards. The two DD methods can use all or some of the available data subcarriers to perform the tracking. In this work, all data subcarriers are used for 802.11n, eight data subcarriers are used for 802.16d, every 6th subcarrier is used for LTE, and every 38th subcarrier is used for DVB-T/H. By choosing these values, simulations provide meaningful results and simulation times are not prohibitive. The DA method uses all the pilot subcarriers available in the frame.
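The pre-FFT rotation performed by the time branch can be illustrated with a small Python sketch. It assumes that sample $n$ of symbol $l$ is multiplied by $e^{-j(n+N_g+lN_s)\Psi}$, with $\Psi$ the estimated phase increment per sample. The function name and toy sizes are hypothetical.

```python
import cmath

def derotate_symbol(samples, symbol_index, n_fft, n_guard, psi):
    """Rotate the N pre-FFT samples of one OFDM symbol by the accumulated phase."""
    n_sym = n_fft + n_guard
    return [r * cmath.exp(-1j * (n + n_guard + symbol_index * n_sym) * psi)
            for n, r in enumerate(samples)]

# Toy check: a phase ramp with increment psi per sample is removed exactly
n_fft, n_guard, symbol_index, psi = 8, 2, 2, 0.01
n_sym = n_fft + n_guard
ramp = [cmath.exp(1j * (n + n_guard + symbol_index * n_sym) * psi) for n in range(n_fft)]
flat = derotate_symbol(ramp, symbol_index, n_fft, n_guard, psi)
```

Because the rotation is removed sample by sample before the FFT, the orthogonality between subcarriers is restored as well, which is why a time-domain branch can reduce ICI while a purely post-FFT correction cannot.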

BER Results
BER results for the complete synchronization system are obtained for each standard. A Rayleigh channel consisting of two paths is considered. The channel is perfectly estimated at the receiver and it is corrected using zero-forcing equalization. There is no coding of the QPSK signal, so the performance of the different schemes is shown through raw BER values. It is assumed that timing synchronization is perfectly achieved. The BER values are calculated by averaging the bit errors over 10000 frames.
The 802.11n frame is simulated considering a system with a 64-point FFT and four pilots per symbol. The CP is composed of 16 samples. Each frame is composed of 100 OFDM symbols and the normalized CFO introduced in the system is 0.6. A similar frame length and normalized CFO are used for 802.16d. This standard requires a 256-point FFT and $N_g = 32$ is used. In the case of LTE, the frame is composed of 140 OFDM symbols with an FFT size of 512, $N_g = 64$, and CFO = 2.7. Finally, in the case of DVB-H, the frame is composed of 40 OFDM symbols with a 4096-point FFT, $N_g = 128$, and a normalized frequency offset of 2.7. Table 1 summarizes the chosen parameters for the different standards. First of all, preliminary simulations were performed to find the appropriate attenuation ($\alpha_T$, $\alpha_F$) of the loop filters. Table 2 collects the values finally selected. Once the optimum attenuation values for the different schemes and standards were found, the BER results were obtained for a system where both CFO acquisition and tracking were enabled. Acquisition was performed for each standard as explained in Section 4, whereas three different tracking schemes (DA-FL, DD-FL, and DD-TFL) were evaluated for each standard.
Figure 9 shows the BER results for 802.11n. DA-FL obtains the best response and, for low noise values, DD-FL and DD-TFL approach the offset-free case as well. This is because DD schemes rely on hits in the decisor block to work correctly. Hence, when noise decreases and fewer errors occur at the decisor, DD performance increases.
Figure 10 displays the results for 802.16d. The DA-FL scheme improves on the BER obtained by the DD schemes. In a similar way to 802.11n, the DD schemes approach DA-FL performance when the noise decreases. It is possible to improve DD performance in this case by increasing the number of data subcarriers used in the tracking estimation. However, this would also increase the computational requirements.
Figure 11 shows the plot for LTE. It can be observed that DA-FL performance is unacceptable, while the DD schemes obtain BER values close to the offset-free case. This is because pilots are not inserted in every OFDM symbol, so tracking convergence is not fast enough for DA-FL. Thus, this standard encourages the use of DD methods. As expected, DD-TFL behaves better than DD-FL, although the difference is small. Figure 12 displays the results for DVB-T/H. The DA-FL scheme clearly outperforms the DD schemes. That is due to the "small" number of data subcarriers used for CFO tracking. It is possible to improve the DD performance, similarly to 802.16d and LTE, by increasing the number of data subcarriers and the computational complexity.
Therefore, from the previous performance results it can be concluded that DD-TFL is the best option for a common implementation for the four standards, since it slightly improves on the DD-FL performance and DA-FL has an unacceptable performance for LTE.
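The measurement chain described here (zero-forcing equalization with a known channel, followed by a raw QPSK bit-error count) can be sketched as follows. The bit mapping, the channel coefficients, and the function names are assumptions made for illustration, and noise and the fading model are left out.

```python
def zf_equalize(rx_subcarriers, channel):
    """Zero-forcing equalization: divide each subcarrier by its known channel coefficient."""
    return [r / h for r, h in zip(rx_subcarriers, channel)]

def qpsk_bits(symbol):
    """Assumed mapping: one bit per quadrature component, 0 for non-negative values."""
    return (0 if symbol.real >= 0 else 1, 0 if symbol.imag >= 0 else 1)

def raw_ber(tx_symbols, rx_symbols):
    """Uncoded bit error rate between transmitted and received QPSK symbols."""
    errors = total = 0
    for t, r in zip(tx_symbols, rx_symbols):
        for bit_tx, bit_rx in zip(qpsk_bits(t), qpsk_bits(r)):
            errors += bit_tx != bit_rx
            total += 1
    return errors / total

# Toy example with hypothetical per-subcarrier channel coefficients
tx = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
channel = [0.8 + 0.3j, -0.2 + 0.9j, 0.5 - 0.4j, 1.1 + 0j]
rx = [h * s for h, s in zip(channel, tx)]
ber_equalized = raw_ber(tx, zf_equalize(rx, channel))
ber_unequalized = raw_ber(tx, rx)
```

In a full simulation the same counter would be accumulated over all frames, with the CFO acquisition and tracking stages placed before the demapper.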

Implementation Issues
The BER performance of the different schemes has been shown in Section 6. However, there still remains an important issue that needs to be considered for implementation purposes: their computational complexity. This is a key issue when determining the number of hardware resources needed for portable, battery-powered systems. Computations are described in terms of real multiplications (M), additions (S), and multiplications by a constant (MC). A complex multiplication is implemented using 3 M and 5 S. CFO correction is implemented through a complex multiplication. On the other hand, the required FPGA resources are described in terms of embedded multipliers and adders (EM/A). Table 3 describes the three synchronization schemes for each standard according to their (M/S/MC) computations, as millions of operations per second, and their required (EM/A) resources. The computations per second are calculated taking into account the operations performed by each method, including the algorithm, the filter, and the correction, and considering the bit rates defined in the standards. The required resources are obtained by scheduling the operations involved, assuming that they are performed iteratively subcarrier by subcarrier. No other sharing of resources has been considered in the architecture. It can be observed that DA-FL and DD-FL need less than half the number of operations required by DD-TFL. Therefore, DD-TFL would not only require more resources, but would also consume more power. In this framework, a new analysis of the results obtained in Section 6 reveals that the advantage of DD-TFL over DD-FL can be considered negligible. It is also important to note that DD-FL and DD-TFL will increase or reduce their computations (and also their performance) depending on the actual number of data subcarriers in the OFDM symbol. Therefore, when considering computational requirements in addition to performance, it turns out that the best alternative is DD-FL.
Nevertheless, an even better solution can be found by looking at the structure of the three tracking schemes. Since DA-FL and DD-FL use the same estimation algorithm, both schemes can be implemented using the same resources and work for the four different frames (DA-FL for 802.11n, 802.16d, and DVB-T/H, and DD-FL for LTE). To accomplish that, only two memories with the number and position of the pilot or data subcarriers involved in the tracking are needed to switch between DA-FL and DD-FL. This solution also offers more possibilities to reuse the EMs available in the FPGA. Table 4 summarizes the three possible multistandard solutions, considering that the target device is a Virtex 4 xc4vlx60 which contains 66 EM. For each solution, it includes the percentage of EMs used in the FPGA, the resource utilization (RE) described as a percentage of the total time, and the range of signal losses in dB for a target BER = $10^{-4}$. In the case of resource utilization, the percentages are obtained from the ratio of the subcarriers that are being used to calculate the CFO estimates with respect to the total number of subcarriers available in each OFDM symbol. These percentages give an indication of the possibilities of further resource reuse. Some values in the table are given as ranges that include the results for the four standards being evaluated. For example, the DD-FL solution allows a 2% resource utilization for DVB-T/H, 3% for 802.16d, 10% for LTE, and 59% for 802.11n.
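The count of 3 M and 5 S for a complex multiplication corresponds to a well-known rearrangement of the four-multiplier form. One possible factorization is sketched below in Python. The text does not state which variant is used in the architecture, so this particular arrangement and the function name are assumptions.

```python
def complex_mul_3m5s(a, b, c, d):
    """Compute (a + jb)(c + jd) with 3 real multiplications and 5 real additions."""
    k1 = c * (a + b)   # 1 addition, 1 multiplication
    k2 = a * (d - c)   # 1 addition, 1 multiplication
    k3 = b * (c + d)   # 1 addition, 1 multiplication
    real = k1 - k3     # ac - bd
    imag = k1 + k2     # ad + bc
    return real, imag

real, imag = complex_mul_3m5s(1.5, -2.0, 0.25, 3.0)
reference = complex(1.5, -2.0) * complex(0.25, 3.0)
```

Trading one multiplication for three extra additions is attractive on an FPGA where embedded multipliers are the scarce resource, which is the metric used in the resource comparison.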

Conclusions
In this work, a comparison of different frequency synchronization schemes for four wireless communications standards (802.11n, 802.16d, LTE, and DVB-T/H) has been presented, aimed at a multistandard FPGA implementation. The focus is on the tracking stage, as acquisition is performed using the same algorithm for 802.11n, 802.16d, LTE, and DVB-T/H. In the case of 802.11n and 802.16d, only fractional CFO acquisition is performed over the preamble.
Despite the frame differences between the standards, three different methods to accomplish CFO tracking have been evaluated. DA-FL performs well for 802.11n, 802.16d, and DVB-T/H. However, DA-FL performance for LTE is unacceptable because pilot subcarriers are not inserted in every OFDM symbol. DD-TFL is the scheme with the best performance for the four standards but, after analyzing the computational requirements and the possibilities of resource reuse, DD-FL appears as a more balanced solution. Furthermore, a solution that combines DA-FL for the 802.11n, 802.16d, and DVB-T/H standards and DD-FL for LTE, by including a small additional memory to switch between standards, has been proposed, showing overall better performance than DD-TFL and requiring only half of its resources.
× 128 pattern symbol. This work will focus on the uplink frame. Eight boosted subcarriers are allocated for pilot signals and a number of the highest and lowest

Table 1: Parameters for the different standards.

Table 3: Number of operations and resources.

Table 4: Features of the three solutions.