This paper presents an improved decision feedforward equalizer (DFFE) for high-speed receivers in the presence of highly dispersive channels. This decision-aided equalization technique has recently been proposed for multigigabit communication receivers, where the use of parallel processing is mandatory. Well-known parallel architectures for the typical decision feedback equalizer (DFE) have a complexity that grows exponentially with the channel memory. Instead, the new DFFE avoids that exponential increase in complexity by using tentative decisions to iteratively cancel the intersymbol interference (ISI). Here, we demonstrate that the DFFE not only achieves performance similar to that of the typical DFE but also reduces the complexity in channels with large memory. Additionally, we propose a theoretical approximation for the error probability in each iteration. As the number of iterations increases, the error probability of the DFFE approaches that of the DFE. These benefits make the DFFE an excellent choice for the next generation of high-speed receivers.
1. Introduction
Future generations of communication systems will operate at multigigabit-per-second data rates on highly dispersive channels [1, 2]. In commercial applications, the digital receiver is often implemented as a monolithic chip in CMOS technology [1]. The maximum clock frequency of state-of-the-art complex digital signal processors in 28 nm CMOS technology is limited to frequencies lower than 1 GHz. Therefore, in order to achieve multigigabit-per-second data rates, parallel processing techniques are required [1].
Maximum likelihood sequence detection (MLSD) and decision feedback equalization (DFE) are two efficient techniques used to compensate the high ISI introduced by such channels as the ones described in [3]. The complexity of the former grows exponentially with the channel memory, regardless of whether parallel processing is used or not. As for the latter, although the complexity of serial implementations grows linearly with channel memory, all presently known parallel processing implementations require that the bottleneck created by the feedback loop be broken using techniques like the ones proposed by [4–6], whose complexity again grows exponentially with the channel memory.
Some algorithms to deal with the drawbacks of the DFE in high-speed applications and parallel processing have been proposed by [4–12]. For example, parallel DFE architectures based on look-ahead pipelined multiplexer loops have been introduced in [6, 7]. These architectures can mitigate the speed limitation of feedback loops by using nested multiplexer loops where the implementation is reported in [10]. Some further improvements to these schemes have been proposed in [8, 9]. However, the implementation complexity of DFE parallel architectures based on look-ahead pipelined multiplexer loops still increases exponentially with the number of feedback taps. Recent works [11, 12] present the concurrent look-ahead technique for high-speed data rate. This scheme reduces the hardware complexity in comparison with a look-ahead pipelined multiplexer loops technique, but the decision loop is not broken.
Iterative interference cancellation and turbo equalization have received increasing attention in recent years [13]. For example, iterative cancellation is proposed in [14–17], where nonlinear equalizers for ISI channels are introduced. This technique uses an iterative algorithm to successively cancel ISI from a block of received data. The algorithm generates symbol decisions whose reliability increases monotonically with each iteration. According to these authors, so far these techniques have not been applied to create efficient pipelined and parallel-processing implementations of equalizer structures for ultra-high-speed applications, despite their interesting characteristics. Therefore, the application of both DFE and MLSD is limited to moderate ISI channels. As a consequence, there is a need for reduced-complexity receivers which can operate efficiently on channels with large ISI.
A preliminary study of a new low-complexity iterative equalization architecture for high-speed receivers is introduced in [18]. The decision feedforward equalizer (DFFE) achieves performance similar to that of the DFE with a parallelizable architecture whose complexity increases only quadratically with the channel memory. For channels with large ISI this results in a dramatic complexity reduction compared with the DFE. The central idea behind the DFFE is the iteration of tentative decisions to improve the accuracy of the ISI estimation. We would like to highlight that tentative decisions have been used in the past to cancel far-end crosstalk (FEXT) interference [19].
Finally, the error probability in the DFE has been widely discussed in the literature with numerous authors who develop different methods to estimate the error probability in DFE [20–24].
In this work, we explain the concept of the DFFE and the implementation complexity of its parallel architectures. Moreover, we propose a theoretical approximation for the error probability in each iteration, which shows that the error probability of the DFFE approaches that of the DFE as the number of iterations increases.
This paper is organized as follows. The concept of the DFFE is explained in Section 2. In Section 3 the performance is evaluated. Section 4 analyzes parallel architectures for the DFFE and their implementation complexity. Finally, conclusions are drawn in Section 5.
2. Decision Feedforward Equalization (DFFE)
To begin with, we will explain the concept of DFFE. For simplicity, we only consider a dispersive channel with postcursor ISI. Our results can be generalized to channels with both pre- and postcursor ISI by combining the DFFE with a feedforward equalizer [3]. Let $y_n$, $\hat{\tilde{a}}_n^{(i)}$, and $L$ be the DFFE input sample, the tentative decision at the $i$th iteration, and the memory of the channel, respectively. At the first iteration, $i=0$, we get the first tentative decision without any cancellation of interference:
$$\hat{\tilde{a}}_n^{(0)} = \mathcal{Q}(y_n), \tag{1}$$
where $\mathcal{Q}(\cdot)$ is the slicer function. This tentative decision can then be used to cancel the postcursor ISI introduced by the first past symbol and thus to improve the accuracy of the detection. By using proper time delays, we can obtain the tentative decision at the second iteration as follows:
$$\hat{\tilde{a}}_n^{(1)} = \mathcal{Q}\left(y_n - f_1\left(\hat{\tilde{a}}_{n-1}^{(0)}\right)\right), \tag{2}$$
where $f_k(\cdot)$ with $0<k<L$ denotes the partial postcursor ISI caused by the past $k$ symbols. This process is repeated at least until $L$ consecutive tentative decisions are available. At this point, a final decision can be obtained from
$$\hat{a}_n = \hat{\tilde{a}}_n^{(L)} = \mathcal{Q}\left(y_n - f_L\left(\hat{\tilde{a}}_{n-1}^{(L-1)},\ldots,\hat{\tilde{a}}_{n-L}^{(0)}\right)\right), \tag{3}$$
where $f_L(\cdot)$ is the total postcursor ISI of the channel. Based on an information theory metric [25], in this work we show that the reliability of the tentative decision $\hat{\tilde{a}}_n^{(i)}$ improves as the iteration index $i$ grows. In this way, both the accuracy of the interference estimate and the performance of the DFFE improve with the number of iterations. Numerical results derived from computer simulations demonstrate that the DFFE can achieve performance similar to the DFE on highly dispersive channels. Furthermore, since tentative decisions are used instead of final decisions to estimate the postcursor ISI, it is possible to implement the DFFE in a feedforward way, which leads to a direct parallel implementation. We show that the computational complexity of the DFFE grows quadratically with $L$. This results in a drastic complexity reduction in comparison to parallel architectures for the DFE, where the computational load grows exponentially with $L$. This favorable tradeoff between performance and complexity makes the DFFE an excellent alternative for implementing high-speed receivers in transmissions over highly dispersive channels.
As we expressed above, the iterative use of tentative decisions to estimate the postcursor ISI is the key to DFFE. In the following section, we use the mutual information [25] to show how the iterations impact the reliability of the tentative decisions. In addition, we study the DFFE performance in transmissions over channels with high memory.
2.1. Architecture of DFFE
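The received sample is given by
$$y_n = a_n + \sum_{k=1}^{L} a_{n-k} d_k + z_n, \tag{4}$$
where $d_k$ with $k=1,\ldots,L$ is the postcursor ISI tap, $a_n$ is the transmitted symbol (e.g., $a_n\in\{\pm1\}$), and $z_n$ is white Gaussian noise with power $\sigma^2$. Assuming that the channel is known at the receiver (i.e., perfect channel estimation), the detected symbol provided by the DFFE at instant $n$ given by (3) can be rewritten as
$$\hat{a}_n = \hat{\tilde{a}}_n^{(R-1)} = \mathcal{Q}\left(y_n - \sum_{k=1}^{L}\hat{\tilde{a}}_{n-k}^{(R-1-k)} d_k\right), \tag{5}$$
where $R$ with $R>L$ is the total number of iterations. The first $L$ tentative decisions are calculated iteratively as follows:
$$\hat{\tilde{a}}_n^{(i)} = \mathcal{Q}\left(y_n - \sum_{k=1}^{i}\hat{\tilde{a}}_{n-k}^{(i-k)} d_k\right), \quad 1\le i<L, \tag{6}$$
with $\hat{\tilde{a}}_n^{(0)}=\mathcal{Q}(y_n)$ for $i=0$. For $L\le i\le R-1$ the sum extends over all $L$ taps, as in (5).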
Figure 1 shows the architecture of the DFFE for a channel with memory $L=3$ and $R=5$. Note that the final decision $\hat{a}_n=\hat{\tilde{a}}_n^{(4)}$ uses past tentative decisions to estimate the postcursor interference, and not previous final decisions as in the DFE. As we will show later, this fact allows the direct parallel implementation of the DFFE.
Figure 1: Example of a 3-tap DFFE ($L=3$) with $R=5$. (a) Note that the latency between the input signal and the decision is $R-1$. (b) The red dashed line denotes the critical path (see Section 4.4).
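The iterations in (1), (5), and (6) can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware architecture of Figure 1: the function names `slicer` and `dffe` are hypothetical, binary symbols in {+1, -1} are assumed, and it is assumed that nothing was transmitted before n = 0, so missing past decisions contribute no ISI.

```python
def slicer(x):
    """Binary slicer Q(.) for symbols in {+1, -1}."""
    return 1 if x >= 0 else -1


def dffe(y, d, R):
    """Return tent, where tent[i][n] is the tentative decision at iteration i.

    y : received samples y_n
    d : postcursor taps [d_1, ..., d_L]
    R : total number of iterations (R > L); tent[R - 1] holds the final decisions
    """
    L = len(d)
    tent = [[slicer(v) for v in y]]          # iteration i = 0, Eq. (1)
    for i in range(1, R):
        current = []
        for n in range(len(y)):
            isi = 0.0
            # Iteration i cancels min(i, L) taps: Eq. (6) for i < L, Eq. (5) for i = R - 1.
            for k in range(1, min(i, L) + 1):
                if n - k >= 0:               # assumption: no symbols before n = 0
                    isi += tent[i - k][n - k] * d[k - 1]
            current.append(slicer(y[n] - isi))
        tent.append(current)
    return tent


# Toy run: a = (+1, -1, +1) sent over d_1 = 0.8, with noise +0.3 on the second sample.
y = [1.0, 0.1, 0.2]
tent = dffe(y, d=[0.8], R=3)
for i, row in enumerate(tent):
    print(i, row)
# 0 [1, 1, 1]
# 1 [1, -1, -1]
# 2 [1, -1, 1]
```

In this toy run the error made at iteration 0 on the second symbol is corrected at iteration 1, but it corrupts the third symbol at that same iteration; the third symbol is only recovered at iteration 2, once the improved tentative decision is used for cancellation.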
2.2. Reliability of the Tentative Decisions
Next, we analyze the mutual information between the transmit symbol $a_n$ and the tentative decision at the $i$th iteration, $\hat{\tilde{a}}_n^{(i)}$, defined by
$$I(a_n,\hat{\tilde{a}}_n^{(i)}) = H(a_n) - H(a_n \mid \hat{\tilde{a}}_n^{(i)}), \tag{7}$$
where $H(\cdot)$ and $H(\cdot\mid\cdot)$ denote entropy and conditional entropy, respectively [25]. Note that $I(a_n,\hat{\tilde{a}}_n^{(i)})$ is the information on $a_n$ contained in $\hat{\tilde{a}}_n^{(i)}$. For example, for binary transmit symbols, $I(a_n,\hat{\tilde{a}}_n^{(i)})=1$ indicates that no error occurs in the tentative decisions (i.e., $\Pr\{\hat{\tilde{a}}_n^{(i)}=a_n\}=1$). On the other hand, when the error rate is so high that the tentative decisions carry no information on the transmit symbol (i.e., $\Pr\{\hat{\tilde{a}}_n^{(i)}\ne a_n\}=1/2$), the mutual information becomes $I(a_n,\hat{\tilde{a}}_n^{(i)})=0$. Thus, it can be concluded that the mutual information (7) provides a measure of the reliability of the tentative decision $\hat{\tilde{a}}_n^{(i)}$.
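As a small numerical illustration of (7), assume equiprobable binary symbols and symmetric decision errors with probability p, so that the path from the transmit symbol to the tentative decision behaves as a binary symmetric channel. Under that assumption (which is ours, not a statement from the paper), (7) reduces to 1 - H_b(p), where H_b is the binary entropy function. The function names below are hypothetical.

```python
import math


def binary_entropy(p):
    """Binary entropy H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information(p_err):
    """I(a; a_hat) in bits, assuming equiprobable a in {+1, -1} and symmetric errors."""
    return 1.0 - binary_entropy(p_err)


for p in (0.0, 1e-3, 0.1, 0.5):
    print(p, mutual_information(p))
```

Under this model the mutual information equals 1 bit for error-free decisions and drops to 0 at an error probability of 1/2, where the decision is statistically independent of the transmitted symbol.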
2.3. Numerical Results
Figure 2(a) depicts the mutual information versus the signal-to-noise ratio (SNR), defined as $\mathrm{SNR}=E\{|a_n|^2\}/\sigma^2$. We consider $a_n\in\{\pm1\}$ and a postcursor ISI channel modeled as
$$d_k = \begin{cases}\alpha^k, & 0<k\le L,\\ 0, & \text{otherwise},\end{cases} \tag{8}$$
with $\alpha$ being a positive number smaller than one. In Figure 2(a) we consider $\alpha=0.6$ with $L=10$ and a DFFE with $R=11$. Notice that the mutual information grows as the SNR increases; in the limit, $I(a_n,\hat{\tilde{a}}_n^{(R-1)})=I(a_n,\hat{a}_n)\to1$ for $\mathrm{SNR}\to\infty$. For a given value of SNR, the minimum mutual information (or reliability) occurs at the first iteration ($i=0$). This can be understood from (1), in which the first tentative decision is obtained directly from the received sample without any cancellation of interference. Nevertheless, although the reliability of $\hat{\tilde{a}}_n^{(0)}$ is low, some information on the transmit symbol $a_n$ is contained in $\hat{\tilde{a}}_n^{(0)}$. This fact is exploited in the second iteration ($i=1$), in which the reliability of $\hat{\tilde{a}}_n^{(1)}$ improves as a result of the partial cancellation of the postcursor ISI based on $\hat{\tilde{a}}_{n-1}^{(0)}$. This process is repeated in the following iterations until the last iteration $i=R-1$ is reached. At this point, the DFFE is able to provide the final decision $\hat{a}_n=\hat{\tilde{a}}_n^{(R-1)}$ with high reliability.
Figure 2: Reliability of the DFFE tentative decisions. (a) Mutual information versus SNR for $\alpha=0.6$, $L=10$, and $R=11$. (b) Mutual information versus number of iterations for different postcursor channels with SNR = 15 dB.
Figure 2(b) shows the mutual information versus the number of iterations for several postcursor ISI channels with SNR = 15 dB. We use $\alpha=0.6$, 0.82, and 0.92 with $L=10$, 30, and 60, respectively. In all cases, it can be observed that the reliability of the tentative decisions improves with the number of iterations. In particular, note that the reliability of the DFFE decisions at $R>L$ tends to reach that of the DFE. This result suggests that the performances of the DFE and the DFFE with $R>L$ iterations should be similar.
3. Performance Evaluation
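From (4) and (5), the slicer input signal at the $i$th iteration, $y_n^{(i)}$, can be expressed as
$$y_n^{(i)} = \begin{cases} a_n + \sum_{k=1}^{L} a_{n-k}d_k + z_n, & i=0,\\ a_n + \sum_{k=1}^{L} a_{n-k}d_k - \sum_{k=1}^{i}\hat{\tilde{a}}_{n-k}^{(i-k)}d_k + z_n, & 0<i<L,\\ a_n + \sum_{k=1}^{L} a_{n-k}d_k - \sum_{k=1}^{L}\hat{\tilde{a}}_{n-k}^{(i-k)}d_k + z_n, & i\ge L. \end{cases} \tag{9}$$
Let $\Psi_n^{(i)}$ be the DFFE-state vector at the $i$th iteration defined by
$$\Psi_n^{(i)} = \begin{cases} (a_{n-1},a_{n-2},\ldots,a_{n-L}), & i=0,\\ (a_{n-1},a_{n-2},\ldots,a_{n-L},\hat{\tilde{a}}_{n-1}^{(i-1)},\hat{\tilde{a}}_{n-2}^{(i-2)},\ldots,\hat{\tilde{a}}_{n-i}^{(0)}), & 0<i<L,\\ (a_{n-1},a_{n-2},\ldots,a_{n-L},\hat{\tilde{a}}_{n-1}^{(i-1)},\hat{\tilde{a}}_{n-2}^{(i-2)},\ldots,\hat{\tilde{a}}_{n-L}^{(i-L)}), & i\ge L. \end{cases} \tag{10}$$
Let $N_i$ denote the dimension of the state vector $\Psi_n^{(i)}$. Thus, observe that
$$\Psi_n^{(i)} \in \{\psi^{(i,0)},\psi^{(i,1)},\ldots,\psi^{(i,2^{N_i}-1)}\}, \tag{11}$$
where $\psi^{(i,0)}=(+1,+1,\ldots,+1)$, $\psi^{(i,1)}=(+1,+1,\ldots,-1)$, …, $\psi^{(i,2^{N_i}-1)}=(-1,-1,\ldots,-1)$ are $N_i$-dimensional vectors. The slicer input signal at the $i$th iteration given by (9) can be rewritten as
$$y_n^{(i)} = g(a_n,\Psi_n^{(i)}) + z_n, \tag{12}$$
where
$$g(a_n,\Psi_n^{(i)}) = \begin{cases} a_n + \sum_{k=1}^{L} a_{n-k}d_k, & i=0,\\ a_n + \sum_{k=1}^{L} a_{n-k}d_k - \sum_{k=1}^{i}\hat{\tilde{a}}_{n-k}^{(i-k)}d_k, & 0<i<L,\\ a_n + \sum_{k=1}^{L} a_{n-k}d_k - \sum_{k=1}^{L}\hat{\tilde{a}}_{n-k}^{(i-k)}d_k, & i\ge L. \end{cases} \tag{13}$$
Then, the probability density function (pdf) given the transmit symbol $a_n$ can be expressed as
$$f_{y\mid a}(y_n^{(i)}\mid a_n) = \sum_{k=0}^{2^{N_i}-1} f_{y\mid a,\Psi}(y_n^{(i)}\mid a_n,\psi^{(i,k)})\,P(\psi^{(i,k)}), \tag{14}$$
where $P(\psi^{(i,k)})=\Pr\{\Psi_n^{(i)}=\psi^{(i,k)}\}$ and
$$f_{y\mid a,\Psi}(y_n^{(i)}\mid a_n,\psi^{(i,k)}) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(1/2\sigma^2)\left(y_n^{(i)}-g(a_n,\psi^{(i,k)})\right)^2}. \tag{15}$$
The symbol error probability at the $i$th iteration is
$$P_e^{(i)} = \Pr\{y_n^{(i)}<0\mid a_n=+1\}\Pr\{a_n=+1\} + \Pr\{y_n^{(i)}\ge0\mid a_n=-1\}\Pr\{a_n=-1\}. \tag{16}$$
Note that $\Pr\{y_n^{(i)}<0\mid a_n=+1\}$ and $\Pr\{y_n^{(i)}\ge0\mid a_n=-1\}$ can be computed by using the pdf given by (14).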
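For the first iteration ($i=0$), the state vector only contains the $L$ past transmitted symbols, so (14)-(16) can be evaluated by direct enumeration. The sketch below assumes independent, equiprobable symbols in {+1, -1}, in which case each state has probability $2^{-L}$ and the two terms of (16) are equal by symmetry. The names `q_func` and `first_iteration_error_prob` are hypothetical, and the enumeration is only practical for small $L$.

```python
import itertools
import math


def q_func(x):
    """Gaussian tail probability Q(x) = Pr{N(0, 1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def first_iteration_error_prob(d, sigma):
    """P_e^(0) from (14)-(16) for taps d = [d_1, ..., d_L] and noise std sigma.

    Assumes i.i.d. equiprobable symbols, so each of the 2^L states has
    probability 2^-L and it suffices to condition on a_n = +1.
    """
    L = len(d)
    total = 0.0
    for past in itertools.product((+1, -1), repeat=L):
        g = 1.0 + sum(a * dk for a, dk in zip(past, d))   # g(a_n = +1, psi)
        total += q_func(g / sigma)                        # Pr{g + z_n < 0}
    return total / 2 ** L


# Channel of (8) with alpha = 0.5 and L = 6, at sigma = 0.25.
taps = [0.5 ** k for k in range(1, 7)]
print(first_iteration_error_prob(taps, sigma=0.25))
```

Later iterations additionally require the joint probabilities of past symbols and tentative decisions, which is where the recursive structure developed in the example of Section 3.1 comes in.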
3.1. Example
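In the following equations we consider a postcursor channel with $L=1$ and $d_1=1$ (i.e., a duobinary channel). At the first iteration, we get
$$\Psi_n^{(0)} = (a_{n-1}), \tag{17}$$
$$g(a_n,\Psi_n^{(0)}) = a_n + a_{n-1}. \tag{18}$$
Note that $N_0=1$ and
$$\Psi_n^{(0)} \in \{\psi^{(0,0)},\psi^{(0,1)}\} \tag{19}$$
with $\psi^{(0,0)}=(+1)$ and $\psi^{(0,1)}=(-1)$. The transmit symbols are assumed independent and identically distributed with
$$\Pr\{a_n=+1\} = \Pr\{a_n=-1\} = \frac{1}{2} \quad \forall n. \tag{20}$$
In this situation, from (17) and (19) note that
$$P(\psi^{(0,k)}) = \frac{1}{2}, \quad k=0,1. \tag{21}$$
The error probability $P_e^{(0)}$ can be derived from (16) and
$$f_{y\mid a}(y_n^{(0)}\mid a_n) = \frac{1}{2}\sum_{k=0}^{1} f_{y\mid a,\Psi}(y_n^{(0)}\mid a_n,\psi^{(0,k)}). \tag{22}$$
At the second iteration, we get
$$\Psi_n^{(1)} = (a_{n-1},\hat{\tilde{a}}_{n-1}^{(0)}), \tag{23}$$
$$g(a_n,\Psi_n^{(1)}) = a_n + a_{n-1} - \hat{\tilde{a}}_{n-1}^{(0)}. \tag{24}$$
In this case, notice that $N_1=2$ and
$$\Psi_n^{(1)} \in \{\psi^{(1,0)},\psi^{(1,1)},\psi^{(1,2)},\psi^{(1,3)}\} \tag{25}$$
with $\psi^{(1,0)}=(+1,+1)$, $\psi^{(1,1)}=(+1,-1)$, $\psi^{(1,2)}=(-1,+1)$, and $\psi^{(1,3)}=(-1,-1)$. From (20) and (23), we get
$$\Pr\{\Psi_n^{(1)}\} = \Pr\{a_{n-1},\hat{\tilde{a}}_{n-1}^{(0)}\} = \Pr\{\hat{\tilde{a}}_{n-1}^{(0)}\mid a_{n-1}\}\Pr\{a_{n-1}\} = \frac{1}{2}\Pr\{\hat{\tilde{a}}_{n-1}^{(0)}\mid a_{n-1}\}. \tag{26}$$
Since
$$\Pr\{\hat{\tilde{a}}_{n-1}^{(0)}\mid a_{n-1}\} = P_e^{(0)}, \quad \hat{\tilde{a}}_{n-1}^{(0)}\ne a_{n-1}, \tag{27}$$
with $P_e^{(0)}$ being the symbol error probability of the first iteration, the probability (26) results in
$$P(\psi^{(1,0)}) = P(\psi^{(1,3)}) = \frac{1}{2}\left(1-P_e^{(0)}\right), \quad P(\psi^{(1,1)}) = P(\psi^{(1,2)}) = \frac{1}{2}P_e^{(0)}. \tag{28}$$
Generalizing, for $i>0$ it is possible to show that
$$P(\psi^{(i,0)}) = P(\psi^{(i,3)}) = \frac{1}{2}\left(1-P_e^{(i-1)}\right), \quad P(\psi^{(i,1)}) = P(\psi^{(i,2)}) = \frac{1}{2}P_e^{(i-1)}. \tag{29}$$
On the other hand, taking into account that
$$g(a_n,\psi^{(i,0)}) = g(a_n,\psi^{(i,3)}) = a_n, \quad g(a_n,\psi^{(i,1)}) = a_n+2, \quad g(a_n,\psi^{(i,2)}) = a_n-2, \tag{30}$$
it is possible to verify that
$$f_{y\mid a,\Psi}(y_n^{(i)}\mid a_n,\psi^{(i,0)}) = f_{y\mid a,\Psi}(y_n^{(i)}\mid a_n,\psi^{(i,3)}) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(1/2\sigma^2)\left(y_n^{(i)}-a_n\right)^2}, \tag{31}$$
$$f_{y\mid a,\Psi}(y_n^{(i)}\mid a_n,\psi^{(i,1)}) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(1/2\sigma^2)\left(y_n^{(i)}-a_n-2\right)^2}, \tag{32}$$
$$f_{y\mid a,\Psi}(y_n^{(i)}\mid a_n,\psi^{(i,2)}) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(1/2\sigma^2)\left(y_n^{(i)}-a_n+2\right)^2}. \tag{33}$$
Thus, at high SNR (i.e., $1/\sigma\gg1$), from (19)–(33) it is possible to show that
$$\begin{aligned} P_e^{(i)} &= \frac{1}{2}\left[\Pr\{y_n^{(i)}<0\mid a_n=+1\} + \Pr\{y_n^{(i)}\ge0\mid a_n=-1\}\right],\\ P_e^{(i)} &= Q\!\left(\frac{1}{\sigma}\right) + \frac{1}{2}P_e^{(i-1)}\left[1-Q\!\left(\frac{1}{\sigma}\right)\right] + \frac{1}{2}P_e^{(i-1)}Q\!\left(\frac{3}{\sigma}\right),\\ P_e^{(i)} &\approx Q\!\left(\frac{1}{\sigma}\right) + \frac{1}{2}P_e^{(i-1)}, \end{aligned} \tag{34}$$
where
$$Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-t^2/2}\,dt. \tag{35}$$
Operating on the recursive form of the error probability (34), it is simple to verify that
$$P_e^{(i)} \approx 2Q\!\left(\frac{1}{\sigma}\right), \quad i\gg1. \tag{36}$$
Since the error probability of the DFE with error propagation is given by [3]
$$P_e^{\mathrm{DFE}} \simeq 2^L Q\!\left(\frac{1}{\sigma}\right), \tag{37}$$
from (36) we can conclude that, for a sufficiently large number of iterations, the performance of the DFFE in the presence of a duobinary channel (i.e., $L=1$) approaches that achieved by the DFE with error propagation. As we shall show later, the proper number of iterations depends strongly on both the noise power and the channel dispersion. Finally, we note that the conclusions derived from this example can be extended to channels with memory $L>1$.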
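The convergence claimed in (36) can be checked numerically by iterating the high-SNR recursion in the last line of (34). The sketch below is an illustration for the duobinary example only; the starting value $P_e^{(0)}=\tfrac{1}{2}Q(2/\sigma)+\tfrac{1}{4}$ is obtained here from (16), (18), and (22), and the function names are hypothetical.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x), Eq. (35)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def duobinary_error_prob(sigma, iterations):
    """Return [P_e^(0), P_e^(1), ...] using the high-SNR recursion of (34)."""
    q1 = q_func(1.0 / sigma)
    pe = 0.5 * q_func(2.0 / sigma) + 0.25    # P_e^(0): g = a_n + a_{n-1} is 0 or +/-2
    history = [pe]
    for _ in range(iterations):
        pe = q1 + 0.5 * pe                   # P_e^(i) ~ Q(1/sigma) + P_e^(i-1) / 2
        history.append(pe)
    return history


sigma = 0.25
history = duobinary_error_prob(sigma, iterations=60)
print(history[0], history[10], history[-1], 2.0 * q_func(1.0 / sigma))
```

Because the recursion halves the distance to its fixed point $2Q(1/\sigma)$ at every step, roughly $\log_2\!\left(P_e^{(0)}/2Q(1/\sigma)\right)$ iterations are needed before the two are of the same order. For this approximate model, that makes explicit why the required number of iterations grows as the noise power decreases.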
3.2. Simulation Results
A theoretically based estimation of the error probability provides an effective tool for designing the DFFE parameters. The design process is simple and consists of two main steps.
(1) Estimate the number of taps for the feedforward and feedback filters according to the expected channel response (similarly to the design of the DFE).
(2) Estimate the number of DFFE iterations based on the performance evaluation. This task can also be achieved by using computer simulations. As an initial point, set $R=L+1$.
Figure 3 shows the contour of the BER as a function of the SNR and the iteration number. In this case, we use a postcursor ISI channel defined by $d_k=\alpha^k$, $0<k\le L$, with $\alpha=0.5$, $L=6$, and $R=20$. We can observe that the performance of the DFFE for $R>6$ is similar in all iterations. Therefore, we conclude that the DFFE with $R=L+1$ achieves the same performance as the traditional DFE, as can be verified from Figure 4. For the DFFE, note the excellent agreement between the values derived from computer simulations and the theoretical prediction given by (16).
Figure 3: BER versus SNR and number of iterations. Postcursor channel with $\alpha=0.5$ and $L=6$.
Figure 4: Performance of DFFE with $R=7$ and DFE. Postcursor channel with $\alpha=0.5$ and $L=6$.
The performance of the DFE and an adaptive DFFE with $R=L+1$ iterations in the presence of different dispersive channels is evaluated in Figure 5. We consider four channels: $\alpha=0.6$, 0.82, 0.92, and 0.95 with $L=10$, 30, 60, and 100, respectively. The adaptive DFFE has been implemented with the least mean square (LMS) algorithm [3] by using the final decision to estimate the error signal. In all cases, it can be observed that DFFE and DFE achieve essentially the same performance. This result agrees with the theoretical analysis presented in the Appendix, where the impact of imperfect channel estimation on the performance of DFE and DFFE is investigated.
Figure 5: Performance of DFE and adaptive DFFE with $R=L+1$ for different postcursor ISI channels.
4. Parallel Implementation and Complexity
4.1. Parallel-Processing DFFE Architecture
As mentioned in Section 1, the DFFE breaks the bottleneck created by the feedback loop of the DFE using tentative decisions in a feedforward fashion. This enables pipelined implementations which are able to operate at high clock rates. Moreover, parallel processing can be used to further increase the throughput and achievable data rate of the DFFE-based receiver. A P-way parallel implementation is shown in Figure 6. Using this architecture, the data rate and throughput may be increased by a factor P with growth in complexity linear in P.
Figure 6: Parallel DFFE architecture for $P=4$ and $L=3$. Blocks DFFE$_n$ are as shown in Figure 1.
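One way to see why the feedforward structure parallelizes so directly is that each final decision depends only on the $R$ most recent input samples, never on a decision produced by another lane. The sketch below checks this numerically. It is a software illustration under our own assumptions (binary symbols, hypothetical function names, no symbols before n = 0), not a model of the architecture in Figure 6: here every lane recomputes its whole window, whereas a hardware design would share tentative decisions between lanes.

```python
import random


def slicer(x):
    return 1 if x >= 0 else -1


def dffe_final(y, d, R):
    """Serial reference: final decisions (iteration R - 1) for all samples in y."""
    L = len(d)
    tent = [[slicer(v) for v in y]]
    for i in range(1, R):
        tent.append([
            slicer(y[n] - sum(tent[i - k][n - k] * d[k - 1]
                              for k in range(1, min(i, L) + 1) if n - k >= 0))
            for n in range(len(y))
        ])
    return tent[R - 1]


def dffe_lane(y, d, R, P, p):
    """Lane p of P: decisions for n = p, p + P, ... from y[n-R+1 .. n] only."""
    decisions = []
    for n in range(p, len(y), P):
        window = y[max(0, n - R + 1):n + 1]
        decisions.append(dffe_final(window, d, R)[-1])
    return decisions


def dffe_parallel(y, d, R, P):
    """Interleave the outputs of P independent lanes."""
    out = [0] * len(y)
    for p in range(P):
        out[p::P] = dffe_lane(y, d, R, P, p)
    return out


random.seed(1)
d = [0.6 ** k for k in range(1, 6)]                      # channel of (8), alpha = 0.6, L = 5
a = [random.choice((+1, -1)) for _ in range(200)]
y = [a[n] + sum(a[n - k] * d[k - 1] for k in range(1, 6) if n - k >= 0)
     + random.gauss(0.0, 0.3) for n in range(200)]
print(dffe_parallel(y, d, R=6, P=4) == dffe_final(y, d, R=6))   # True
```

The equality holds for any input, not just this seed, because iteration $i$ at time $n$ only ever reaches back to sample $n-i$.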
4.2. Complexity of DFFE
Table 1 shows the numbers of adders, registers, and multiplexers for the DFFE, computed under the following assumptions. The multipliers shown in Figure 1 were considered to be 2-to-1 multiplexers (it is assumed that both the positive and negative values of the coefficients $d_k$ are available), which is a correct assumption for binary decisions with values $\pm1$ (e.g., 2-pulse amplitude modulation (PAM) [3]). The number of adders for the DFFE was estimated assuming that the basic building block is a two-input adder.
Table 1: Complexity of the parallel DFFE architecture for 2-PAM and $R>L$.
| Component | Count |
| Adders | $L(R-L/2-1/2)P$ |
| Registers | $\left((R-1)R/2+(R-L)(L+1)L/2+(L^2-1)L/6\right)P$ |
| 2-to-1 Mux | $L(R-L/2-1/2)P$ |
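The expressions of Table 1 are easy to evaluate for a given design point. The following sketch simply transcribes them (the function name `dffe_complexity` is hypothetical) and uses integer arithmetic, which is exact because each of the three terms is always an integer.

```python
def dffe_complexity(L, R, P):
    """Return (adders, registers, muxes) from Table 1 for 2-PAM and R > L."""
    if R <= L:
        raise ValueError("Table 1 assumes R > L")
    adders = L * (2 * R - L - 1) // 2 * P                 # L(R - L/2 - 1/2)P
    registers = ((R - 1) * R // 2
                 + (R - L) * (L + 1) * L // 2
                 + (L * L - 1) * L // 6) * P
    muxes = adders                                        # same expression as the adders
    return adders, registers, muxes


print(dffe_complexity(L=3, R=5, P=1))     # the example of Figure 1: (9, 26, 9)
print(dffe_complexity(L=30, R=31, P=32))  # largest configuration synthesized in Section 4.3
```

For $L=3$ and $R=5$ the adder count of 9 matches the number of cancelled taps per iteration (1 + 2 + 3 + 3), and for $R=L+1$ with $L\gg1$ the adder and register counts approach $L^2P/2$ and $L^3P/6$, respectively.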
Table 2 presents a comparison of the complexity of the DFFE with the DFE architectures proposed in [4, 7, 9, 10]. The numbers of adders and 2-to-1 multiplexers for the parallel DFE schemes were extracted from [4, 7, 9], while the number of registers was estimated based on their architectures. Figure 7 shows the numbers of the three types of components as functions of the number of feedback taps. The most important difference between the DFFE and the DFE proposals considered is that the former does not use look-ahead techniques or multiplexer loops, and this reduces the implementation complexity. In all the cases, the benefits of the DFFE are evident in the presence of highly dispersive channels (i.e., L≫1). A comparison of the complexity for M-PAM is shown in Table 3. We observe that the DFFE still provides a significant reduction of complexity with respect to the DFE architectures [7, 9]. (In M-PAM, multiplication operations are achieved by using M-1 2-to-1 muxes.) This conclusion can be extended to M-QAM where the complexity of both DFE and DFFE is approximately two times the one obtained with M-PAM.
Table 2: Complexity comparison between parallel DFFE and DFE architectures for 2-PAM with $R=L+1$ for $L\gg1$.
| Component | DFFE (this work) | DFE [4] | DFE [7] | DFE [9] | DFE [10] |
| Adders | $L^2P/2$ | $2^LP$ | $2^{L/2}\,2P$ | $2^LP$ | $2^{L+1}P$ |
| Registers | $L^3P/6$ | $\sim2^LP$ | $\sim2^{L/2}(P+1)$ | $L^2+2^LP$ | $(2^L+L)P$ |
| 2-to-1 multiplexers | $L^2P/2$ | $(2^L-1)P$ | $(2^{L/2}-1)\,2P$ | $2^LL(P-L/2+P/L-1)$ | $2^LP$ |
Table 3: Complexity comparison between parallel DFFE and DFE architectures for $M$-PAM with $R=L+1$ for $L\gg1$.
| Component | DFFE (this work) | DFE [7] | DFE [9] |
| Adders | $L^2P/2$ | $M^{L/2}\,2P$ | $M^LP$ |
| Registers | $L^3P/6$ | $\sim M^{L/2}(P+1)$ | $L^2+M^LP$ |
| 2-to-1 multiplexers | $(M-1)L^2P/2$ | $(M^{L/2}-1)\,2P$ | $M^LL(P-L/2+P/L-1)$ |
Figure 7: Number of adders, registers, and 2-to-1 multiplexers versus the number of feedback taps $L$, for the parallel DFFE with $R=L+1$ and the DFE architectures proposed in [4, 7, 9, 10]. Parallelization factor: $P=16$. Modulation format: 2-PAM.
4.3. VLSI Implementation
We consider an application-specific integrated circuit (ASIC) implementation of the proposed DFFE in a 10 Gb/s 2-PAM receiver. The DFFE architecture was successfully synthesized (i.e., with no timing issues) in 28 nm CMOS technology with standard voltage threshold (SVT) transistors for $L=5$, 10, and 30, with $P=16$ ($f_{\text{clock}}=625.0$ MHz) and $P=32$ ($f_{\text{clock}}=312.5$ MHz), using $R=L+1$ iterations. Multiplication operations were implemented by using 2-to-1 multiplexers. The number of bits of the input samples ($N_i$) and taps ($N_c$) has been derived from computer simulations for the different postcursor channels (i.e., $L=5$, 10, and 30). We used $N_c=7$ and $N_i=7$ for $L=5$ and 10. For $L=30$, the number of bits of the input samples was increased to $N_i=8$ (see Figure 8). Adders were implemented with carry propagation; thus $N_c+\log_2(L)$ bits are required to represent the sample at the slicer input. Finally, the slicer uses the MSB of the input sample to control the muxes in order to select the positive or negative coefficient.
Figure 8: BER versus SNR for the DFFE with $L=30$, $R=31$, $N_i=7$ or 8 bits, and $N_c=7$ bits.
Table 4 shows the total number of cells and components normalized to the values of P=16 and L=5. Note that these results agree very well with the expected values derived from the complexity analysis developed in Section 4.2; that is, the complexity increases linearly with the parallelization factor (P) and quadratically with the memory of channel L.
Table 4: Synthesis results for parallel DFFE architecture for 2-PAM and $R=L+1$ with 28 nm CMOS technology.
| $f_{\text{clock}}$ (MHz) | $P$ | $L$ | Number of cells† | Number of components† |
| 625.0 | 16 | 5 | 1.00 | 1.00 |
| 625.0 | 16 | 10 | 5.19 | 4.18 |
| 312.5 | 32 | 5 | 1.96 | 2.00 |
| 312.5 | 32 | 10 | 10.03 | 9.62 |
| 312.5 | 32 | 30 | 180.65 | 159.00 |
†The total number of cells and components normalized to the values of $P=16$ and $L=5$.
4.4. Analysis of the Critical Path
The speed of the different DFE architectures is related to their critical paths. The existing parallel DFE architectures of [4, 9, 10] are faster than the DFFE. However, they are not considered for a speed comparison because of their prohibitively high implementation complexity on channels with high ISI ($L\gg1$). On the other hand, the critical path of the less complex DFE solution proposed in [7] is given by $T_{\text{DFE-[7]}}\approx(1/(L/2+1))T_{\text{add}}+\log_2(M)T_{\text{mux}}$ for $M$-PAM, where $T_{\text{mux}}$ and $T_{\text{add}}$ are the multiplexer and adder delays, respectively. Note that for $L\gg1$ the adder term becomes negligible, so $T_{\text{DFE-[7]}}$ is practically independent of the channel memory $L$. For example, for 28 nm CMOS technology, $T_{\text{mux}}\approx0.05$ ns and $T_{\text{add}}\approx0.10$ ns; therefore, the maximum data rates with $P=1$ for 2-PAM and 4-PAM are ~17.8 and ~18.8 Gb/s, respectively.
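The quoted data rates can be reproduced from the critical-path expression once a value of $L$ is fixed. The text does not state which $L$ was used for this example; the sketch below assumes $L=30$ (the largest memory considered in Section 4.3), which is our inference because it yields the reported figures. Function names are hypothetical.

```python
import math


def dfe7_critical_path_ns(L, M, t_add=0.10, t_mux=0.05):
    """Critical path of the DFE of [7] in ns: T_add / (L/2 + 1) + log2(M) * T_mux."""
    return t_add / (L / 2 + 1) + math.log2(M) * t_mux


def max_bit_rate_gbps(t_crit_ns, M, P=1):
    """One M-PAM symbol (log2(M) bits) per lane per critical-path delay."""
    return P * math.log2(M) / t_crit_ns


for M in (2, 4):
    t = dfe7_critical_path_ns(L=30, M=M)      # L = 30 is an assumption, see above
    print(M, round(t, 5), round(max_bit_rate_gbps(t, M), 1))
# 2 0.05625 17.8
# 4 0.10625 18.8
```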
The critical path for the DFFE is shown in Figure 1. Notice that the delay of the critical path, given by $T_{\text{DFFE}}\approx LT_{\text{add}}+\log_2(M)T_{\text{mux}}$, increases linearly with the channel memory. As shown in Section 4.3, no timing issues have been observed with $L=30$ and $P=32$ for 2-PAM with $f_{\text{clock}}=312.5$ MHz in 28 nm CMOS technology. Thus, the maximum data rates achieved by the DFFE for 2-PAM and 4-PAM are 10 and ~20 Gb/s, respectively. (Since $L\gg1$ and $T_{\text{mux}}<T_{\text{add}}$, $T_{\text{DFFE}}$ is dominated by the term $LT_{\text{add}}$; therefore, the impact of increasing the constellation size from 2 to 4 on the critical path is small.) On the other hand, for $L=30$ the relative complexity of the DFE [7] with $P=1$ ($\propto2M^{L/2}$) with respect to the DFFE with $P=32$ ($\propto32(M-1)L^2$) is (a) $2\times2^{30/2}/(32\times30^2)=2.28$ for 2-PAM and (b) $2\times4^{30/2}/(32\times3\times30^2)=2.49\times10^4$ for 4-PAM. Therefore, the DFFE is able to provide high data rates (e.g., >10 Gb/s) using existing CMOS technology with an implementation complexity lower than that of the least complex parallel DFE, proposed in [7].
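The arithmetic in this paragraph can be rechecked with a short script. It uses only the delay values, clock frequency, and proportionality expressions quoted above; the function names are hypothetical, and the assumption that each of the $P$ lanes delivers one symbol per clock cycle is ours.

```python
import math


def dffe_critical_path_ns(L, M, t_add=0.10, t_mux=0.05):
    """T_DFFE ~ L * T_add + log2(M) * T_mux, in ns."""
    return L * t_add + math.log2(M) * t_mux


def dffe_bit_rate_gbps(P, f_clock_mhz, M):
    """P lanes, one M-PAM symbol per lane per clock cycle."""
    return P * f_clock_mhz * math.log2(M) / 1000.0


def dfe7_to_dffe_complexity_ratio(L, M, P_dffe):
    """(2 * M^(L/2)) / (P_dffe * (M - 1) * L^2), with the DFE of [7] at P = 1."""
    return 2 * M ** (L / 2) / (P_dffe * (M - 1) * L ** 2)


clock_period_ns = 1000.0 / 312.5                       # 3.2 ns at 312.5 MHz
for M in (2, 4):
    print(M,
          round(dffe_critical_path_ns(30, M), 2),      # 3.05 and 3.1 ns, both below 3.2 ns
          dffe_bit_rate_gbps(32, 312.5, M),            # 10.0 and 20.0 Gb/s
          dfe7_to_dffe_complexity_ratio(30, M, 32))    # about 2.28 and 2.49e4
```

With these delay values the estimated critical path for $L=30$ stays just below the 3.2 ns clock period for both constellations, which is consistent with the reported absence of timing issues.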
5. Conclusions
In this paper we have proposed and analyzed the DFFE, a low-complexity iterative equalization architecture for high-speed receivers which uses tentative decisions in a feedforward way to estimate postcursor ISI. This central feature lends itself well to a simple parallel implementation, resulting in a reduction of complexity. Using typical examples, we have shown that the DFFE achieves performance similar to that of the DFE architecture. Moreover, we have proposed a theoretical approximation to estimate the error probability, which allows us to demonstrate that the DFFE reaches the same performance as the DFE when the number of iterations increases. These advantages make the DFFE an excellent choice for high-speed receivers required to operate over highly dispersive channels. Furthermore, owing to the flexibility of the DFFE, the architecture can be combined with traditional linear feedforward equalizers or the Viterbi algorithm (VA) [3] to compensate channel impairments in the presence of both pre- and postcursor ISI.
Appendix: Impact of Imperfect Channel Estimation
The DFFE is most attractive on channels with high ISI (i.e., $L\gg1$). In this regime, it is possible to show that the impact of imperfect channel estimation is similar in both equalizers, that is, DFE and DFFE. The received input sample $y_n$ can be expressed as
$$y_n = a_n + \sum_{k=1}^{L} a_{n-k}d_k + z_n, \tag{A.1}$$
where $d_k$ with $k=1,\ldots,L$ is the postcursor ISI tap, $a_n$ is the transmitted symbol, and $z_n$ is white Gaussian noise with power $\sigma^2$. The signal (A.1) can be rewritten as
$$y_n = a_n + \sum_{k=1}^{L} a_{n-k}d_k + z_n = a_n + \sum_{k=1}^{L} a_{n-k}\hat{d}_k + \sum_{k=1}^{L} a_{n-k}\Delta_k + z_n, \tag{A.2}$$
where $\hat{d}_k$ and $\Delta_k$ denote the tap estimated at the receiver and the estimation error, respectively (i.e., $d_k=\hat{d}_k+\Delta_k$). Since $L\gg1$ and the symbols $a_n$ are assumed independent and identically distributed (iid), from the central limit theorem note that the term
$$r_n = \sum_{k=1}^{L} a_{n-k}\Delta_k \tag{A.3}$$
can be modeled as a zero-mean Gaussian random variable with variance $\sigma_r^2$. Therefore, the signal at the input of the receiver with imperfect channel estimation can be seen as
$$y_n = a_n + \sum_{k=1}^{L} a_{n-k}\hat{d}_k + \tilde{z}_n, \tag{A.4}$$
where
$$\tilde{z}_n = r_n + z_n \tag{A.5}$$
is zero-mean Gaussian noise with power $\sigma_r^2+\sigma^2$. Thus, from (A.4) and (A.5) we can conclude that the impact of the imperfect channel estimation on the performance of DFE and DFFE will be similar.
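For reference, the variance $\sigma_r^2$ in (A.3) has a simple closed form under additional assumptions that we state explicitly here: the symbols are zero-mean, iid, of unit power (e.g., $a_n\in\{\pm1\}$), and the estimation errors $\Delta_k$ are treated as fixed constants. In that case the cross terms vanish and

```latex
\begin{aligned}
E\{r_n\} &= \sum_{k=1}^{L} \Delta_k\, E\{a_{n-k}\} = 0,\\
\sigma_r^2 = E\{r_n^2\} &= \sum_{k=1}^{L}\sum_{j=1}^{L} \Delta_k \Delta_j\, E\{a_{n-k}a_{n-j}\}
 = \sum_{k=1}^{L} \Delta_k^2 ,\\
\mathrm{SNR}_{\text{eff}} &= \frac{E\{|a_n|^2\}}{\sigma^2 + \sigma_r^2}
 = \frac{1}{\sigma^2 + \sum_{k=1}^{L}\Delta_k^2}.
\end{aligned}
```

The quantity $\mathrm{SNR}_{\text{eff}}$ is introduced here only for illustration. It reflects the degradation implied by (A.4) and (A.5), which is the same for both equalizers because neither the DFE nor the DFFE can cancel $r_n$.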
Acknowledgments
This paper has been supported in part by the ANPCyT (PICT2008-1256, PRH-203), Fundación Tarpuy, and Fundación Fulgor.
References
[1] O. E. Agazzi, M. R. Hueda, D. E. Crivelli, H. S. Carrer, A. Nazemi, G. Luna, F. Ramos, R. Lopez, C. Grace, B. Kobeissy, C. Abidin, M. Kazemi, M. Kargar, C. Marquez, S. Ramprasad, F. Bollo, V. Posse, S. Wang, G. Asmanis, G. Eaton, N. Swenson, T. Lindsay, and P. Voois, "A 90 nm CMOS DSP MLSD transceiver with integrated AFE for electronic dispersion compensation of multimode optical fibers at 10 Gb/s."
[2] D. E. Crivelli, H. S. Carrer, and M. R. Hueda, "Adaptive digital equalization in the presence of chromatic dispersion, PMD, and phase noise in coherent fiber optic systems," in Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM '04), December 2004, pp. 2545-2551, paper SP08-3.
[3] J. R. Barry, E. A. Lee, and D. G. Messerschmitt.
[4] S. Kasturia and J. H. Winters, "Techniques for high-speed implementation of nonlinear cancellation."
[5] K. K. Parhi, "Pipelining in algorithms with quantizer loops."
[6] K. K. Parhi and D. G. Messerschmitt, "Pipeline interleaving and parallelism in recursive digital filters I. Pipelining using scattered look-ahead and decomposition."
[7] C. H. Lin, A. Y. Wu, and F. M. Li, "High-performance VLSI architecture of decision feedback equalizer for gigabit systems."
[8] K. K. Parhi, "Design of multigigabit multiplexer-loop-based decision feedback equalizers."
[9] D. Oh and K. Parhi, "Low complexity design of high speed parallel decision feedback equalizers," in Proceedings of the International Conference on Application-specific Systems, Architectures and Processors (ASAP '06), 2006, pp. 118-124, doi: 10.1109/ASAP.2006.43.
[10] C. S. Lin, Y. C. Lin, S. J. Jou, and M. T. Shiou, "Concurrent digital adaptive decision feedback equalizer for 10GBase-LX4 ethernet system," in Proceedings of the IEEE Custom Integrated Circuits Conference (CICC '07), September 2007, pp. 289-292, doi: 10.1109/CICC.2007.4405735.
[11] Y. C. Lin, M. T. Shiue, and S. J. Jou, "10Gbps decision feedback equalizer with dynamic lookahead decision loop," in Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '09), May 2009, pp. 1839-1842, doi: 10.1109/ISCAS.2009.5118136.
[12] Y. Lin, S. Jou, and M. Shiue, "High throughput concurrent lookahead adaptive decision feedback equaliser."
[13] J. Andrews, "Interference cancellation for cellular systems: a contemporary overview."
[14] A. M. Chan and G. W. Wornell, "A class of block-iterative equalizers for intersymbol interference channels: fixed channel results."
[15] A. M. Chan and G. W. Wornell, "A new class of efficient block-iterative interference cancellation techniques for digital communication receivers."
[16] Y. C. Liang, S. Sun, and C. K. Ho, "Block-iterative generalized decision feedback equalizers for large MIMO systems: algorithm design and asymptotic performance analysis."
[17] B. Cardiff, B. Gaffney, and A. D. Fagan, "Multiple decision feedback equalizers for vector systems with complexity/performance tradeoff," in Proceedings of the 16th International Conference on Telecommunications (ICT '09), May 2009, pp. 303-308, doi: 10.1109/ICTEL.2009.5158663.
[18] A. L. Pola, D. E. Crivelli, J. E. Cousseau, O. E. Agazzi, and M. R. Hueda, "A new low complexity iterative equalization architecture for high-speed receivers on highly dispersive channels: decision feedforward equalizer (DFFE)," in Proceedings of the IEEE International Symposium of Circuits and Systems (ISCAS '11), May 2011, pp. 133-136, doi: 10.1109/ISCAS.2011.5937519.
[19] J. Chen, Y. Gu, and K. K. Parhi, "Novel FEXT cancellation and equalization for high speed ethernet transmission."
[20] D. George, R. Bowen, and J. Storey, "An adaptive decision feedback equalizer."
[21] D. L. Duttweiler, J. E. Mazo, and D. G. Messerschmitt, "An upper bound on the error probability in decision-feedback equalization."
[22] J. O'Reilly and A. de Oliveira Duarte, "Error propagation in decision feedback receivers."
[23] S. A. Alekar and N. C. Beaulieu, "Upper bounds to the error probability of decision feedback equalization."
[24] J. E. Smee and N. C. Beaulieu, "Error-rate evaluation of linear equalization and decision feedback equalization with error propagation."
[25] T. M. Cover and J. A. Thomas.