A Wide Lock-Range Referenceless CDR with Automatic Frequency Acquisition



Introduction
Performance of a digital system is determined by the data rate of interchip communication as well as on-chip operating speed. As advances in process technology have driven ever-increasing on-chip operating frequencies, the off-chip interface is becoming the bottleneck to further improvement of system performance. For high-speed chip-to-chip communication, serial link protocols have been widely adopted in various computer-to-peripheral interfaces and have achieved data rates of over 10 Gb/s using differential signaling through a well-defined optical channel [1]. The widespread use of serial links for multiple purposes, however, still presents some challenges which must be overcome by circuit design.
For wide-range CDR, two kinds of circuit schemes have been researched. One is the multirate CDR circuit with multiple reference clocks [2], a single reference clock with a programmable divider [3], or no reference clock [4, 5]. The other is the continuous-rate CDR circuit with a fractional-N divider [6] or without an external reference clock [7–10]. The latter scheme detects a change in the bit rate of the incoming data and adaptively controls the internal wide-range VCO to track the bit rate without harmonic-lock issues. To extract the data frequency directly from an input data stream, several techniques have been presented, using complicated state-machine-based frequency detectors [7–9] or the limited run length of 8B10B coding [10, 11]. These frequency acquisition circuits, however, incur large power consumption and area overhead. Therefore, an efficient frequency acquisition algorithm is required to reduce complexity and power consumption.
This paper presents a 650 Mb/s to 8 Gb/s referenceless CDR with automatic tracking of the data rate [12]. With a novel DLL-based frequency acquisition, the proposed dual-loop CDR shows the highest performance in lock range, power consumption, and size compared with previously reported continuous-rate CDRs.

Circuit Description
Figure 1 shows the proposed CDR, which consists of a DLL-based frequency acquisition loop and a PLL-based loop for the clock and data recovery. In the frequency loop, the voltage-controlled delay line (VCDL) is automatically biased so that the delay of the VCDL, T, is equal to one bit duration, Tb. This frequency loop performs a two-step acquisition procedure: a coarse lock with the coarse delay tracking (CDT) followed by a fine lock with the fine delay tracking (FDT). When CDT ends, the FDT loop is enabled together with the phase loop. A loss of lock detector (LLD) is included in the frequency loop to monitor a change in the data rate during the fine lock state. If the LLD detects a change in the data rate, a Reset signal is generated, forcing the loop to restart from the coarse frequency lock for automatic frequency acquisition.
In the phase loop, a quarter-rate binary phase detector (PD) [13] is used with an 8-phase VCO. Since matching between the VCDL and the VCO is an important assumption in the proposed frequency acquisition, an identical delay circuit is used for both. The delay cell has hi-gain and lo-gain paths for the frequency lock and the phase lock, respectively. The hi-gain path is designed to be 3.5 GHz/V for wide lock range, while the lo-gain path is 150 MHz/V for better jitter performance. One delay stage is implemented by a cascade of three delay cells. In the actual layout, however, the 24 delay cells of the VCO and VCDL are placed alternately for improved matching.
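As a rough consistency check on these gains, the sketch below compares the tuning span implied by the hi-gain path with the clock range a quarter-rate loop would need. It assumes a perfectly linear gain over the 0.5 V to 1.1 V control range quoted for the delay cell and a recovered clock at one quarter of the data rate. Both are simplifying assumptions, and the function names are illustrative only.

```python
# Back-of-the-envelope check; linear VCO gain is an assumption, not a measured result.

def quarter_rate_clock_hz(data_rate_bps):
    """Clock frequency of a quarter-rate CDR for a given data rate (assumed rate / 4)."""
    return data_rate_bps / 4.0

def linear_tuning_span_hz(gain_hz_per_v, v_min, v_max):
    """Frequency span covered by a linear gain over the control-voltage range."""
    return gain_hz_per_v * (v_max - v_min)

# Clock span needed to cover 650 Mb/s to 8 Gb/s at quarter rate.
required_span = quarter_rate_clock_hz(8e9) - quarter_rate_clock_hz(650e6)

# Span offered by 3.5 GHz/V over a 0.5 V to 1.1 V control range.
available_span = linear_tuning_span_hz(3.5e9, 0.5, 1.1)

print(f"required:  {required_span / 1e9:.4f} GHz")
print(f"available: {available_span / 1e9:.4f} GHz")
```

Under these assumptions the hi-gain path offers about 2.1 GHz of span against roughly 1.84 GHz required, which is consistent with the reported lock range.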

Coarse Frequency Tracking Loop
Figure 3 illustrates the operation of CDT and how the delay of the VCDL can be set to Tb, which is performed by successful phase detection from a random NRZ bit stream. Before the coarse lock operation, the loop filter (LF) is initially charged to VDD so that the VCDL exhibits its minimum delay. When the coarse lock starts, the first rising edge of the input data D initiates the phase detection between the rising edges of D and the inverse of the delayed input, Dd. The phase detection can be performed by a typical phase/frequency detector (PFD). Since the PFD is sequential logic based on flip-flops, its initial state determines the pulse widths of the UP/DN signals, so initialization by the first rising edge of D sets the desired initial state for the coarse operation. Since the initial delay of the VCDL is set to the minimum, the PFD generates predominantly DN pulses at the beginning. A pull-down current source then decreases VCH until the polarity of the PFD output changes to UP. Once the UP pulse exceeds the DN pulse, the output of the polarity checker, Pol, is latched low. This stops the discharging of VCH, which marks the end of the coarse lock. The coarse delay tracking is performed through the hi-gain path, while the lo-gain bias is fixed to the center of the control voltage range.
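The coarse lock sequence can be illustrated with a small behavioral model. This is only a sketch: the delay-versus-voltage curve, the VDD value, the step size, and the reduction of the PFD and polarity checker to a single comparison between the VCDL delay and Tb are all hypothetical simplifications, not the circuit's actual characteristics.

```python
# Behavioral sketch of coarse delay tracking (CDT); all numeric values are hypothetical.

def vcdl_delay(vch, v_min=0.5, rate_at_v_min=0.5e9, slope=13e9):
    """Hypothetical monotonic delay model: a higher VCH gives a shorter delay."""
    return 1.0 / (rate_at_v_min + slope * (vch - v_min))

def coarse_lock(bit_period, vdd=1.1, v_min=0.5, step=1e-3):
    """Discharge VCH from VDD until the modeled polarity flips from DN to UP.

    Returns the VCH value at which the coarse lock ends, or None if the
    bit period lies outside the range of the modeled delay line.
    """
    vch = vdd  # loop filter precharged to VDD, so the delay starts at its minimum
    while vch > v_min:
        if vcdl_delay(vch) >= bit_period:  # polarity checker would latch here
            return vch
        vch -= step  # pull-down current source discharges VCH
    return None

tb = 1.0 / 2.5e9  # bit period at 2.5 Gb/s
v_lock = coarse_lock(tb)
print(f"VCH at coarse lock: {v_lock:.3f} V, delay {vcdl_delay(v_lock) * 1e12:.1f} ps")
```

In this model the residual error left after the coarse lock is bounded by the discharge step, which corresponds to the error the fine delay tracking then has to remove.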

Fine Frequency and Phase Tracking Loop
After the coarse lock, FDT takes over the frequency loop. As shown in Figure 4, the rising edges of D and Dd generate auto-pulses on A and B with a pulse width of 5/6 Tb, which is implemented by five-stage replica delay cells. An AND gate generates a window signal, Wdw, to select the appropriate rising edges for the phase detection. Dp and Dpd are delayed versions of D and Dd, respectively, which ensure that the rising edges are in the middle of Wdw when locked. With the replica delay cells, the rising edges of Dp and Dpd are guaranteed to be placed in the middle of the Wdw pulse regardless of changes in the bit rate. By accepting the rising edges of Dp and Dpd only when Wdw is high, valid phase detection is achieved with a simple binary phase detector, and the output of the PD drives a charge pump to perform the fine lock.
Wdw, Dp, and Dpd are also applied to the LLD to detect a change of bit rate during the fine lock state. If the LLD detects a change, the coarse lock procedure starts again for automatic tracking. If the rising edges of Dp and Dpd are not in the Wdw pulse, this indicates loss of lock and the Reset signal is generated. Figure 5 shows the two cases of the loss of lock condition. Figure 6 shows the simulated VCH when loss of lock occurs. When the input data rate changes, the frequency loop detects it and automatically tracks the new data rate by starting the frequency acquisition again.
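The windowing and loss-of-lock check can be sketched as interval arithmetic on edge times. The model below is a hedged illustration: it treats Wdw as the overlap of two equal-width pulses (the AND of A and B) and flags loss of lock unless both the Dp and Dpd edges fall inside that window. This reading of the reset condition, the function names, and the normalized time units are assumptions made for the example.

```python
# Sketch of the Wdw window and loss-of-lock check; times are in units of Tb.

def and_window(a_start, b_start, width):
    """Interval where two equal-width pulses overlap, or None if they never do."""
    start = max(a_start, b_start)
    end = min(a_start, b_start) + width
    return (start, end) if start < end else None

def loss_of_lock(window, dp_edge, dpd_edge):
    """True unless both delayed edges land inside the window (assumed condition)."""
    if window is None:
        return True
    start, end = window
    return not (start <= dp_edge <= end and start <= dpd_edge <= end)

width = 5.0 / 6.0                       # auto-pulse width of 5/6 Tb
wdw = and_window(0.0, 0.1, width)       # pulses A and B nearly aligned
locked_case = loss_of_lock(wdw, 0.45, 0.47)   # both edges near the window center
drifted_case = loss_of_lock(wdw, 0.45, 0.90)  # Dpd edge has drifted outside
no_overlap_case = loss_of_lock(and_window(0.0, 1.0, width), 0.45, 0.47)
print(locked_case, drifted_case, no_overlap_case)
```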

Measurements
For verification, the proposed CDR circuit was implemented in a 65 nm CMOS technology, as shown in Figure 7. The active area is 0.108 mm², including LF capacitors of 80 pF in the frequency tracking loop and 200 pF in the phase tracking loop. With a BER of less than 10⁻¹², the CDR operates over a lock range from 650 Mb/s to 8 Gb/s. Figure 8 shows measured eye diagrams of the quarter-rate recovered data and clock at different data rates.
Figure 9 shows the measured jitter of the quarter-rate recovered clock, which is 9.7 ps rms and 53.3 ps p-p at a data rate of 8 Gb/s. The measured jitter can be decomposed into a pattern-dependent deterministic jitter of 20 ps p-p and a random jitter of 2.8 ps rms. As shown in Figure 10, the CDR also passed the OC-48 jitter tolerance specification at 2.5 Gb/s.
The CDR consumes 20.6 mW at 650 Mb/s and 88.6 mW at 8 Gb/s. Table 1 summarizes the performance of the designed CDR. The proposed DLL-based frequency acquisition scheme achieves an efficient circuit implementation and is well suited to low-power, wide lock-range referenceless CDRs.
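To put the two power figures on a common footing, the following short calculation converts them to energy per bit. The metric itself is not reported in the text; it is derived here only from the stated power and data-rate values, and the helper name is illustrative.

```python
# Energy per bit derived from the reported power and data-rate figures.

def energy_per_bit_pj(power_w, data_rate_bps):
    """Energy per transferred bit in picojoules."""
    return power_w / data_rate_bps * 1e12

low_rate = energy_per_bit_pj(20.6e-3, 650e6)   # 20.6 mW at 650 Mb/s
high_rate = energy_per_bit_pj(88.6e-3, 8e9)    # 88.6 mW at 8 Gb/s
print(f"{low_rate:.1f} pJ/bit at 650 Mb/s, {high_rate:.1f} pJ/bit at 8 Gb/s")
```

This gives roughly 31.7 pJ/bit at the lowest rate and 11.1 pJ/bit at the highest, so the energy cost per bit falls as the data rate rises even though total power increases.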

Conclusion
A referenceless CDR circuit with a wide lock range of 650 Mb/s to 8 Gb/s and automatic tracking of the data rate is proposed. For efficient frequency acquisition under continuous data rate changes, a DLL-based loop is used with a simple phase/frequency detector. The CDR, implemented in 65 nm CMOS, shows a BER of less than 10⁻¹² with the best performance in lock range and power consumption among previously reported continuous-rate CDRs. The proposed DLL-based frequency acquisition scheme achieved simplified circuit realization and shows suitability for the low-power and wide lock-range referenceless CDR.

Figure 2 shows the circuit diagram of the delay cell. Hi pbias and Lo pbias are PMOS gate bias voltages generated by current-mirrored transformations from Hi nbias and Lo nbias, respectively. With a control range from 0.5 V to 1.1 V, the gain of the hi-gain path is designed to be 3.5 GHz/V.

Figure 7: Photomicrograph and layout of the test chip.

Table 1: Performance summary.