166 • 2010 IEEE International Solid-State Circuits Conference
ISSCC 2010 / SESSION 8 / HIGH-SPEED WIRELINE TRANSCEIVERS / 8.6
8.6 A Fractional-Sampling-Rate ADC-Based CDR with Feedforward Architecture in 65nm CMOS
Oleksiy Tyshchenko1, Ali Sheikholeslami1, Hirotaka Tamura2, Yasumoto Tomita2, Hisakatsu Yamaguchi2, Masaya Kibune2, Takuji Yamamoto2
1University of Toronto, Toronto, Canada
2Fujitsu Laboratories, Kawasaki, Japan
ADC-based CDRs take digital samples of the received signal to recover the clock and data. Digital representation of the signal allows for extensive channel equalization in the digital domain. Recently reported ADC-based CDRs sample the signal at 1× or 2× the baud rate. The 1× CDR aligns the sampling clock with the signal using a phase-tracking feedback loop [1-2], which requires a voltage-controlled oscillator or phase interpolator, both analog circuits, to adjust the phase of the sampling clock. To eliminate these analog circuits (and their phase control) in favor of an all-digital implementation, a blind-sampling ADC-based CDR (top of Fig. 8.6.1) samples the received signal at 2× without phase locking to the signal. The CDR then interpolates between the blind samples to obtain a new set of samples in order to recover the phase and data [3-4]. The doubling of the sampling rate, however, increases the ADC power consumption or, equivalently, reduces the maximum baud rate due to the conversion-rate limitations of ADCs.
This paper presents a new fractional-sampling-rate (FSR) CDR architecture, shown in Fig. 8.6.1, that samples the received signal blindly at a fractional rate of 1.45×, hence reducing the ADC power per Gb/s of data rate by 27.3% compared to the 2× architecture. This architecture uses a digital phase detector (PD) that estimates the data phase directly from the blind digital samples, thus eliminating the need for interpolation. This PD enables data recovery in a feed-forward path, further simplifying the CDR architecture. Measurements of a test chip fabricated in 65nm CMOS confirm that the FSR CDR successfully recovers data with BER<10^-13 at 6.875Gb/s from samples taken at 10GS/s.
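The 1.45× ratio and the 27.3% figure follow directly from the two rates quoted above. The short sketch below reproduces that arithmetic; the function names are illustrative, and the linear scaling of ADC power with samples per UI is an assumption made here to match the stated saving, not a model taken from the paper.

```python
from fractions import Fraction

BAUD_RATE_GBPS = Fraction(6875, 1000)  # 6.875 Gb/s data rate (from the text)
SAMPLE_RATE_GSPS = Fraction(10)        # 10 GS/s aggregate ADC rate (from the text)


def oversampling_ratio(sample_rate, baud_rate):
    """Samples taken per unit interval (UI)."""
    return sample_rate / baud_rate


def adc_power_saving_vs_2x(ratio):
    """Relative ADC power saving per Gb/s against a 2x sampling receiver.

    Assumption: ADC power per Gb/s scales linearly with samples per UI.
    """
    return 1 - ratio / 2


ratio = oversampling_ratio(SAMPLE_RATE_GSPS, BAUD_RATE_GBPS)
saving = adc_power_saving_vs_2x(ratio)
print(f"samples per UI: {float(ratio):.4f}")
print(f"ADC power saving vs 2x: {float(saving) * 100:.1f}%")
```

The exact ratio is 16/11 samples per UI, which the text rounds to 1.45×.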
A block diagram of the CDR architecture is shown in Fig. 8.6.2. We blindly sample a 6.875Gb/s signal with four time-interleaved 2.5GS/s 5-bit flash ADCs for a total sampling rate of 10GS/s, corresponding to 1.45 samples per unit interval (UI). This sampling rate makes the sampling interval (SI) equal to 11/16 UI, which causes the sampling instants to span the full duration of a UI. A 4:16 DeMUX then feeds 16 samples at a time, corresponding to 11 UIs, to the digital CDR. The PD estimates the instantaneous zero-crossing phase, φX[1:16], for every UI, using a scheme we describe later. We use φX to recover the average zero-crossing phase, φAVG, in two steps. First, the phase subtractor generates the phase error, φERR, with a modulo subtraction of φAVG from φX, bounding φERR within [-0.5; 0.5) UI. Then, φERR is fed into a third-order low-pass filter to recover φAVG. The filter consists of three discrete-time integrators with programmable gains, K1, K2, and K3, that control the CDR's jitter-tracking bandwidth. The data decision block picks one sliced sample per UI as the recovered data by comparing φX[n] and φAVG, and marks duplicate samples, present in some UIs due to the FSR, as invalid samples. We remove these invalid samples from the data-decision vector, Ŝ[1:16], with a vector compactor (described later), which outputs 11 data bits, D[1:11]. For measurement purposes, we retime the recovered data from the blind-sampling clock domain to the baud-rate clock domain, fB/16, using a FIFO.
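The modulo subtraction performed by the phase subtractor can be illustrated with a minimal floating-point sketch. The function name is hypothetical and the hardware operates on fixed-point phase words, so this only shows the wrapping behavior, assuming phases are expressed in UI.

```python
def wrap_phase_error(phi_x, phi_avg):
    """Return phi_x - phi_avg wrapped into the interval [-0.5, 0.5) UI.

    Python's % with a positive modulus always returns a non-negative
    result, so shifting by 0.5 before and after gives a signed wrap.
    """
    return (phi_x - phi_avg + 0.5) % 1.0 - 0.5


# A zero crossing at 0.9 UI against an average of 0.1 UI is closer
# going backwards (-0.2 UI) than forwards (+0.8 UI).
err_back = wrap_phase_error(0.9, 0.1)
err_fwd = wrap_phase_error(0.1, 0.9)
err_edge = wrap_phase_error(0.75, 0.25)
print(err_back, err_fwd, err_edge)
```

Wrapping keeps the error fed to the low-pass filter bounded even when φX and φAVG sit on opposite sides of a UI boundary.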
The PD, shown in Fig. 8.6.3, consists of an average-transition-slope calculator and a data-phase calculator. From the 16 samples at its input, the PD linearly estimates φX for every pair of adjacent samples with opposite polarities. This linear phase estimation proves sufficient when there is enough ISI in the channel. Otherwise, an anti-aliasing filter needs to precede the sampling ADC. The PD uses the transition slope between the samples to estimate the phase. As shown in Fig. 8.6.3, due to the FSR some slopes lead to small errors in φX, while others lead to larger errors. To calculate a running average of slopes, we select only those slopes that lead to low φX error. When two transitions occur around one sample (top waveform in Fig. 8.6.3), only the transition with the higher slope contributes to the average (S[n] to S[n+1]). When such a comparison is impossible (bottom waveform in Fig. 8.6.3), a slope contributes to the average only if both its samples exceed a threshold level, VTH, that is extracted from sample magnitudes. Since the time between adjacent samples is constant, finding the slope simplifies to a sum of sample magnitudes; the slope calculator thus outputs (|S[n]|+|S[n+1]|)AVG. The phase calculator, shown in Fig. 8.6.3, estimates the zero-crossing time, φZC[n] (in units of SI), as the ratio |S[n]|/(|S[n]|+|S[n+1]|). To maintain low circuit complexity, the accuracy of φZC[n] is limited to 2 bits. For transitions with low-error slopes, we use an instantaneous sum in the 2-bit φZC[n] calculation, while for transitions with high-error slopes we use the average sum. We then convert φZC[n] from SI to UI using φX[n]=TS[n]+SI·φZC[n], where TS[n] is the time stamp, i.e., the sample's position in the UI. Our choice of sampling rate causes TS[n] to repeat every 16 samples. Since φZC[n] is only 2-bit accurate, we convert φZC[n] to φX[n] using a selector with constant inputs, as shown in Fig. 8.6.3.
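The phase calculation described above can be sketched behaviorally as follows. This is an illustration under stated assumptions, not the circuit: it uses only the instantaneous sum (the average-sum path for high-error slopes is omitted), it assumes a uniform four-level quantizer whose outputs are the selector constants 1/8, 3/8, 5/8 and 7/8 of an SI labeled in Fig. 8.6.3, and it assumes TS[1] = 0. All function names are hypothetical.

```python
from fractions import Fraction

SI = Fraction(11, 16)  # sampling interval in UI (from the text)

# Selector constants labeled in Fig. 8.6.3: 2-bit code -> fraction of one SI.
ZC_LEVELS = [Fraction(1, 8), Fraction(3, 8), Fraction(5, 8), Fraction(7, 8)]


def time_stamp(n):
    """Position of sample n (1-based) within its UI; assumes TS[1] = 0."""
    return ((n - 1) * SI) % 1


def zero_crossing_code(s_n, s_n1):
    """2-bit code for |S[n]| / (|S[n]| + |S[n+1]|).

    Inputs are signed integer ADC codes of opposite polarity.
    Assumption: the ratio is quantized uniformly into four bins.
    """
    if s_n * s_n1 >= 0:
        raise ValueError("samples must have opposite polarities")
    ratio = Fraction(abs(s_n), abs(s_n) + abs(s_n1))
    return int(ratio * 4)  # ratio < 1, so the code is always 0..3


def phase_estimate(n, s_n, s_n1):
    """phi_X[n] = TS[n] + SI * phi_ZC[n], with phi_ZC taken from the selector."""
    return time_stamp(n) + SI * ZC_LEVELS[zero_crossing_code(s_n, s_n1)]


print(zero_crossing_code(5, -2))  # ratio 5/7 falls in the third bin
print(phase_estimate(1, 5, -2))   # 0 + 11/16 * 5/8
```

Because both TS[n] and the selector outputs are constants, the multiplication by SI never has to be computed at run time, which is the point of the selector with constant inputs.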
The data decision block, shown in Fig. 8.6.2, picks one sliced sample per UI by comparing φX[n], φAVG, and TS[n]. This block also marks the duplicate samples by setting their valid flags (VF[n]) to '0'. To remove these duplicates, whose positions are unknown a priori, we use the vector compactor presented in Fig. 8.6.4. It accepts 16 sliced samples, Ŝ[n], with their VF[n] and produces 11 data bits, D[k], such that every UI corresponds to a single data bit. The compactor consists of an array of conditional data selectors, which pass data bits either from the left or from top to bottom according to the state of the enable signals. The rows with VF[n]='0' (shaded) pass the data from top to bottom. As a result, the output data vector is free of duplicate samples. To reduce area and power, we eliminate the cells that only pass data from top to bottom. With this, the compactor reduces to only 33 cells instead of a full 176-cell array, resulting in a single-cycle compaction.
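Functionally, the compactor drops every sample whose valid flag is '0' and keeps the rest in order. The sketch below models only that input/output behavior; it does not model the selector-cell array or the 33-cell reduction, and the function name and example vectors are made up for illustration.

```python
def compact(sliced, valid):
    """Return the sliced samples whose valid flag is 1, preserving order.

    Behavioral model only: the hardware does this in one cycle with an
    array of conditional data selectors, which is not modeled here.
    """
    if len(sliced) != len(valid):
        raise ValueError("sliced samples and valid flags must have equal length")
    return [bit for bit, flag in zip(sliced, valid) if flag]


# Hypothetical 16-sample frame: five samples repeat the bit before them
# and are flagged invalid, leaving 11 data bits.
sliced_example = [1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1]
valid_example = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
data_example = compact(sliced_example, valid_example)
print(len(data_example), data_example)
```

With 16 samples spanning 11 UIs, five samples per frame are duplicates in the nominal case, which is why the output width is 11.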
To experimentally verify the ADC functionality at the FSR, we sampled a 6.875Gb/s 2^7−1 PRBS signal at 10GS/s and captured the DeMUXed ADC samples. We then assigned a constant TS[n] to every channel of the DeMUX according to the channel number, n. Finally, we arranged the samples along the time axis in ascending order of their TS[n]. Figure 8.6.5 presents the resulting measured eye diagram reconstructed from 516800 ADC samples. For every vertical slice of the eye diagram, we annotate the DeMUX channel number and the corresponding time stamp. An open eye at the ADC output confirms that the ADC is functional and that error-free data recovery is possible in the FSR CDR.
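The reordering used to build the eye diagram can be sketched as below, assuming channel 1 sits at TS = 0 and each subsequent channel advances by one SI of 11/16 UI (modulo one UI). The function names are illustrative; the resulting channel order agrees with the annotation along the top of Fig. 8.6.5.

```python
from fractions import Fraction

SI = Fraction(11, 16)  # sampling interval in UI (from the text)


def channel_time_stamps(num_channels=16):
    """Map DeMUX channel n (1-based) to its time stamp; assumes TS[1] = 0."""
    return {n: ((n - 1) * SI) % 1 for n in range(1, num_channels + 1)}


def channels_in_eye_order(num_channels=16):
    """Channel numbers sorted by ascending time stamp."""
    stamps = channel_time_stamps(num_channels)
    return sorted(stamps, key=stamps.get)


def reconstruct_eye(frames):
    """Fold captured 16-sample frames into (time stamp, value) points."""
    stamps = channel_time_stamps()
    points = [(stamps[n], value)
              for frame in frames
              for n, value in enumerate(frame, start=1)]
    return sorted(points, key=lambda point: point[0])


order = channels_in_eye_order()
print(order)
```

Since 11 and 16 are coprime, the 16 channels land on 16 distinct positions spaced 1/16 UI apart, so the blind samples cover the whole UI.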
Figure 8.6.6 presents the simulated and measured jitter tolerances of the FSR CDR. Both simulations and measurements were performed with a 6.875Gb/s 2^7−1 PRBS input and a sampling rate of 10GS/s. We used an event-driven model [5] in Simulink to simulate the CDR. Our simulations show that the CDR tolerates up to 0.5UIPP of sinusoidal jitter at high frequencies (simulated for 2×10^5 UIs, no random jitter at TX and RX). To validate our simulated results, we fabricated and characterized the FSR CDR in 65nm CMOS. The inset in Fig. 8.6.6 presents a measured eye diagram at the receiver input. In addition to the 16ps (0.11UIPP) of jitter already present because of the setup, we applied sinusoidal jitter from 50kHz to 8MHz (the range was limited by the available equipment) to measure the jitter tolerance of the CDR. We generated the receiver input with a Centellax PRBS board and recorded the jitter tolerance at BER=10^-12. Our measurements closely match the simulation results and confirm that the FSR CDR tolerates 0.3UIPP of high-frequency sinusoidal jitter. The CDR tolerates up to 49MHz (0.98%) of frequency offset (with BER<10^-12) between the transmitter and receiver beyond the nominal offset due to the FSR.
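The unit conversions quoted above can be checked with a few lines. The jitter conversion uses only the data rate from the text. The reference frequency for the 0.98% figure is not stated in the text; the sketch assumes it is the 5 GHz sampling clock labeled in Fig. 8.6.2, which happens to reproduce the quoted percentage.

```python
BAUD_RATE_HZ = 6.875e9       # data rate (from the text)
SETUP_JITTER_PS = 16.0       # setup jitter (from the text)
FREQ_OFFSET_HZ = 49e6        # tolerated frequency offset (from the text)
ASSUMED_REF_CLOCK_HZ = 5e9   # assumption: 5 GHz sampling clock as reference

ui_ps = 1e12 / BAUD_RATE_HZ                # one unit interval in picoseconds
setup_jitter_ui = SETUP_JITTER_PS / ui_ps  # setup jitter expressed in UI
offset_percent = FREQ_OFFSET_HZ / ASSUMED_REF_CLOCK_HZ * 100

print(f"UI = {ui_ps:.2f} ps")
print(f"setup jitter = {setup_jitter_ui:.2f} UI")
print(f"frequency offset = {offset_percent:.2f}%")
```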
Figure 8.6.7 shows a die photo of the test chip. The ADC, DeMUX and clock divider are analog custom-designed blocks while the CDR and test structures were synthesized. The CDR consists of 75644 gates and consumes 58.8mW while the ADC consumes 116.4mW. The FSR CDR reduces the ADC power by 27.3%, in comparison with a 2× feed-forward architecture, at the cost of doubling the gate count; however, the power per Gb/s of data rate and the total receiver area are reduced by 12.5%. The receiver occupies 0.3683mm^2.
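As a consistency check on these numbers, the sketch below sums the block dimensions labeled on the die photo in Fig. 8.6.7 and computes the receiver power per Gb/s. The dimensions are read as micrometers, an assumption about the extracted labels that is supported by the sum matching the stated receiver area. The dictionary keys are descriptive names chosen here.

```python
# Block dimensions labeled in Fig. 8.6.7, assumed to be in micrometers.
BLOCK_DIMENSIONS_UM = {
    "flash ADCs (4 channels)": (400, 490),
    "4:16 DeMUX": (60, 490),
    "input buffers": (50, 60),
    "CDR": (430, 270),
    "bias generator and clock divider": (170, 140),
}

ADC_POWER_MW = 116.4     # from the text
DIGITAL_POWER_MW = 58.8  # from the text
DATA_RATE_GBPS = 6.875   # from the text

total_area_um2 = sum(w * h for w, h in BLOCK_DIMENSIONS_UM.values())
total_area_mm2 = total_area_um2 / 1e6
power_per_gbps = (ADC_POWER_MW + DIGITAL_POWER_MW) / DATA_RATE_GBPS

print(f"receiver area = {total_area_mm2:.4f} mm^2")
print(f"receiver power = {power_per_gbps:.2f} mW per Gb/s")
```

The five labeled blocks add up to the 0.3683mm^2 quoted for the receiver.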
Acknowledgment:
We thank Chihiro Sannomiya for assistance with test-chip design and verification.
References:
[1] O. Agazzi et al., “A 90nm CMOS DSP MLSD Transceiver with Integrated AFE for Electronic Dispersion Compensation of Multi-mode Optical Fibers at 10Gb/s,” ISSCC Dig. of Tech. Papers, pp. 232-233, Feb. 2008.
[2] M. Harwood et al., “A 12.5Gb/s SerDes in 65nm CMOS Using a Baud-Rate ADC with Digital Receiver Equalization and Clock Recovery,” ISSCC Dig. of Tech. Papers, pp. 436-437, Feb. 2007.
[3] F.M. Gardner, “Interpolation in digital modems – Part I: Fundamentals,” IEEE Tran. on Communications, Vol. 41, Issue 3, pp. 501-507, Mar. 1993.
[4] M. Spurbeck et al., “Interpolated timing recovery for hard disk drive read channels,” IEEE Intern. Conf. on Communications, Vol. 3, pp. 1618-1624, Jun. 1997.
[5] M. van Ierssel et al., “Event-Driven Modeling of CDR Jitter Induced by Power-Supply Noise, Finite Decision-Circuit Bandwidth, and Channel ISI,” IEEE Tran. on Circuits and Systems I, Vol. 55, Issue 5, pp. 1306-1315, Jun. 2008.
978-1-4244-6034-2/10/$26.00 ©2010 IEEE
DIGEST OF TECHNICAL PAPERS • 167
ISSCC 2010 / February 9, 2010 / 11:15 AM
Figure 8.6.1: Blind-Sampling ADC-Based CDR Architectures.
Panels: "Previous Work: 2x Interpolating Feedback CDR" (RX, EQ, ADC with 2x sampling clock, interpolator producing interpolated samples, phase detector, loop filter, interpolation index updater, data decision, recovered data) and "This Work: Fractional-Sampling-Rate (FSR) Feed-Forward CDR" (RX, EQ, ADC with 1.45x sampling clock, phase detector, filter, data decision, recovered data).

Figure 8.6.2: Fractional-Sampling-Rate (FSR) CDR Block Diagram.
Labeled blocks: 6.875 Gb/s RX input, 5-bit 2.5GS/s ADCs, 5 GHz 2-phase sampling clock, 2.5 GHz 4-phase clock, and the digital CDR containing the PD, phase subtractor (modulo subtraction applied to φX[1] through φX[16]), low-pass filter, data decision, vector compactor, and FIFO with fB/16 retiming clock driving DOUT. Labeled signals: S[1:16], φX[1:16], φERR, φAVG, Ŝ[1:16], D[1:11]. Low-pass filter stages: K1·z^-1/(1 − z^-1), K2·z^-1/(1 − z^-1), K3·z^-1/(1 − z^-1).

Figure 8.6.3: Phase-Detector Implementation.
Labeled blocks: average-transition-slope calculator with output (|S[n]| + |S[n+1]|)AVG, and data-phase calculator with output φX[1:16], both fed by the digital samples S[1:16].
Selector mapping from the 2-bit φZC[n] (in SI, with SI = 11/16 UI) to φX[n] (in UI):
00: TS[n] + SI·1/8
01: TS[n] + SI·3/8
10: TS[n] + SI·5/8
11: TS[n] + SI·7/8
Waveform annotations: |S[n]| > |S[n+2]|: small φX[n] error, include in average slope. |S[n+2]| < |S[n]|: large φX[n+1] error, exclude from average slope. |S[n]| > |VTH| AND |S[n+1]| > |VTH|: include in average slope.

Figure 8.6.4: Vector-Compactor Implementation.
This block accepts 16 sliced samples, Ŝ[n], with their valid flags, VF[n] (on the left), and produces 11 data bits, D[k] (at the bottom). Cell ports: DT_IN_T, EN_IN_T, DT_IN_L, EN_IN_L, EN_OUT_R, DT_OUT_B, EN_OUT_B.

Figure 8.6.5: Measured Eye Diagram at ADC Output (reconstructed from 5×10^5 ADC samples).
Axes: Time, UI (0 to 1) versus ADC Sample Value (2 to 30).
DeMUX channel, n, and time stamp, TS[n], for each vertical slice:
n = 1: 0/16
n = 4: 1/16
n = 7: 2/16
n = 10: 3/16
n = 13: 4/16
n = 16: 5/16
n = 3: 6/16
n = 6: 7/16
n = 9: 8/16
n = 12: 9/16
n = 15: 10/16
n = 2: 11/16
n = 5: 12/16
n = 8: 13/16
n = 11: 14/16
n = 14: 15/16

Figure 8.6.6: Measured and Simulated Jitter Tolerance.
Axes: Jitter Frequency, Hz (10^3 to 10^8) versus Jitter Amplitude, UIPP (0.1 to 1000). Curves: Simulated, Measured. Inset: Sample Input Eye Diagram (no sinusoidal jitter), annotated 500mV and 16 ps (0.11UI).
ISSCC 2010 PAPER CONTINUATIONS
Figure 8.6.7: Die Photograph.
Chip summary:
Process: 65 nm CMOS
Supply: 1.2 V
Data Rate: 6.875 Gb/s
ADC Power: 116.4 mW
Digital Power: 58.8 mW
Receiver Area: 0.3683 mm^2
Labeled blocks: 4-channel 2.5GS/s flash ADCs (400×490 µm^2), 4:16 DeMUX (60×490 µm^2), input buffers (50×60 µm^2), CDR (430×270 µm^2), bias generator and clock divider (170×140 µm^2), synthesized logic, test structures, output buffers. Dimension marking: 1900 µm.