
Recap: Propagation delay CMOS Technology for Computer ...users.ics.forth.gr/~jacob/cmos4arch/handouts/lec4_4pp.pdf · CMOS Technology for Computer Architects 4 ... Spring 2012 –Lecture


CMOS Technology

for Computer Architects

Lecture 4: Static Complementary CMOS

Iakovos Mavroidis

Giorgos Passas

Manolis Katevenis

FORTH-ICS (University of Crete)

2 CMOS Technology for Computer Architects

Spring 2012 – Lecture 4

Recap: Propagation delay

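$$ t_{HL} = \frac{C_L\,\Delta V}{I_{AV}} $$

Based on previous calculations (VT impacts time):

$$ t_{HL} = \frac{2\,L\,C_L\,V_{DD}}{k'_n\,W_n\,(V_{DD} - V_{Tn})^2} $$

When VDD >> VTn:

$$ t_{HL} \approx \frac{2\,L\,C_L}{k'_n\,W_n\,V_{DD}}, \qquad t_{LH} \approx \frac{2\,L\,C_L}{k'_p\,W_p\,V_{DD}} $$

Linear dependence on L/W, CL and 1/VDD

$$ t_{LH} = t_{HL} \;\Rightarrow\; \frac{W_p}{W_n} \approx \frac{k'_n}{k'_p}, \qquad W_p \approx 3\,W_n $$

Same result for good noise margins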


Recap: RC Model for Inverter (Self Load)

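[Figure: inverter pull-down modeled as a resistor Rn discharging CL; nodes VDD, Vin, Vout]

$$ V_{out}(t) = \left(1 - e^{-t/RC}\right) V $$

$$ T_{50\%} = 0.69\,R_n C_L, \qquad T_{90\%} = 2.2\,R_n C_L $$

$$ R_n = \frac{R_0}{W}, \qquad C_L = C_{int(min)}\,W + C_{ext} $$

With R0 = Rmin (the resistance of the minimum-size transistor):

$$ T_{50\%} = 0.69\left(R_{min}\,C_{int(min)} + \frac{R_{min}\,C_{ext}}{W}\right) $$

When W >> 1 the delay is almost constant, but Cgate becomes large:

$$ T_{50\%} \approx 0.69\,R_{min}\,C_{int(min)} $$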
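The self-load behaviour can be checked numerically. The sketch below evaluates T50% = 0.69 Rn CL with Rn = Rmin / W and CL = Cint(min) W + Cext for a few widths. The function name `t50` and all component values are hypothetical, chosen only to make the trend visible.

```python
def t50(width, r_min, c_int_min, c_ext):
    """50% delay of an inverter of relative width `width` (simple RC model)."""
    r_n = r_min / width                   # Rn = Rmin / W
    c_load = c_int_min * width + c_ext    # CL = Cint(min) * W + Cext
    return 0.69 * r_n * c_load


# Hypothetical values, for illustration only (not taken from the lecture).
R_MIN = 10e3        # ohms
C_INT_MIN = 1e-15   # farads
C_EXT = 20e-15      # farads

# Lower bound reached when the self-load dominates (W >> 1).
floor = 0.69 * R_MIN * C_INT_MIN

for w in (1, 2, 4, 8, 16, 64):
    delay_ps = t50(w, R_MIN, C_INT_MIN, C_EXT) * 1e12
    print(f"W = {w:3d}: T50 = {delay_ps:6.1f} ps")
print(f"floor  : T50 = {floor * 1e12:6.1f} ps")
```

Widening the transistor only removes the Rmin Cext / W term; the Rmin Cint(min) term is untouched, which is why the delay flattens out.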


Recap: Cycle Time vs FO4

FO4 for IBM90nm ≈ 40ps (2002)

FO4 for 22nm ≈ 7ps (2011)

2.5 GHz clock = 400 ps clock cycle

Clock cycle in 2002 ≈ 10 FO4

Clock cycle in 2011 ≈ 57 FO4 (5+ times more)!

Fewer pipeline stages? More complex stages?

Fewer layout engineers? Fewer timing violations? Rely on slower CAD tools?

Decrease Voltage? Increase threshold voltage?
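The FO4 counts above follow directly from dividing the clock period by the FO4 delay. A minimal sketch of that arithmetic (the helper name `fo4_per_cycle` is made up for this example):

```python
def fo4_per_cycle(clock_hz, fo4_seconds):
    """Number of FO4 inverter delays that fit in one clock period."""
    return (1.0 / clock_hz) / fo4_seconds


cycle_2002 = fo4_per_cycle(2.5e9, 40e-12)  # IBM 90nm, FO4 ≈ 40 ps
cycle_2011 = fo4_per_cycle(2.5e9, 7e-12)   # 22nm, FO4 ≈ 7 ps

print(f"2002: {cycle_2002:.1f} FO4 per cycle")
print(f"2011: {cycle_2011:.1f} FO4 per cycle")
print(f"ratio: {cycle_2011 / cycle_2002:.2f}x")
```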


Recap: Two Inverters Layout


Lecture Contents

Static CMOS Logic

PUN & PDN

Transistor Sizing

Logical Effort

Delay of a single gate


Combinational vs. Sequential Logic

Combinational: Output = f(In)

Sequential: Output = f(In, Previous In)

[Figure: a combinational logic circuit (In → Out), and a sequential circuit whose combinational logic also takes a fed-back State]


Static Logic and Complementary CMOS

At every point in time (except during the switching transients) each gate output is connected to either VDD or VSS via a low-resistance path

This is in contrast to the dynamic circuit style

The outputs of the gates assume at all times the value of the Boolean function implemented by the circuit

Complementary CMOS gates

High noise margins

No static power consumption

Comparable rise and fall times


Static Complementary CMOS

PUN and PDN are dual logic networks

PUN and PDN functions are complementary

Required transistors = 2N

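[Figure: inputs In1 … InN drive both the PUN (between VDD and the output) and the PDN (between the output and GND); the output is F(In1,In2,…InN)]

PUN: PMOS only. Pull-up: makes a connection from VDD to F when F(In1,In2,…InN) = 1

PDN: NMOS only. Pull-down: makes a connection from F to GND when F(In1,In2,…InN) = 0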


Threshold Drops

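[Figure: a single transistor charging or discharging CL in four configurations, with source (S), drain (D) and VGS marked]

PDN, NMOS discharging CL: output goes VDD → 0 (full swing)

PUN, PMOS charging CL: output goes 0 → VDD (full swing)

NMOS charging CL: output goes 0 → VDD - VTn (threshold drop)

PMOS discharging CL: output goes VDD → |VTp| (threshold drop)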


NMOS Transistors in Series/Parallel Connection

A transistor can be thought of as a switch

NMOS switch closes when switch control input is high

Series (A • B): Y = X if A AND B

Parallel (A + B): Y = X if A OR B

NMOS transistors pass a “strong” 0 but a “weak” 1

[Figure: two NMOS transistors in series and in parallel between X and Y, gates driven by A and B]


PMOS Transistors in Series/Parallel Connection

PMOS switch closes when switch control input is low

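Series (NOR): Y = X if A’ AND B’ = (A + B)’

Parallel (NAND): Y = X if A’ OR B’ = (A • B)’

PMOS transistors pass a “strong” 1 but a “weak” 0

[Figure: two PMOS transistors in series and in parallel between X and Y, gates driven by A and B]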


Dual PUN and PDN

PUN is the dual of PDN

(can be shown using De Morgan’s theorems)

(A • B)’ = A’ + B’   serial PDN = parallel PUN

(A + B)’ = A’ • B’   parallel PDN = serial PUN

Static CMOS gates are always inverting

AND = NAND + INV
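The duality can be checked exhaustively with a switch-level model. The sketch below treats each transistor as an ideal switch and confirms that, for NAND2 and NOR2, exactly one of the two networks conducts for every input combination. The helper names are invented for this example, and threshold drops are ignored.

```python
from itertools import product


def nmos(gate):
    """An NMOS switch closes when its control input is high."""
    return gate == 1


def pmos(gate):
    """A PMOS switch closes when its control input is low."""
    return gate == 0


def nand2_networks(a, b):
    pdn = nmos(a) and nmos(b)   # series NMOS
    pun = pmos(a) or pmos(b)    # parallel PMOS (the dual)
    return pun, pdn


def nor2_networks(a, b):
    pdn = nmos(a) or nmos(b)    # parallel NMOS
    pun = pmos(a) and pmos(b)   # series PMOS (the dual)
    return pun, pdn


def output(pun, pdn):
    # Complementary networks: exactly one of them must conduct.
    assert pun != pdn, "PUN and PDN are not complementary"
    return 1 if pun else 0


nand_table = {(a, b): output(*nand2_networks(a, b))
              for a, b in product((0, 1), repeat=2)}
nor_table = {(a, b): output(*nor2_networks(a, b))
             for a, b in product((0, 1), repeat=2)}

print("NAND2:", nand_table)
print("NOR2: ", nor_table)
```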


CMOS NAND


CMOS NOR


Building CMOS Gate

Construct the PDN: write F in the form F = (G(In1, In2, In3, …))’

Example: F = A + BC

A + BC = ((A+BC)’)’ = (A’(BC)’)’ = (A’(B’ + C’))’

i.e. G = A’(B’+C’)

[Figure: PDN with A’ in series with the parallel pair B’, C’]

Construct the dual PUN
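The rewriting step in the example can be verified by brute force. This short check, written for illustration only, confirms that F = A + BC equals the complement of G = A’(B’ + C’) for all eight input combinations:

```python
from itertools import product


def f(a, b, c):
    """Target function F = A + BC."""
    return a or (b and c)


def g(a, b, c):
    """PDN conduction condition G = A'(B' + C')."""
    return (not a) and ((not b) or (not c))


# The gate output is the complement of G, so F must equal not G everywhere.
mismatches = [(a, b, c)
              for a, b, c in product((False, True), repeat=3)
              if f(a, b, c) != (not g(a, b, c))]

print("mismatches:", mismatches)
```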


Complex CMOS Gate

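OUT = !(D + A • (B + C))

[Figure: PDN with D in parallel with the series combination of A and (B parallel C); the PUN is the dual network, with D in series with the parallel combination of A and (B in series with C)]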


Stick Diagrams

Contains no dimensions

Represents relative positions of transistors

[Figure: stick diagrams of an inverter (In, Out, VDD, GND) and of a NAND2 (A, B, Out, VDD, GND)]


Consistent Euler Path

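X = !(C • (A + B))

[Figure: transistor schematic with internal nodes i (PDN) and j (PUN), and the corresponding logic graph: PUN edges between VDD and X, PDN edges between X and GND. The input ordering A B C is a consistent Euler path through both graphs.]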
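A consistent Euler path can be found by brute force for small gates. The sketch below models each transistor as a graph edge labelled by its input and searches for input orderings that walk every edge exactly once in both the PUN and the PDN. The wiring of nodes i and j is an assumption based on X = !(C • (A + B)), with the PDN as C in series with (A parallel B) and the PUN as its dual.

```python
from itertools import permutations

# Each transistor is an edge (node, node) keyed by its gate input.
# Assumed wiring: PDN = C in series with (A || B); PUN = C || (A in series with B).
PDN = {"A": ("i", "GND"), "B": ("i", "GND"), "C": ("X", "i")}
PUN = {"A": ("VDD", "j"), "B": ("j", "X"), "C": ("VDD", "X")}


def is_path(order, edges):
    """True if the edges can be walked in this order, each exactly once."""
    for start in edges[order[0]]:
        node = start
        for name in order:
            u, v = edges[name]
            if node == u:
                node = v
            elif node == v:
                node = u
            else:
                break           # this edge does not touch the current node
        else:
            return True         # walked every edge without breaking
    return False


def consistent_orders(pun, pdn):
    """Input orderings that are Euler paths in both networks."""
    return [order for order in permutations(sorted(pdn))
            if is_path(order, pun) and is_path(order, pdn)]


orders = consistent_orders(PUN, PDN)
print(orders)
```

An ordering found this way lets the gate be laid out with unbroken diffusion strips, which is why it is worth searching for one.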


OAI22 Logic Graph

X = !((A+B)•(C+D))

[Figure: OAI22 transistor schematic and logic graph (PUN edges between VDD and X, PDN edges between X and GND); input ordering A B C D]


OAI22 Layout

[Figure: OAI22 layout between the VDD and GND rails, with inputs A, B, C, D and output X]

Some functions have no consistent Euler path, e.g. x = !(a + bc + de) (but x = !(bc + a + de) does!)


Delay Dependence on Input Pattern

[Plot: output voltage (V) versus time (psec) of a 2-input NAND gate driving CL, for three input patterns: A=B=1→0; A=1, B=1→0; A=1→0, B=1]

NMOS = 0.5 μm/0.25 μm, PMOS = 0.75 μm/0.25 μm, CL = 10 fF

Input Data Pattern    Delay (psec)
A = B = 1→0           35
A = 1, B = 1→0        76
A = 1→0, B = 1        57


Transistor Sizing

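[Figure: inverter, 2-input NAND and 2-input NOR schematics annotated with transistor widths and equivalent resistances]

Inverter: NMOS 1, PMOS 2 (Rn = Rp)

2-input NAND: series NMOS 2 each (Rn/2 each), parallel PMOS 2 each (Rp each)

2-input NOR: parallel NMOS 1 each (Rn each), series PMOS 4 each (Rp/2 each)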
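The sizing rule behind these widths is that the worst-case pull-up and pull-down resistances should match those of the reference inverter (NMOS 1, PMOS 2). The sketch below checks that for the NAND2 (all widths 2) and NOR2 (NMOS 1, PMOS 4) sizings. It assumes resistance is inversely proportional to width and that a unit-width PMOS is twice as resistive as a unit-width NMOS; both are modelling assumptions, and the names are invented for this example.

```python
R_UNIT_N = 1.0   # resistance of a width-1 NMOS (normalized)
R_UNIT_P = 2.0   # assumed: a width-1 PMOS is 2x as resistive


def r_n(width):
    return R_UNIT_N / width


def r_p(width):
    return R_UNIT_P / width


def series(*resistances):
    return sum(resistances)


# Worst case = the single most resistive conducting path.
# Parallel devices: worst case is only one of them being on.
inverter = (r_p(2), r_n(1))                       # (pull-up, pull-down)
nand2 = (r_p(2), series(r_n(2), r_n(2)))          # parallel PMOS, series NMOS
nor2 = (series(r_p(4), r_p(4)), r_n(1))           # series PMOS, parallel NMOS

for name, (up, down) in (("inverter", inverter), ("nand2", nand2), ("nor2", nor2)):
    print(f"{name:8s} worst-case pull-up = {up}, pull-down = {down}")
```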


Transistor Sizing a Complex CMOS Gate

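OUT = !(D + A • (B + C))

[Figure: PUN and PDN of the gate with inputs A, B, C, D, annotated with transistor widths; values shown: 1, 2, 2, 2; 2, 2, 4, 4; 6, 6, 12, 12]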


tp as a Function of Fan-In and Fan-Out

Complementary CMOS gates are not so fast:

Fan-in: quadratic dependence, due to increasing resistance and capacitance

Fan-out: each additional fan-out gate adds two gate capacitances to CL

$$ t_p = a_1\,FI + a_2\,FI^2 + a_3\,FO $$

Gates with FI ≥ 4 must be avoided
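To see why large fan-in is penalized, the model tp = a1 FI + a2 FI² + a3 FO can be tabulated. The coefficients below are hypothetical placeholders rather than fitted values from the lecture; only the shape of the curve matters here.

```python
def t_p(fan_in, fan_out, a1, a2, a3):
    """Propagation delay model: linear and quadratic in fan-in, linear in fan-out."""
    return a1 * fan_in + a2 * fan_in ** 2 + a3 * fan_out


# Hypothetical coefficients in picoseconds (illustration only).
A1, A2, A3 = 10.0, 5.0, 8.0

delays = {fi: t_p(fi, 1, A1, A2, A3) for fi in range(1, 9)}

for fi, d in delays.items():
    print(f"FI = {fi}: tp = {d:6.1f} ps")
```

Each extra input costs more than the previous one, which is the motivation for splitting wide gates into several narrower stages.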


Fast Complex Gates: Design Technique 1

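Transistor ordering: put the transistor driven by the critical-path (latest-arriving) input closest to the output

[Figure, left: series stack M1, M2, M3 with internal capacitances C1, C2 and load CL. The critical-path input In1 (0→1) drives M1, farthest from the output; In2 = In3 = 1, so CL, C1 and C2 are all charged. Delay is determined by the time to discharge CL, C1 and C2.]

[Figure, right: same stack with the input order reversed. The critical-path input In1 (0→1) drives M3, next to the output; C1 and C2 are already discharged and only CL is charged. Delay is determined by the time to discharge CL.]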


Fast Complex Gates: Design Technique 2

Transistor sizing

as long as fan-out capacitance dominates

Progressive sizing

[Figure: stack of series transistors M1 … MN with inputs In1 … InN, internal node capacitances C1, C2, C3 and load CL at the output]

Distributed RC line (Elmore delay):

tpHL = 0.69 (R1C1+(R1+R2)C2+(R1+R2+R3)C3+… )

M1 > M2 > M3 > … > MN

Can reduce delay by more than 20%
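The Elmore expression can be evaluated directly to see why M1 should be the widest device: R1 appears in every term, so lowering it helps most. The sketch below compares a uniform stack with a progressively sized one of the same total resistance. All values are normalized and hypothetical, and the node capacitances are held fixed, which ignores the extra intrinsic capacitance that wider transistors add.

```python
def elmore_tphl(resistances, capacitances):
    """tpHL = 0.69 * (R1*C1 + (R1+R2)*C2 + (R1+R2+R3)*C3 + ...)."""
    total = 0.0
    r_path = 0.0
    for r, c in zip(resistances, capacitances):
        r_path += r            # resistance from this node down to ground
        total += r_path * c
    return 0.69 * total


# Normalized, hypothetical values; index 0 is M1 (nearest ground).
caps = [1.0, 1.0, 1.0]
uniform = elmore_tphl([1.0, 1.0, 1.0], caps)
progressive = elmore_tphl([0.5, 1.0, 1.5], caps)   # R1 < R2 < R3

print(f"uniform:     {uniform:.2f}")
print(f"progressive: {progressive:.2f}")
print(f"reduction:   {100 * (1 - progressive / uniform):.1f}%")
```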


Fast Complex Gates: Design Technique 2

R1 < R2 < R3

[Figure: progressively sized transistors (labels M1, M2, M4)]

However, this costs more area and adds intrinsic capacitance

Good technique if output capacitance is high


Fast Complex Gates: Design Technique 3

Reducing the voltage swing (not CMOS)

linear reduction in delay

also reduces power consumption

But the following gate is much slower!

Or requires use of “sense amplifiers” on the receiving end to restore the signal level (memory design)

$$ t_{pHL} = 0.69\,\frac{C_L\,(V_{DD}/2)}{I_{av}} = 0.69\,\frac{C_L\,(V_{swing}/2)}{I_{av}} $$


Fast Complex Gates: Design Technique 4

Isolating fan-in from fan-out using buffer insertion

[Figure: a NAND gate driving CL directly, and the same gate driving CL through buffers]

Is it better to drive a big capacitive load directly with the NAND gate, or after some buffering? See Logical Effort (next)


Fast Complex Gates: Design Technique 5

Alternative logic structures

F = ABCDEFGH

Which one is the best? see Logical Effort (next)


Delay Metric: “Intrinsic delay” of Inverter

[Figure: inverter (1W NMOS, 2W PMOS) with Vin = VDD, pull-down modeled as Rn; Cout = Cint = 3 W Cdunit (Cext = 0)]

$$ \tau = k\,R_{inv}\,C_{dinv} = k\,\frac{R_{unit}}{W}\,(3\,W\,C_{dunit}) = 3k\,R_{unit}\,C_{dunit} $$

Propagation delay of inverter ignoring external load


Delay of NAND Gate

τ = 3k Runit Cdunit

Cdnand = 6 W Cdunit = 2 Cdinv

Cgnand = 4 W Cgunit = (4/3) Cginv

Cgunit ≈ Cdunit

h = CL / Cgnand

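$$
\begin{aligned}
t_{pnand} &= k\,R\,(C_{dnand} + C_L) \\
&= k\,\frac{R_{unit}}{W}\,(6\,W\,C_{dunit} + C_L) \\
&= k\,\frac{R_{unit}\,C_{gnand}}{W}\left(\frac{6\,W\,C_{dunit}}{C_{gnand}} + \frac{C_L}{C_{gnand}}\right) \\
&= k\,(R_{unit}\,4\,C_{gunit})\left(\frac{6\,W\,C_{dunit}}{4\,W\,C_{gunit}} + h\right) \\
&= 3k\,(R_{unit}\,C_{gunit})\left(\frac{2\,C_{dunit}}{C_{gunit}} + \frac{4}{3}\,h\right) \\
&= 3k\,(R_{unit}\,C_{dunit})\left(2 + \frac{4}{3}\,h\,\frac{C_{gunit}}{C_{dunit}}\right) \\
&= \tau\left(2 + \frac{4}{3}\,h\right)
\end{aligned}
$$

[Figure: 2-input NAND with inputs A and B, all four transistors of width 2W, driving CL]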


Delay of NOR Gate

τ = 3k Runit Cdunit

Cdnor = 6 W Cdunit = 2 Cdinv

Cgnor = 5 W Cgunit = (5/3) Cginv

Cgunit ≈ Cdunit

h = CL / Cgnor

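$$
\begin{aligned}
t_{pnor} &= k\,R\,(C_{dnor} + C_L) \\
&= k\,\frac{R_{unit}}{W}\,(6\,W\,C_{dunit} + C_L) \\
&= k\,\frac{R_{unit}\,C_{gnor}}{W}\left(\frac{6\,W\,C_{dunit}}{C_{gnor}} + \frac{C_L}{C_{gnor}}\right) \\
&= k\,(R_{unit}\,5\,C_{gunit})\left(\frac{6\,W\,C_{dunit}}{5\,W\,C_{gunit}} + h\right) \\
&= 3k\,(R_{unit}\,C_{gunit})\left(\frac{2\,C_{dunit}}{C_{gunit}} + \frac{5}{3}\,h\right) \\
&= 3k\,(R_{unit}\,C_{dunit})\left(2 + \frac{5}{3}\,h\,\frac{C_{gunit}}{C_{dunit}}\right) \\
&= \tau\left(2 + \frac{5}{3}\,h\right)
\end{aligned}
$$

[Figure: 2-input NOR with inputs A and B, output A + B inverted, 1W NMOS and 4W PMOS transistors, driving CL]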
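The NAND and NOR derivations can be cross-checked numerically: the direct RC expression k R (Cd + CL) should agree with the closed form τ (p + g h) when Cgunit = Cdunit. The sketch below does that for the inverter, NAND2 and NOR2. The values of k, Runit, the unit capacitance and W are hypothetical, and the function names are invented for this example.

```python
import math

# Hypothetical technology numbers (illustration only).
K = 0.69
R_UNIT = 10e3      # ohms, unit-width transistor
C_UNIT = 1e-15     # farads; Cgunit ≈ Cdunit is assumed, as on the slide
W = 4.0            # sizing factor; should cancel out of the result

TAU = 3 * K * R_UNIT * C_UNIT   # τ = 3k Runit Cdunit


def delay_rc(drain_units, gate_units, h):
    """Direct form: k * (Runit / W) * (Cd + CL), with CL = h * Cg."""
    c_drain = drain_units * W * C_UNIT
    c_gate = gate_units * W * C_UNIT
    return K * (R_UNIT / W) * (c_drain + h * c_gate)


def delay_closed_form(p, g, h):
    """Closed form: τ (p + g h)."""
    return TAU * (p + g * h)


# name: (drain cap in W*Cdunit, gate cap in W*Cgunit, p, g)
GATES = {
    "inverter": (3, 3, 1.0, 1.0),
    "nand2": (6, 4, 2.0, 4.0 / 3.0),
    "nor2": (6, 5, 2.0, 5.0 / 3.0),
}

agree = all(
    math.isclose(delay_rc(cd, cg, h), delay_closed_form(p, g, h), rel_tol=1e-9)
    for cd, cg, p, g in GATES.values()
    for h in (0.0, 1.0, 4.0, 10.0)
)
print("RC form matches τ (p + g h):", agree)
```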


Logical Effort

tpgate = τ (p + g . h)

Measure everything in units of τ (divide by τ):

tpgate = p + g . h

p – intrinsic delay: gate parameter, not a function of W

g – logical effort: gate parameter, not a function of W

h – effective fanout

Normalize everything to an inverter:

ginv =1, pinv = 1

You can find this formula also as tpgate = p + LE . FO


Delay in a Logic Gate

Gate delay:

d = p + f   (p: intrinsic delay, f: effort delay)

Effort delay:

f = g h   (g: logical effort, h: electrical effort (effective fanout) = Cout/Cin)

Logical effort is a function of topology, independent of sizing

Electrical effort (effective fanout) is a function of load/gate size
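As a small worked example, the normalized delay d = p + g h can be computed for the three gates characterized in this lecture (inverter p = 1, g = 1; NAND2 p = 2, g = 4/3; NOR2 p = 2, g = 5/3). The capacitance values used to form h are hypothetical.

```python
# (p, g) per gate, in units of τ, as derived in this lecture.
GATES = {
    "inverter": (1.0, 1.0),
    "nand2": (2.0, 4.0 / 3.0),
    "nor2": (2.0, 5.0 / 3.0),
}


def electrical_effort(c_out, c_in):
    """h = Cout / Cin (effective fanout)."""
    return c_out / c_in


def gate_delay(p, g, h):
    """Normalized gate delay d = p + g h."""
    return p + g * h


# Hypothetical load: each gate drives four times its own input capacitance.
h = electrical_effort(c_out=12.0, c_in=3.0)

delays = {name: gate_delay(p, g, h) for name, (p, g) in GATES.items()}
for name, d in delays.items():
    print(f"{name:8s} d = {d:.2f} τ at h = {h:.0f}")
```

At equal electrical effort the NOR2 is always the slowest of the three, because its logical effort is the largest.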


Logical Effort (g)

Inverter has the smallest logical effort and intrinsic delay of all static CMOS gates

Logical effort of a gate is the ratio of its input capacitance to the input capacitance of an inverter sized to deliver the same current

Logical effort increases with the gate complexity

[Figure: inverter (NMOS 1, PMOS 2), 2-input NAND (all transistors 2) and 2-input NOR (NMOS 1, PMOS 4), sized to deliver the same current]


Calculating Logical Effort

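[Figure: inverter, 2-input NAND and 2-input NOR with the input capacitance Cinput marked]

Inverter: g = 1, 2-input NAND: g = 4/3, 2-input NOR: g = 5/3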


Logical Effort of Gates

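[Plot: normalized delay (τ) versus electrical effort (h), for h = 1 to 7, with lines tpinv, tpnand and tpnor. The intercept of each line is its intrinsic delay; the part that grows with h is the effort delay.]

Inverter: ginv = 1, pinv = 1, dinv = h+1

NAND: gnand = 4/3, pnand = 2, dnand = (4/3)h+2

NOR: gnor = 5/3, pnor = 2, dnor = (5/3)h+2

NOR is “slower” than NAND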