Computing with Leakage Currents

Nikhil Jayakumar, Kanupriya Gulati, Rajesh Garg and Sunil P. Khatri
ECE Department, Texas A&M University



Outline

  • Sub-threshold circuits: the opportunity
  • Challenges
    • Process/temperature/voltage variations
    • Energy minimization in sub-threshold circuits
    • Reclaiming the speed penalty
  • What's next?


Introduction

  • Power consumption has become a significant hurdle for recent ICs.
  • Higher power consumption leads to:
    • Shorter battery life
    • Higher on-chip temperatures, which reduce the operating life of the chip
  • There is a large and growing class of applications where power reduction, not speed, is paramount.
  • Such applications are ideal candidates for sub-threshold circuit design.
  • OK, so what is sub-threshold design?


Sub-threshold Leakage

  • As supply voltage scales down, the VT of the devices is scaled down as well.
  • A larger VT would reduce leakage but increase delay.
  • Leakage increases exponentially with decreasing VT.
  • Until a few process generations ago, leakage power was negligible compared to dynamic power.
  • But leakage power is now becoming comparable with dynamic power. Ouch (three times).
  • Can we turn this dilemma into an opportunity?

$$I_{ds}^{sub} = \frac{W}{L}\, I_o\, e^{\frac{V_{gs} - V_T - V_{off}}{n v_t}} \left(1 - e^{-\frac{V_{ds}}{v_t}}\right) \quad \text{when } V_{gs} < V_T$$


The Opportunity

           Traditional Ckt                   Sub-threshold Ckt (Vb = 0V)    Sub-threshold Ckt (Vb = VDD)
Process    Delay(ps)   Power(W)   P-D-P(J)   Delay     Power     P-D-P      Delay     Power     P-D-P
bsim70     14.157      4.08E-05   5.82E-07   17.01X    308.82X   18.50X     9.93X     141.10X   14.43X
bsim100    17.118      6.39E-05   1.08E-06   24.60X    497.54X   20.08X     12.00X    100.96X   8.20X

  • Compared a traditional circuit with a sub-threshold circuit (obtained by simply setting VDD < VT).
  • Performed simulations for 2 different processes on a 21-stage ring oscillator.
  • Impressive power reduction (100X to 500X).
  • Power-Delay-Product (P-D-P) improves by as much as 20X.
    • P-D-P is an important metric to compare circuit design styles.
  • Delay penalty of 10X to 25X can be reduced:
    • By applying forward body bias (dynamic)
    • By reducing VT values (static)
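The relationship between the three ratio columns can be checked with a few lines of arithmetic. The sketch below assumes that the P-D-P improvement is simply the power reduction divided by the delay penalty, and that the two groups of ratio columns belong to Vb = 0V and Vb = VDD in that order; the function name is illustrative. The recomputed values land within a few percent of the reported ones.

```python
def pdp_improvement(power_reduction, delay_penalty):
    """P-D-P improvement when power drops by `power_reduction`x and delay
    grows by `delay_penalty`x, both relative to the traditional circuit."""
    return power_reduction / delay_penalty


# (delay penalty, power reduction, reported P-D-P improvement) from the table.
# The Vb labels reflect an assumed column order, not a confirmed one.
reported = {
    ("bsim70", "Vb = 0V"): (17.01, 308.82, 18.50),
    ("bsim70", "Vb = VDD"): (9.93, 141.10, 14.43),
    ("bsim100", "Vb = 0V"): (24.60, 497.54, 20.08),
    ("bsim100", "Vb = VDD"): (12.00, 100.96, 8.20),
}

for key, (delay_x, power_x, pdp_x) in reported.items():
    recomputed = pdp_improvement(power_x, delay_x)
    print(key, "recomputed:", round(recomputed, 2), "reported:", pdp_x)
```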


The Opportunity

bsim70                                      bsim100
VT (V)   Delay     Power      P-D-P         VT (V)   Delay     Power      P-D-P
0.18     16.15X    167.52X    10.41X        0.27     23.32X    479.85X    20.60X
0.17     14.88X    151.99X    10.09X        0.25     22.43X    464.33X    20.16X
0.16     13.78X    137.73X    9.95X         0.23     21.02X    444.23X    20.05X
0.15     13.15X    124.59X    8.86X         0.21     18.69X    400.89X    20.27X
0.14     12.43X    112.73X    9.40X         0.19     18.42X    366.28X    18.98X
0.13     12.32X    101.85X    8.02X         0.17     17.51X    323.26X    17.98X

  • We also performed experiments with lower VT values.
    • VT can be modified with no extra cost.
  • Delays improved, while the P-D-P improvement remained high.


Sub-threshold Logic

Advantages
  • Circuits get faster at higher temperature. Hence no need for expensive cooling techniques.
  • Device transconductance is an exponential function of Vgs, which results in a high ratio of on versus off current. Hence noise margins are near-ideal.
  • Note that the device is never "on". It is just "off" or "exponentially more off", so to say.

Disadvantages
  • Ids has an exponential dependence on temperature.
  • Ids is highly dependent on process variations (such as VT variations).
  • Ids is small. This explains the delay penalty.

$$I_{ds}^{sub} = \frac{W}{L}\, I_o\, e^{\frac{V_{gs} - V_T - V_{off}}{n v_t}} \left(1 - e^{-\frac{V_{ds}}{v_t}}\right)$$
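To make the exponential dependences concrete, the sub-threshold current expression can be evaluated numerically. This is a minimal sketch: the device parameters (`i_0`, `n`, `v_off`, `w_over_l`) are placeholder values chosen for illustration, not values from the bsim70 or bsim100 processes, and only the thermal voltage v_t = kT/q is varied with temperature (in a real device VT and I_o also shift with temperature).

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin):
    """v_t = kT/q, about 26 mV near room temperature."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def subthreshold_current(v_gs, v_ds, v_threshold, temp_kelvin=300.0,
                         w_over_l=1.0, i_0=1e-7, n=1.5, v_off=0.0):
    """Sub-threshold drain current for V_gs < V_T.

    i_0, n, v_off and w_over_l are hypothetical illustration values.
    """
    v_t = thermal_voltage(temp_kelvin)
    gate_term = math.exp((v_gs - v_threshold - v_off) / (n * v_t))
    drain_term = 1.0 - math.exp(-v_ds / v_t)
    return w_over_l * i_0 * gate_term * drain_term


# "Off" versus "exponentially more off" at VDD = 0.2 V with VT = 0.3 V.
i_high = subthreshold_current(v_gs=0.2, v_ds=0.2, v_threshold=0.3)
i_low = subthreshold_current(v_gs=0.0, v_ds=0.2, v_threshold=0.3)
print("ratio of the two 'off' currents:", i_high / i_low)

# Same bias point at a higher temperature: the current rises.
i_hot = subthreshold_current(v_gs=0.2, v_ds=0.2, v_threshold=0.3, temp_kelvin=373.0)
print("current at 373 K relative to 300 K:", i_hot / i_high)
```

Even a 0.2 V gate swing changes the current by more than two orders of magnitude in this toy setting, which is the basis for the noise-margin claim, and the same exponent is what makes the current so sensitive to VT and temperature.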


Solving the Problem of Delay Sensitivity to Process, Voltage and Temperature Variations


Our Solution

  • We propose a technique that uses self-adjusting body-bias to phase-lock the circuit delay to a beat clock.
  • Use a network of PLAs to implement circuits.
  • Several PLAs in a cluster share a common Nbulk node.
  • A representative PLA in each cluster is chosen to phase lock the delay of the PLAs to the beat clock.
    • If the delay is too high, a forward body bias is applied to speed up the PLA.
    • If the delay is low, the body bias is brought back down to zero to slow down the PLA.
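The feedback rule described above can be illustrated with a small behavioural simulation. This is only a sketch under stated assumptions: the exponential delay model, the `sensitivity` constant, the step size and the bias ceiling are all hypothetical, and the actual circuit adjusts the bias with analog circuitry rather than in discrete steps.

```python
import math


def pla_delay(nominal_delay, body_bias, sensitivity=3.0):
    """Toy model: forward body bias lowers VT, so delay falls exponentially.

    `sensitivity` (per volt) is a hypothetical constant.
    """
    return nominal_delay * math.exp(-sensitivity * body_bias)


def lock_delay(nominal_delay, beat_period, step=0.01, max_bias=0.6, cycles=200):
    """Raise the body bias when the PLA is slower than the beat clock,
    and lower it toward zero when the PLA is faster."""
    bias = 0.0
    history = []
    for _ in range(cycles):
        delay = pla_delay(nominal_delay, bias)
        history.append((bias, delay))
        if delay > beat_period:
            bias = min(max_bias, bias + step)   # too slow: more forward bias
        else:
            bias = max(0.0, bias - step)        # fast enough: back off
    return history


# A PLA that is 50% too slow at zero bias settles near the beat period.
trace = lock_delay(nominal_delay=1.5, beat_period=1.0)
print("final bias and delay:", trace[-1])
```

Once locked, the bias dithers by one step around the value that matches the beat period, so the delay stays within a narrow band regardless of the starting corner.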


PLA structure

  • We use precharged NOR-NOR PLAs as the structure of choice.
  • Wordlines run horizontally.
  • Inputs (and their complements) and the outputs run vertically.
  • Several PLAs in a cluster share a common Nbulk node.


The Charge Pump


Effectiveness of the Approach

  • We simulated a single PLA from 0°C to 100°C. Also applied VT variations (10%) and VDD variations (10%).
  • The light region shows the variations in delay over all the corners.
  • The red region shows the delays with the self-adjusting body-bias circuit.


An Example Showing Phase Locking

  • This figure shows how the body bias (and hence the delay of the PLA) changes with changes in VDD.
  • The adjustment is very quick (within a few clock cycles).

[Figure annotations: VDD change 0.2 V to 0.22 V; VDD change 0.22 V to 0.18 V]


What about Energy Minimization

  • Minimum Power does not mean Minimum Energy…
  • We are interested in minimum energy operation given the application scenario envisioned.


What about Energy?

  • Minimizing VDD reduces power.
  • But minimum VDD does not mean minimum Energy!
  • There exists an optimum VDD for minimum Energy.


Finding the Optimum VDD

  • While one level of PLAs is evaluating, the others are precharged.
  • The precharged PLAs are consuming leakage power.
  • Hence optimum VDD depends on logical depth.

[Equation: Energy as a function of D, Pchging Energy (dyn), Evaluating Energy (dyn), Pchged Power (static) and Evaluated Power (static); the exact form is garbled in the source.]


The Optimum VDD

  • The optimum VDD value increases with increased logical depth.
  • The optimum VDD can vary with temperature (since the circuits get faster with temperature).
  • The optimum VDD can be estimated given the logical depth and delay for each PLA.

[Figure panels: 25°C and 100°C]
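The trend of optimum VDD rising with logical depth can be reproduced with a toy energy model. The sketch below is not the authors' energy equation; it assumes that dynamic energy per level scales as C·VDD², that every level leaks for the full depth × delay window, and that the ratio of off current to on current scales as exp(-VDD / (n·v_t)). The constants `N_VT` and `leak_weight` and the sweep range are hypothetical.

```python
import math

N_VT = 1.5 * 0.026  # hypothetical: slope factor n times thermal voltage (V)


def energy_per_operation(vdd, depth, leak_weight=50.0, cap=1.0):
    """Toy energy (arbitrary units) for one pass through `depth` PLA levels."""
    dynamic = depth * cap * vdd ** 2
    # Each of the `depth` levels leaks for `depth` delay slots. The product of
    # leakage power and delay scales as cap * vdd^2 * exp(-vdd / N_VT).
    static = depth * depth * leak_weight * cap * vdd ** 2 * math.exp(-vdd / N_VT)
    return dynamic + static


def optimum_vdd(depth, v_min=0.12, v_max=0.60, steps=481):
    """Grid search for the supply voltage that minimizes the toy energy."""
    grid = [v_min + i * (v_max - v_min) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda v: energy_per_operation(v, depth))


for depth in (2, 4, 8):
    print("logical depth", depth, "-> optimum VDD (V):", round(optimum_vdd(depth), 3))
```

Lowering VDD shrinks the dynamic term but stretches the delay, so the leakage term grows; the minimum sits where the two balance, and it moves to higher VDD as more idle levels leak.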


Reclaiming Part of the Speed Penalty


Micropipelining

  • For high-speed operation, a network of PLAs can be implemented as an Asynchronous Micropipeline.
    • P1 triggers a precharge event.
    • P2 triggers an evaluate event.
  • Latency increases, but throughput improves dramatically.

[Figure label: Handshaking Logic]


Micropipelining Results

         Delay (ns)                                     Area (μm²)
Ckt      Non-μ-pipelined   μ-pipelined   Improvement    Non-μ-pipelined   μ-pipelined   Improvement
C432     2665              475           0.18           7392              10080         1.36
C499     2665              475           0.18           9408              12096         1.29
alu4     3340              475           0.14           9408              12768         1.36
count    1315              475           0.36           3360              4032          1.20
rot      3565              475           0.13           12768             21504         1.68
apex6    2890              475           0.16           16128             24192         1.50
C1908    4465              475           0.11           16128             24864         1.54
c2670    4015              475           0.12           22848             31584         1.38
c1355    3790              475           0.13           14112             20832         1.48
c3540    8290              475           0.06           45024             75936         1.69
c880     2665              475           0.18           10752             14112         1.31
pair     5140              475           0.09           43680             67200         1.54
Avg                                      0.1533                                         1.4444

  • We get an average speedup of 7X over a non-micropipelined design.
  • After this, sub-threshold circuits are slower by a factor of 1.5X to 3.5X compared with their traditional (non-micropipelined) counterparts.
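The averages in the table can be recomputed from the raw delay and area columns. The sketch below assumes that each "Improvement" entry is the μ-pipelined value divided by the non-μ-pipelined value, which matches the per-row numbers; variable names are illustrative.

```python
# (circuit, non-pipelined delay ns, pipelined delay ns,
#  non-pipelined area, pipelined area), copied from the results table
RESULTS = [
    ("C432", 2665, 475, 7392, 10080),
    ("C499", 2665, 475, 9408, 12096),
    ("alu4", 3340, 475, 9408, 12768),
    ("count", 1315, 475, 3360, 4032),
    ("rot", 3565, 475, 12768, 21504),
    ("apex6", 2890, 475, 16128, 24192),
    ("C1908", 4465, 475, 16128, 24864),
    ("c2670", 4015, 475, 22848, 31584),
    ("c1355", 3790, 475, 14112, 20832),
    ("c3540", 8290, 475, 45024, 75936),
    ("c880", 2665, 475, 10752, 14112),
    ("pair", 5140, 475, 43680, 67200),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Ratio of pipelined to non-pipelined value, averaged over all circuits.
delay_ratio = mean(piped / plain for _, plain, piped, _, _ in RESULTS)
area_ratio = mean(piped / plain for _, _, _, plain, piped in RESULTS)
# Per-circuit speedup (non-pipelined delay over pipelined delay), averaged.
mean_speedup = mean(plain / piped for _, plain, piped, _, _ in RESULTS)

print("average delay ratio:", round(delay_ratio, 4))
print("average area ratio:", round(area_ratio, 4))
print("average per-circuit speedup:", round(mean_speedup, 2))
```

The average delay ratio comes out near 0.153 and the average area ratio near 1.444, in line with the Avg row. The speedup figure depends on how it is averaged: the reciprocal of the mean delay ratio is roughly 6.5, while the mean of the per-circuit speedups is roughly 7.9, so the quoted 7X sits between the two.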


Layout of the PLA

Each PLA has 16 inputs, 14 outputs and 24 rows (cubes).


Ambient Light Powered ICs

  • The approach lends itself to being powered by energy scavenged from ambient light.
  • Early studies show that this is feasible.
    • New Cadmium Sulfide/Cadmium Telluride solar panels achieve 0.09 W/cm². (Silicon panels produce 0.015 W/cm².)
    • Estimated power consumption for a subthreshold processor of this size is about 10 mW.
    • So the CdS/CdTe panel could power our processor with a 9X safety margin.
  • Challenges include how to store energy (battery? supercapacitors? MIM capacitors?).
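The safety margin follows from dividing the available panel power by the processor load. The sketch below assumes a panel area of 1 cm², which the slide does not state explicitly; the function name is illustrative.

```python
def safety_margin(panel_density_w_per_cm2, panel_area_cm2, load_w):
    """Ratio of available panel power to the processor's power draw."""
    return panel_density_w_per_cm2 * panel_area_cm2 / load_w


LOAD_W = 0.010   # about 10 mW estimated processor power
AREA_CM2 = 1.0   # assumed panel area, not given in the source

print("CdS/CdTe margin:", safety_margin(0.09, AREA_CM2, LOAD_W))
print("Silicon margin:", safety_margin(0.015, AREA_CM2, LOAD_W))
```

Under the same area assumption, a silicon panel would leave a margin of only about 1.5X.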


What next?

  • Explore extensions to structured ASIC approaches.
  • Fabrication of a subthreshold design (in 2006).
    • Mixed-signal, with a small processor and transceiver on a single die.
    • Set up a small hardware lab for debug/diagnosis.
    • Validate the experiments we discussed.
    • Hope to use this test-chip to validate other ideas as well.
  • Develop a design methodology for sub-threshold electronics, tuned for widespread use.


Summary

  • Sub-threshold circuit design is promising due to its extremely low power.
  • The delay phase locking approach helps sub-threshold logic design overcome the hurdle of sensitivity to PVT variations.
    • This can help achieve a significant yield improvement.
  • The study on optimum VDD for minimum Energy helps to fix an optimum VDD for a given logical depth.
  • Micro-pipelining helps bridge the delay gap.
  • Sub-threshold design approaches are appealing for a widening class of low power or energy applications.
  • Goal: Help bring sub-threshold logic design into the mainstream of VLSI technology.


Thank you!!