21
1 Foxton Technology HotChips 2005 Sam Naffziger Intel Corp.

Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

Embed Size (px)

Citation preview

Page 1: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

1

Foxton Technology

HotChips 2005

Sam Naffziger

Intel Corp.

Page 2: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

2

The Power Problem� Power consumption is a primary limiter in

today�s processors and unfortunately, it

varies a lot

� Part to part

(processing)

� As a result of

the application

� Due to

temperature80

90

100

110

120

Watt

s

3500 3700 3900 4100 4300 4500 4700

SORTOSC14

Y 1.8G FP(peak) Power @1.2V 2.0G FP(peak) Power @1.2V

Measured data of

multiple Montecito parts:

Power vs. part speed

Page 3: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

3

50.00

60.00

70.00

80.00

90.00

100.00

110.00

120.00

130.00

int.1

81.m

cf

int.1

75.v

pr

int.1

86.c

rafty

int.3

00.tw

olf

int.2

54.g

ap

int.1

97.p

arse

r

int.1

64.g

zip

int.2

56.b

zip2

int.2

53.p

erlb

mk

int.1

76.g

cc

int.2

55.v

orte

x

int.2

52.e

on

fp.1

77.m

esa

fp.1

68.wup

wise

fp.1

88.am

mp

fp.1

87.fa

cere

c

fp.3

01.ap

si

fp.1

91.fm

a3d

fp.1

89.luca

s

fp.1

79.ar

t

fp.1

83.eq

uake

fp.1

73.ap

plu

fp.2

00.sixtra

ck

fp.1

72.m

grid

fp.1

71.sw

im

fp.1

78.ga

lgel

Patho

logi

cal V

irus

Application

Po

we

r

2.0G Power Average

2.0G Power Peak

Measure Montecito

core power vs.

application

FP code

Integer code

The Power Problem� Power consumption is a primary limiter in

today�s processors and unfortunately, it

varies a lot

� Part to part

(processing)

� As a result of

the application

� Due to

temperature

Page 4: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

4

Current Approaches to Power

Management and Reduction

� Split out the thermal power spec from the max electrical power spec� Use �Thermal Design Power� (TDP) to spec a sustained power

that is lower than the true maximum (�electrical power�)

� Counts on the rarity of very high power events

� Relies on a thermal sensor to throttle the part if it�s too hot

� Allows a lower cost thermal solution, but power supplies and power delivery must still handle the max electrical power

� Dynamic Voltage Scaling (C states/P states)� Conserve energy when the processor is under-utilized to reduce

average power

� Fuse in a Vcc that is part-specific� Higher power but faster parts can use a lower voltage at the

same frequency

Page 5: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

5

The Ideal Power Management for

Servers and Desktop

� We currently over-design our power supplies and thermal solutions for worst case parts and applications

� Most of the time the part isn�t fully using the watts we�ve allocated for it� Lower power applications only run as fast as the

highest power ones

!We want to maximize performance / Watt for all situations

!We want a processor to adapt operating point dynamically to it�s situation

This is what Foxton Technology does

Page 6: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

6

What is Foxton?

� An integrated system that dynamically

maximizes performance per watt including

� Accurate, integrated power measurement

� Integrated temperature measurement

� Frequency control to maximize hertz/volt

� A microcontroller to incorporate instantaneous

{power, temperature, voltage, frequency} and

optimize the operating point

� The result is processor cores doing their

computation at optimal power efficiency

Page 7: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

7

High Level View of System

MicroMicro--ControllerController

PowerPower

SensorSensorSupplySupply

VRMVRM

10s of µs

100s of ps

ThermalThermal

SensorSensor

VoltageVoltage

SensorSensorVoltage to Freq.Voltage to Freq.

Converter Converter ClockClock

Page 8: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

8

Power Consumption Contour

0.40.5

0.60.7

0.80.9

11.6

1.8

22.2

2.4

0

20

40

60

80

100

120

140

160

180

200

Power

(Watts)

Activity Factor

GHz= F

(V)

Optimization point is for

typical integerapplications which have

.6X the switching power

of the worst case

! Amdahl�s law

Manufacturing test is

accomplished by

observing the self measured power, and

the self-generated frequency for typical

code at the power limit

Page 9: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

9

Measuring Power

� Use package resistance to measure power

� Avoids burning extra power in measurement

� Portable, self-contained solution� No dependence on external power supply

VConnectorVDie

RPackage

Page 10: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

10

VID

6

Power Control System

+-

DAC+-

PLimit

VDie

IIR

PCalc Calc

VConnectorA/D

A/D

Power SupplyMicro-Controller

RPackage

Package/Die

Page 11: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

11

Temperature

Measurement

� Calibrate the voltage drop at test to TJ target (90C)

� Use the known -1.7mV per degree C temperature coefficient to calculate die temperature

� Measure the voltage drop across the diodes every 20ms

Power

Supply

A/D

Converters

Micro

Controller

Die

VFixed

VThermal

6

VID

Page 12: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

12

Package Resistance Calibration

� Package resistance can be computed with two voltage measurements with processor stalled� Pulling quiescent current I0� Pulling I0 + a precision, on-die generated current IDelta

� On-package precision R for consistent IDelta

� 66ms recalibration rate

I0 I0+IDelta

Vc2-Vd2

Vc1-Vd1

R Package

( ) ( )Delta

dcdcPackage

I

VVVVR 1122

−−−=

Page 13: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

13

Frequency vs. Power Limit

Measured DataCore 0, Core 1, Avg Frequency vs. PLimit

84%

100%

100W 95W 90W 85W 80W 75W 70W 65W 60WPLimit

Fre

qu

en

cy

Core 0 Frequency Core 1 Frequency Avg Frequency

88%

92%

96%

A 31% power

redux for a 10%

frequency hit

Page 14: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

14

Managing Frequency

� Voltage variability costs frequency and hence performance/watt

� A clock system that can track rapid voltage changes will both maximize hertz/volt and provide smooth response to micro-controller induced voltage changes

Vcc(t)

F(V(t))

Favg(t)

Fmin(t)Old FMAX

New

FMAX

Today: Minimum Vcc(t) determines maximum

frequency.

Foxton: Average Vcc(t) determines average

frequency.

Page 15: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

15

A Variable Frequency Clock System

PLL

Bus Clock(200MHz)

(M=10)

Fmax (2.0GHz)

1/M

÷2

DFD

Foxton

µC

DFD

I/Os

DFD

Bus

Logic

DFD

Core Logic

RVD(+/-)

(D=16)

(D=0)D=f(P,V,T)

(.504*Fmax ≤ Fcore ≤ Fmax)

(1.6GHz)

(1.6GHz)

(1.0GHz)

(D=16)

Icore

MVRVID

VCORE

(dynamic)VID=f(Power,T)

VFC

L0

Rptr

SLCB

Page 16: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

16

Montecito Clock System Floorplan

PLL / PLL /

translation translation

table / table /

clock clock

controlcontrol

FSB FSB

DFDsDFDs

Foxton Controller DFDFoxton Controller DFD

FSB FSB

DFDsDFDs

Core Core

DFDsDFDs

RVDsRVDs

CORE 1CORE 1

CORE 0CORE 0

Bus Logic DFDBus Logic DFD

Page 17: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

17

Clock System Modes

� Fixed Frequency (FFM)� Cores/Uncore are frequency and phase aligned

� Cores/Uncore interfaces synchronous

� Variable Frequency (VFM)� Core supply modulated by Foxton Controller to

manage power envelope

� Core frequencies track Vcore via Regional Voltage Detector (RVD) V-F curves

� Respond to Foxton modulation and local transients

� V-F curves match worst-scaling paths on chip

� Core/Uncore interfaces asynchronous

Page 18: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

18

RVD Delay Line

FINE

inR R

COARSE

R RRR RR R

0 1 2 3 4 5 6 7

Dynamic Mux

odd

coarse_sel

F FF F F F F F F

FET

0

1

inout

2.00E-10

4.00E-10

6.00E-10

8.00E-10

1.00E-09

1.20E-09

1.40E-09

1.60E-09

0.60 0.70 0.80 0.90 1.00 1.10 1.20 1.30 1.40

RVD Delay Element

metal1 resistor

Page 19: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

19

Example VFC Supply Droop Response

DFD

Output

Clock

Vcore

RVD

Delay

Line

Clock

Period

�UP� to

DFD

Clock period increased

No Adjust needed this cycle

Droop increases RVD delay line delay

Increased delay asserts period �UP� for one cycle

1 2 3 4 5

Page 20: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

20

Speed gains from Adaptive Frequency

Speed Gain From VFC

0.0%

1.0%

2.0%

3.0%

4.0%

5.0%

6.0%

7.0%

Test 1 Test 2 Test 3

Page 21: Foxton - Hot Chips · 3 50.00 60.00 70.00 80.00 90.00 100.00 110.00 120.00 130.00 8 1 rf a f y n . o f 5 4 p n .. a rs ze r ip n 2. z i 2 i5 3 r l mk i 7 g cc t 2 v t x i n 2 5 n

21

Summary

� Foxton is a system comprised of several key components� Accurate power and temperature measurement

� Fine grained voltage control

� Dynamic fast-response frequency control

� A micro-controller to manage the system

� It can be wrapped around any processor or ASIC which can be virtually unchanged except:� An asynchronous interface to the rest of the system

� Must support a wider range of operating voltages

� The result is a self-optimizing chip dynamically delivering greatly improved performance/watt