Upload
dangkhanh
View
227
Download
0
Embed Size (px)
Citation preview
1
Foxton Technology
HotChips 2005
Sam Naffziger
Intel Corp.
2
The Power Problem� Power consumption is a primary limiter in
today�s processors and unfortunately, it
varies a lot
� Part to part
(processing)
� As a result of
the application
� Due to
temperature80
90
100
110
120
Watt
s
3500 3700 3900 4100 4300 4500 4700
SORTOSC14
Y 1.8G FP(peak) Power @1.2V 2.0G FP(peak) Power @1.2V
Measured data of
multiple Montecito parts:
Power vs. part speed
3
50.00
60.00
70.00
80.00
90.00
100.00
110.00
120.00
130.00
int.1
81.m
cf
int.1
75.v
pr
int.1
86.c
rafty
int.3
00.tw
olf
int.2
54.g
ap
int.1
97.p
arse
r
int.1
64.g
zip
int.2
56.b
zip2
int.2
53.p
erlb
mk
int.1
76.g
cc
int.2
55.v
orte
x
int.2
52.e
on
fp.1
77.m
esa
fp.1
68.wup
wise
fp.1
88.am
mp
fp.1
87.fa
cere
c
fp.3
01.ap
si
fp.1
91.fm
a3d
fp.1
89.luca
s
fp.1
79.ar
t
fp.1
83.eq
uake
fp.1
73.ap
plu
fp.2
00.sixtra
ck
fp.1
72.m
grid
fp.1
71.sw
im
fp.1
78.ga
lgel
Patho
logi
cal V
irus
Application
Po
we
r
2.0G Power Average
2.0G Power Peak
Measure Montecito
core power vs.
application
FP code
Integer code
The Power Problem� Power consumption is a primary limiter in
today�s processors and unfortunately, it
varies a lot
� Part to part
(processing)
� As a result of
the application
� Due to
temperature
4
Current Approaches to Power
Management and Reduction
� Split out the thermal power spec from the max electrical power spec� Use �Thermal Design Power� (TDP) to spec a sustained power
that is lower than the true maximum (�electrical power�)
� Counts on the rarity of very high power events
� Relies on a thermal sensor to throttle the part if it�s too hot
� Allows a lower cost thermal solution, but power supplies and power delivery must still handle the max electrical power
� Dynamic Voltage Scaling (C states/P states)� Conserve energy when the processor is under-utilized to reduce
average power
� Fuse in a Vcc that is part-specific� Higher power but faster parts can use a lower voltage at the
same frequency
5
The Ideal Power Management for
Servers and Desktop
� We currently over-design our power supplies and thermal solutions for worst case parts and applications
� Most of the time the part isn�t fully using the watts we�ve allocated for it� Lower power applications only run as fast as the
highest power ones
!We want to maximize performance / Watt for all situations
!We want a processor to adapt operating point dynamically to it�s situation
This is what Foxton Technology does
6
What is Foxton?
� An integrated system that dynamically
maximizes performance per watt including
� Accurate, integrated power measurement
� Integrated temperature measurement
� Frequency control to maximize hertz/volt
� A microcontroller to incorporate instantaneous
{power, temperature, voltage, frequency} and
optimize the operating point
� The result is processor cores doing their
computation at optimal power efficiency
7
High Level View of System
MicroMicro--ControllerController
PowerPower
SensorSensorSupplySupply
VRMVRM
10s of µs
100s of ps
ThermalThermal
SensorSensor
VoltageVoltage
SensorSensorVoltage to Freq.Voltage to Freq.
Converter Converter ClockClock
8
Power Consumption Contour
0.40.5
0.60.7
0.80.9
11.6
1.8
22.2
2.4
0
20
40
60
80
100
120
140
160
180
200
Power
(Watts)
Activity Factor
GHz= F
(V)
Optimization point is for
typical integerapplications which have
.6X the switching power
of the worst case
! Amdahl�s law
Manufacturing test is
accomplished by
observing the self measured power, and
the self-generated frequency for typical
code at the power limit
9
Measuring Power
� Use package resistance to measure power
� Avoids burning extra power in measurement
� Portable, self-contained solution� No dependence on external power supply
VConnectorVDie
RPackage
10
VID
6
Power Control System
+-
DAC+-
PLimit
VDie
IIR
PCalc Calc
VConnectorA/D
A/D
Power SupplyMicro-Controller
RPackage
Package/Die
11
Temperature
Measurement
� Calibrate the voltage drop at test to TJ target (90C)
� Use the known -1.7mV per degree C temperature coefficient to calculate die temperature
� Measure the voltage drop across the diodes every 20ms
Power
Supply
A/D
Converters
Micro
Controller
Die
VFixed
VThermal
6
VID
12
Package Resistance Calibration
� Package resistance can be computed with two voltage measurements with processor stalled� Pulling quiescent current I0� Pulling I0 + a precision, on-die generated current IDelta
� On-package precision R for consistent IDelta
� 66ms recalibration rate
I0 I0+IDelta
Vc2-Vd2
Vc1-Vd1
R Package
( ) ( )Delta
dcdcPackage
I
VVVVR 1122
−−−=
13
Frequency vs. Power Limit
Measured DataCore 0, Core 1, Avg Frequency vs. PLimit
84%
100%
100W 95W 90W 85W 80W 75W 70W 65W 60WPLimit
Fre
qu
en
cy
Core 0 Frequency Core 1 Frequency Avg Frequency
88%
92%
96%
A 31% power
redux for a 10%
frequency hit
14
Managing Frequency
� Voltage variability costs frequency and hence performance/watt
� A clock system that can track rapid voltage changes will both maximize hertz/volt and provide smooth response to micro-controller induced voltage changes
Vcc(t)
F(V(t))
Favg(t)
Fmin(t)Old FMAX
New
FMAX
Today: Minimum Vcc(t) determines maximum
frequency.
Foxton: Average Vcc(t) determines average
frequency.
15
A Variable Frequency Clock System
PLL
Bus Clock(200MHz)
(M=10)
Fmax (2.0GHz)
1/M
÷2
DFD
Foxton
µC
DFD
I/Os
DFD
Bus
Logic
DFD
Core Logic
RVD(+/-)
(D=16)
(D=0)D=f(P,V,T)
(.504*Fmax ≤ Fcore ≤ Fmax)
(1.6GHz)
(1.6GHz)
(1.0GHz)
(D=16)
Icore
MVRVID
VCORE
(dynamic)VID=f(Power,T)
VFC
L0
Rptr
SLCB
16
Montecito Clock System Floorplan
PLL / PLL /
translation translation
table / table /
clock clock
controlcontrol
FSB FSB
DFDsDFDs
Foxton Controller DFDFoxton Controller DFD
FSB FSB
DFDsDFDs
Core Core
DFDsDFDs
RVDsRVDs
CORE 1CORE 1
CORE 0CORE 0
Bus Logic DFDBus Logic DFD
17
Clock System Modes
� Fixed Frequency (FFM)� Cores/Uncore are frequency and phase aligned
� Cores/Uncore interfaces synchronous
� Variable Frequency (VFM)� Core supply modulated by Foxton Controller to
manage power envelope
� Core frequencies track Vcore via Regional Voltage Detector (RVD) V-F curves
� Respond to Foxton modulation and local transients
� V-F curves match worst-scaling paths on chip
� Core/Uncore interfaces asynchronous
18
RVD Delay Line
FINE
inR R
COARSE
R RRR RR R
0 1 2 3 4 5 6 7
Dynamic Mux
odd
coarse_sel
F FF F F F F F F
FET
0
1
inout
2.00E-10
4.00E-10
6.00E-10
8.00E-10
1.00E-09
1.20E-09
1.40E-09
1.60E-09
0.60 0.70 0.80 0.90 1.00 1.10 1.20 1.30 1.40
RVD Delay Element
metal1 resistor
19
Example VFC Supply Droop Response
DFD
Output
Clock
Vcore
RVD
Delay
Line
Clock
Period
�UP� to
DFD
Clock period increased
No Adjust needed this cycle
Droop increases RVD delay line delay
Increased delay asserts period �UP� for one cycle
1 2 3 4 5
20
Speed gains from Adaptive Frequency
Speed Gain From VFC
0.0%
1.0%
2.0%
3.0%
4.0%
5.0%
6.0%
7.0%
Test 1 Test 2 Test 3
21
Summary
� Foxton is a system comprised of several key components� Accurate power and temperature measurement
� Fine grained voltage control
� Dynamic fast-response frequency control
� A micro-controller to manage the system
� It can be wrapped around any processor or ASIC which can be virtually unchanged except:� An asynchronous interface to the rest of the system
� Must support a wider range of operating voltages
� The result is a self-optimizing chip dynamically delivering greatly improved performance/watt