Electrothermal Engineering in the Nanometer Era: From Devices and Interconnects to Circuits and Systems
Kaustav Banerjee, Sheng-Chih Lin and Navin Srivastava
University of California, Santa Barbara
ASP-DAC 2006



  • ASP-DAC 2006

    Electrothermal Engineering in the Nanometer Era: From Devices and Interconnects to Circuits and Systems

    Kaustav Banerjee, Sheng-Chih Lin and Navin Srivastava

    University of California, Santa Barbara

  • Outline

    What is Electrothermal Engineering?

    Micro-scale vs. Macro-scale

    Temperature-Aware Integrated Approach

    From Devices and Interconnects to Circuits and Systems

    Emerging Technologies: 3-D IC Technology, Hybrid CNT-Cu Technology

    [Figure labels: core, cache, DRAM, DRAM]

  • Micro-Scale vs. Macro-Scale

    Global View of IC Heat Transfer…

    [Figure labels: Micro-Scale, Macro-Scale]

  • Device

    Leakage increases…

    [Figure: Ion/Ioff ratio (10^3 to 10^9) vs. operating temperature (-40 to 100 °C); legend entries: P-MOSFET, N-MOSFET; PD-SOI (120 nm) N-FET, P-FET; 45 nm]

    Self-heating increases…

    [Figure: ∆T increase (K) vs. dissipated power (µW/µm)]

    ESD failure

    Degrades performance… Degrades reliability…

    Sources: Pop et al. IEDM 2001; Lin et al. IEDM 2005; Banerjee et al. IEDM 2003 (Texas Instruments)

  • Interconnect

    Number of metal layers increases… Current density increases…

    [Figure: ULSI metallization, 20 µm (IBM)]

    Electromigration failure

    Back-end thermal profile!

    [Figure: global wires, with 209 °C and 126 °C labeled]

    ESD failure

    [Figure label: 50 nm]

    Low-k dielectrics increase self-heating

    Sources: Ryu et al. IRPS 1997; Im and Banerjee, IEDM 2000; Banerjee et al. IRPS 2000

  • Circuit

    Chip temperatures are non-uniform…

    [Figure: chip temperature map, Temp (°C); labels: Core, Cache, 70 °C, 120 °C. Courtesy of S. Borkar, Intel]

    Increased delay and variance…

    [Figure: distribution plot]

    Wire delay and clock skew

    Buffer insertion

    Voltage drop

    Reliability

    Leakage and yield estimation

    Power/performance optimization

    Sources: Ajami et al., TCAD 2005, JAICSP 2005; Lin et al. IEDM 2005; Zhang et al. ISLPED 2004; Lin et al. ICCD 2005

  • System

    Power density increases…

    Source: Intel, AMD

  • System

    Impact of higher power dissipation and density (hot-spots) on system

    System Performance and Stability

    - degrades performance
    - heat-induced failure and instability

    Product Lifetime and Reliability

    - most reliability mechanisms are highly temperature sensitive

    Operating Cost

    - more complex cooling solutions

    M. Miller, AMD

  • Electrothermal Engineering

    Temperature awareness at every level… Integrated approach…

    Device / Interconnect level

    - Self-Heating
    - Leakage mechanisms
    - Reliability (EM, ESD)

    Circuit level

    - Timing
    - Placement / Routing
    - Buffer insertion

    System level

    - Packaging / Cooling solutions
    - System power dissipation
    - System performance

    [Diagram labels: Active Power (Device, Interconnect); Leakage Power (Device); Layout geometry; Packaging, Cooling; Full chip power dissipation and temperature profile estimation; Fourier's Law: Heat Transfer; Microstructure Heat Transfer; Temperature-Aware Design]

    Wider implications for emerging technology: 3D ICs, MEMS, SET, CNT

  • Interconnect Scaling Effects

    Size effect on wire resistivity…

    [Figure: total resistivity (µΩ-cm, 0 to 5) at 300 K vs. technology node (22, 32, 45, 65, 90 nm), broken down into barrier layer effect, surface scattering, grain boundary scattering and background scattering (ρo). Im et al., IEEE TED, Dec. 2005]

    [Figure: TEM cross-section of narrow Cu interconnect with high-resistance diffusion barrier layer (TaN)]

    Barrier occupies 20%-25% of intended Cu cross-section area

    [Figure: thermal conductivity (W/(m-K), 0.0 to 1.6) vs. dielectric constant (1 to 4) for FSG, HSQ, CDO, Polymer, MSQ and Xerogel, with DEM model (Xerogel) and PWSM model (Xerogel) curves]

    Bruggeman Effective Medium Theory: relates dielectric constant to porosity

    DEM or PWSM Models: relate thermal conductivity to porosity
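
    The link between porosity and dielectric constant named on this slide can be sketched numerically. The block below is a minimal illustration, assuming the symmetric (spherical-inclusion) form of the Bruggeman mixing rule, air-filled pores (k = 1) and a dense matrix with k = 4.0; the function names and parameter values are hypothetical and are not taken from the slides.

```python
def bruggeman_residual(eps_eff, porosity, eps_matrix, eps_pore=1.0):
    """Symmetric Bruggeman mixing rule; the result is zero at the effective constant."""
    pore_term = porosity * (eps_pore - eps_eff) / (eps_pore + 2.0 * eps_eff)
    matrix_term = (1.0 - porosity) * (eps_matrix - eps_eff) / (eps_matrix + 2.0 * eps_eff)
    return pore_term + matrix_term


def effective_dielectric_constant(porosity, eps_matrix, eps_pore=1.0, tol=1e-10):
    """Solve the mixing rule for eps_eff by bisection between the two constituents."""
    lo, hi = min(eps_pore, eps_matrix), max(eps_pore, eps_matrix)
    # The residual decreases monotonically in eps_eff, so bisection is safe.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bruggeman_residual(mid, porosity, eps_matrix, eps_pore) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def porosity_for_dielectric_constant(eps_target, eps_matrix, eps_pore=1.0):
    """Invert the mixing rule: the residual is linear in porosity."""
    a = (eps_pore - eps_target) / (eps_pore + 2.0 * eps_target)
    b = (eps_matrix - eps_target) / (eps_matrix + 2.0 * eps_target)
    return b / (b - a)


if __name__ == "__main__":
    # Assumed matrix constant of 4.0; porosity needed to reach k = 2.
    p = porosity_for_dielectric_constant(2.0, 4.0)
    print(f"porosity for k=2: {p:.3f}")
    print(f"check: k at that porosity = {effective_dielectric_constant(p, 4.0):.3f}")
```

    Under these assumed inputs, reaching k = 2 requires a porosity of roughly 56%. A thermal-conductivity model such as DEM or PWSM would take the same porosity as its input, which is how the two properties become coupled.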

  • Back-end Thermal Issues

    [Figures: ∆T (°C) vs. distance from substrate in the z-direction (µm) for the 22, 32, 45, 65 and 90 nm nodes; current density (MA/cm²) vs. technology node; schematic along z with Tamb = 45 °C, Tj = 85 °C and Tmax]

    Sources: Im et al., IEEE TED, Dec. 2005; Im et al., IEDM 2002; Srivastava et al., VMIC 2004

    High temperature will become a major concern for interconnect reliability: maximum current density will be severely limited

    Accurate interconnect thermal profile is important for various analyses: delay, skew, IR-drop, etc.

  • Electrothermal Effects in Nanoscale Devices

    Traditional CMOS scaling assumes an isothermal problem

    High E field at drain → hot electrons

    Phonon hot spot near drain (as Λ > L)

    Affects device behavior

    Need to solve the phonon Boltzmann Transport Equation (BTE)

    Λ (phonon mean free path) ~ 300 nm in silicon at room temperature

    [Diagram labels: Source, Gate, Drain, L, electron, phonon, Λ, high electron energy]

  • Device Temperature Profile

    Local hot-spot temperature rises well beyond the diffusion theory prediction…

    [Figures: temperature T (K) over x (nm) and y (nm) across source, gate and drain. Pop et al. IEDM 2001]

    Indirectly verified against ESD failure pulses… Sverdrup et al. SISPAD 2000

    Important implications for device performance and leakage

    Critical for estimating failure conditions under ESD events

  • Implications for Emerging CMOS

    Due to confined geometry and poor thermal conductivity materials, emerging CMOS devices will exhibit severe localized heating effects!

    [Figure labels: Channel Region, Other Region. ITRS 2004]

    Ongoing research: collaboration with Stanford, IBM and TI

  • Circuit Level ET Issues (1)

    Impact of temperature variations on buffered interconnect systems…

    [Figure: spread in delay (%)]

    Wason and Banerjee, ISLPED 2005

    Temperature variation has a strong impact on both delay and leakage power…

  • Circuit Level ET Issues (2): Impact of Substrate Thermal Gradients

    Delay/skew analysis for non-uniform interconnect temperature

    Ajami et al., DAC 2001

    Direction dependence of thermal gradient…

    [Figure: increasing and decreasing temperature profiles T(x) along x; the increasing profile is marked "Better"]

    An increasing thermal profile gives better performance than a decreasing thermal profile (optimal wire sizing)

  • Implications of Substrate Thermal Gradients

    Impact on delay estimation…

    Impact on IR-drop analysis…

    Worst-case voltage drop (VIR/Vdd) increases in the presence of thermal gradients

    T1: positive exponential gradient; T2: negative exponential gradient (for a fixed T_Low)

    Sources: Ajami et al., JAICSP 2005; Ajami et al., TCAD 2005

  • Impact on Buffer Insertion

    Buffer movement in a 6660 µm line (180 nm node)

    [Figure: performance improvement (%, 0 to 18) vs. temperature gradient (15 to 75 °C); curves labeled 0.18, 0.13 and 0.1]

    Delay improvement after thermally-aware buffer insertion

    Ajami et al., ICCAD 2001

  • System Level: ET-Couplings

    [Diagram labels: Supply Voltage, Threshold Voltage, Performance, Switching Power, Leakage Power, Total Power, Temperature, Reliability, Cooling Cost]

    Sources: Intel; Banerjee et al., IEDM 2003

  • Self-Consistent ET Analysis Tool

    Layout geometry & power dissipation

    [Figure: power map (µW/µm², color scale 0.8 to 2.4 x 10^-6) over die length and die width (0 to 10000 µm)]

    Packaging / Cooling Model

    [Figure: realistic packaging structure; labels: Socket, PCB, Heatsink, Heat Spreader, Substrate, TIM 2, TIM 1, Core (die)]

    Electrothermal Couplings

    3D Electrothermally-Aware Spatial Temperature Estimation (including active and leakage power)

    Self-Consistent Substrate Thermal Profile

    [Figure: temperature map (color scale 120 to 130) over die length and die width (0 to 10000 µm)]

    Lin et al. IEDM 2005
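
    The self-consistent idea on this slide (temperature raises leakage power, which in turn raises temperature) can be illustrated with a lumped fixed-point iteration. This is only a sketch under stated assumptions, not the tool from Lin et al.: it uses a single junction temperature instead of a 3D profile, an assumed exponential leakage model, and hypothetical parameter values. Only the 45 °C ambient is borrowed from an earlier slide.

```python
import math


def leakage_power(temp_c, p_leak_ref, t_ref_c=25.0, t_char=30.0):
    """Assumed leakage model: grows exponentially with temperature (hypothetical constants)."""
    return p_leak_ref * math.exp((temp_c - t_ref_c) / t_char)


def solve_junction_temperature(p_active, p_leak_ref, theta_ja,
                               t_amb_c=45.0, t_limit_c=250.0,
                               tol=1e-6, max_iter=1000):
    """Iterate T = T_amb + theta_ja * (P_active + P_leak(T)) until it stops changing.

    Returns (temperature, total_power), or None when the iteration exceeds
    t_limit_c or fails to converge, which this sketch treats as thermal runaway.
    """
    temp = t_amb_c
    for _ in range(max_iter):
        p_total = p_active + leakage_power(temp, p_leak_ref)
        new_temp = t_amb_c + theta_ja * p_total
        if new_temp > t_limit_c:
            return None
        if abs(new_temp - temp) < tol:
            return new_temp, p_active + leakage_power(new_temp, p_leak_ref)
        temp = new_temp
    return None


if __name__ == "__main__":
    # Hypothetical chip: 50 W active power, 5 W leakage at 25 °C.
    for theta in (0.4, 1.0):  # thermal resistance in °C/W (assumed values)
        result = solve_junction_temperature(50.0, 5.0, theta)
        if result is None:
            print(f"theta_ja={theta}: no stable operating point (runaway)")
        else:
            print(f"theta_ja={theta}: Tj={result[0]:.1f} °C, P={result[1]:.1f} W")
```

    With these assumed numbers the lower thermal resistance settles at a stable junction temperature, while the higher one has no stable solution, which is the lumped analogue of a thermal runaway region.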

  • Application-1: Full-Chip Leakage Estimation

    Case 1: Die-to-die channel length variations

    Case 2: Case 1 + Within-die variations

    Case 3: Case 2 + Die-to-die temperature variations

    Zhang et al. ISLPED 2004

    Die-to-die temperature variations significantly increase the leakage power

  • Application-2: Power-Performance Tradeoff

    Vdd-Vth optimization using Energy-Delay Product (EDP) (Mark Horowitz, 1997)

    [Figure: Traditional Optimal EDP; contours over threshold voltage Vth (0.2 to 0.5 V) and supply voltage Vdd (0.2 to 1.2 V), with the Vdd = Vth line]

    [Figure: Electrothermally Coupled EDP; same axes, with the Vdd = Vth line, the thermal runaway region and the optimal EDP marked]

    A shift in optimal point, EDP and performance contours

    An overall change in shape of EDP contours

    Operation region restricted by electrothermal constraints

    DAC 2004

  • Application-3: Leakage and Packaging Aware Design Space

    [Figure: supply voltage Vdd (0.2 to 1.2 V) vs. threshold voltage Vth (0.2 to 0.5 V) with Tj = 40, 60, 80 and 100 °C contours, thermal runaway boundaries for θj = 0.85, 0.75 and 0.65, and the Vdd = Vth line]

    Optimal EDP shifts left when θj decreases

    [Figure: same axes with Tj = 40, 60 and 80 °C contours for Ioff x 1 and Ioff x 3, the thermal runaway region, the Vdd = Vth line and the optimal EDP]

    Optimal EDP shifts right when Ioff increases

    Banerjee et al. IMAPS 2005

    Lowering the junction temperature by employing advanced packaging and cooling techniques with lower thermal impedance (θj) will expand the design space

    As leakage increases due to technology scaling or process variations, the operating region prohibited by thermal runaway expands

    Allows circuit designers to comprehend reliability and packaging constraints…

  • Application-4: Thermally-Aware Design-Specific Optimization

    Different metrics result in different optimization…

    Metric: PT^μ, where μ is the ratio of the exponents of delay over power

    - EDP (PT²): μ = 2
    - PDP (PT): μ = 1
    - P²T: μ = 0.5

    μ is bounded by thermal and performance requirements

    [Figure axis: supply voltage Vdd (V)]

    Lin et al. ICCD 2005
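
    The statement that different metrics lead to different optima can be made concrete with a small sketch of the PT^μ family. The three candidate design points below are hypothetical (power, delay) pairs in arbitrary units, chosen only for illustration; they do not come from the slides.

```python
def power_delay_metric(power, delay, mu):
    """Generalized metric P * T**mu (mu = 1 gives PDP, mu = 2 gives EDP)."""
    return power * delay ** mu


def best_design(points, mu):
    """Return the name of the design point that minimizes P * T**mu."""
    return min(points, key=lambda name: power_delay_metric(*points[name], mu))


# Hypothetical design points: name -> (power, delay), arbitrary units.
CANDIDATES = {
    "low_vdd": (1.0, 4.0),
    "mid_vdd": (2.0, 2.5),
    "high_vdd": (6.0, 1.5),
}

if __name__ == "__main__":
    for mu in (0.5, 1.0, 2.0, 3.0):
        print(f"mu={mu}: best design is {best_design(CANDIDATES, mu)}")
```

    For these made-up points, μ = 0.5 and μ = 1 favor the low-power design, μ = 2 (EDP) selects the middle one, and a larger μ shifts the choice toward the fastest design, since delay is weighted more heavily as μ grows.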

  • Application-5: Power Management

    Efforts on low power… without hurting performance

    Device Engineering

    - Enhanced channel mobility
    - Reduced gate leakage: high-K gate, nitrogen doped

    Circuit Level

    - Adaptive body-biasing, dual Vth
    - Sleep transistor, clock/power gating

    Micro-Architecture Level

    - Multi-core

  • Why Cooling

    [Diagram labels: Power Dissipation, Cooling, Circuit Techniques]

    Cooling is the knob!

  • Device and Circuit Level Benefit

    - Enhance Ion to Ioff ratio
    - Reduce propagation delay and variance
    - Benefit back-end performance and reliability

    [Figure: Ion/Ioff ratio (10^3 to 10^9) vs. operating temperature (-40 to 100 °C); legend entries: P-MOSFET, N-MOSFET; PD-SOI (120 nm) N-FET, P-FET; 45 nm]

    [Figure: distribution vs. normalized frequency, with the effect of lowering temperature marked. S. Borkar, Intel]

    [Figure label: 9-stage inverter chain]

    Lin et al. IEDM 2005

  • Cooling Benefit-Cost Tradeoff

    [Figure: power dissipation (W, 50 to 90) vs. operating temperature (20 to 100 °C) for a module (90 nm); curves: System Power, Chip Power, Active Power, Pcooling, Pleakage]

    Beyond this point, further cooling does not lead to any power saving

    The limit occurs at a lower temperature as technology scales

    Lin et al. IEDM 2005
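
    The tradeoff on this slide, where cooling lowers leakage but costs power of its own, can be sketched as a one-dimensional minimization of system power over operating temperature. Every model and constant below is an assumption made for illustration (exponential leakage, quadratic cooling cost, the specific wattages); none of them are taken from Lin et al.

```python
import math


def system_power(temp_c, p_active=50.0, p_leak_100=30.0, t_char=30.0,
                 cool_coeff=0.01, t_uncooled=100.0):
    """System power = active + leakage(T) + cooling(T), all in watts (assumed models)."""
    # Assumed leakage: 30 W at 100 °C, falling exponentially as the chip is cooled.
    p_leak = p_leak_100 * math.exp((temp_c - 100.0) / t_char)
    # Assumed cooling cost: grows quadratically with the temperature reduction.
    p_cool = cool_coeff * max(0.0, t_uncooled - temp_c) ** 2
    return p_active + p_leak + p_cool


def best_operating_temperature(temps):
    """Scan candidate operating temperatures and return the one with minimum system power."""
    return min(temps, key=system_power)


if __name__ == "__main__":
    temps = range(20, 101)  # candidate operating temperatures in °C
    best = best_operating_temperature(temps)
    print(f"minimum system power {system_power(best):.1f} W at {best} °C")
    print(f"no cooling (100 °C): {system_power(100):.1f} W")
    print(f"aggressive cooling (20 °C): {system_power(20):.1f} W")
```

    Under these assumptions the minimum falls at an intermediate temperature: cooling below it spends more on the cooler than it recovers in leakage, which is the limit the slide describes.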

  • Hot-Spot Management

    Global vs. localized cooling

    Global cooling (reduce θja 20%): TMAX decreases but hot-spots remain

    Localized cooling (using thin-film TEC, size 0.8 mm x 0.8 mm)

    [Figures: two substrate temperature profiles (°C) over die length and die width (0 to 10000 µm), with color scales 120 to 130 and 118 to 130]

    Localized cooling will be more effective for hot-spot management

    Lin et al. IEDM 2005

  • ET Issues in 3D ICs

    [Figure labels: Thermal Gradient, Pchip (Watts), Pleak (Watts)]

    Banerjee et al. Proc. IEEE, 2001

    Performance evaluation of 3D design must account for the negative impact of high temperature on all active layers

  • Hybrid Carbon Nanotube-Cu Interconnects

    Maximum interconnect temperature rise for a Cu interconnect stack with Cu vias compared to CNT bundle vias integrated with Cu interconnects

    For CNT bundles, the shaded region shows the range 1750 W/mK < Kth < 5800 W/mK

    Srivastava et al. IEDM 2005

  • Conclusions

    Electrothermal effects are increasing at every level from devices and interconnects to circuits and systems

    - need careful modeling and optimization

    Electrothermal Engineering is a critical need…

    - temperature awareness at every level to optimize performance, power and reliability
    - understand various couplings through an integrated approach

    Emerging technologies

    - will be strongly affected
