Unit5 Memory EE577A Nazarian Spring12


  • 8/13/2019 Unit5 Memory EE577A Nazarian Spring12

    1/56

    EE577A

    VLSI System Design

    Memory Design 

    University of Southern California

    Viterbi School of Engineering

    Shahin Nazarian Spring 2012

    References: syllabus textbooks, slides and notes from Professor Pedram, online resources


    Shahin Nazarian/EE577A/Spring 2012

    Digital Memory Types

    Memory Arrays
      • Random Access Memory
        – Read/Write Memory (RAM) (Volatile)
          • Static RAM (SRAM)
          • Dynamic RAM (DRAM)
        – Read Only Memory (ROM) (Nonvolatile)
          • Mask ROM
          • Programmable ROM (PROM)
          • Erasable Programmable ROM (EPROM)
          • Electrically Erasable Programmable ROM (EEPROM)
          • Flash ROM
      • Serial Access Memory
        – Shift Registers
          • Serial In Parallel Out (SIPO)
          • Parallel In Serial Out (PISO)
        – Queues
          • First In First Out (FIFO)
          • Last In First Out (LIFO)
      • Content Addressable Memory (CAM)

    Optional: Random Access Technology

    • As technology evolves, random access plays an increasingly important role: we want data to be randomly accessible
    • We also want the access time to be independent of where the data is located in memory
    • Memory can be classified into Random Access Memory (RAM) and non-RAM memories
    • Random access memories can be further classified into ROMs and Read/Write (R/W) memories
    • In RAM technology, access time is the same regardless of the location of the data
    • R/W memory is also commonly called RAM for historical reasons
    • R/W memories (RAMs) come in two main types: Dynamic RAMs (DRAMs) and Static RAMs (SRAMs)

    Optional: Random Access: SRAM vs. DRAM

    • Random Access:
      • DRAM: Dynamic Random Access Memory
        – High density, cheap, slow
        – Dynamic: needs to be “refreshed” regularly
      • SRAM: Static Random Access Memory
        – Low density, expensive, fast
        – Static: content will last “forever” (until power is lost)
        – Typically lower power consumption at moderate and low frequencies, with nearly negligible power when idle; however, it can be as power-hungry as dynamic RAM when used at high frequencies and bandwidths
    • “Not-so-random” Access Technology:
      • Access time varies from location to location and from time to time
      • Data is randomly accessible, but the access time is not exactly the same for every access
      • Examples: Disk, CDROM

    Optional: DRAM

    • DRAM has high density compared to SRAM. However, the information in a DRAM cell degrades due to junction leakage current at the storage node, so cell data must be read and rewritten periodically (refresh operation). Due to its low cost and high density, DRAM is widely used for the main memory in personal computers, mainframe computers, and workstations
    • Example: a 1T (one-transistor) DRAM cell consists of a capacitor that stores binary 1 (high voltage) or 0 (low voltage) and a transistor that accesses the capacitor

    Figure: 1T DRAM cell

    Optional: SRAM

    • An SRAM cell consists of a latch, so the cell data is kept as long as power is on, and no refresh operation is required
    • SRAM is mainly used for cache memory in microprocessors, mainframe computers, and engineering workstations, and for memory in hand-held devices, due to its high speed and low power consumption
    • Example: 6T SRAM

    Figure: 6T SRAM cell

    Optional: Non-Volatile Memory (NVM)

    • A memory that can hold its data even when not powered is referred to as NVM
    • Examples are the different types of ROM such as Flash memory, magnetic memories such as magnetic tapes and hard disks, optical discs, and even punch cards!

    Optional: ROM

    • ROM allows only retrieval of previously stored data; no modification is permitted. ROMs are nonvolatile memories, i.e., the stored data is not lost even when the power supply is off, and no refresh operation is required
    • ROM is classified into Mask ROM and PROM
    • In Mask (Fuse) ROM, data is written during chip manufacturing by using a photo mask
    • In PROM, the data is written electrically after the chip is fabricated

    Figure: Mask (Fuse) ROM

    Optional: ROM (Cont.)

    • PROM is classified into EPROM and EEPROM
    • In Fuse ROM, data written by blowing a fuse electrically cannot be erased or modified
    • Data in EPROM and EEPROM can be rewritten, but the number of subsequent rewrites is limited to 10^4 to 10^5
    • In EPROM, ultraviolet rays that can penetrate the crystal glass on the package are used to erase all the data in the chip simultaneously. Programming is done with higher-than-normal voltages
    • In EEPROM, a higher-than-normal voltage is used to program/erase data in 8-bit units
    • EEPROM drawback: slower write speed, on the order of microseconds

    Figure: EPROM, EEPROM

    Optional: ROM (Cont.)

    • ROMs are generally used for permanent (look-up) memory in printers, fax machines, game machines, and ID cards, due to their lower cost than RAM
    • Ferroelectric RAM (FRAM) utilizes the hysteresis characteristics of a ferroelectric capacitor to overcome the slow write operation of other EEPROMs
    • Flash ROM is similar to EEPROM and EPROM in using an array of floating gates (also referred to as cells). A single-level cell can store one bit of information, whereas a multi-level cell can store more than one bit by varying the number of electrons placed on the floating gate of the cell. As in EEPROM, higher-than-normal voltages are used to program/erase the cells

    Optional: Memory Design Goals

    • The goal is to design memories that are larger, denser (more bits per area), faster (faster write and read operations), more reliable, lower in power consumption, and less complex to design
    • However, some of these goals are contradictory, so compromises have to be made
    • Paradoxes of memory design
      – Denser and faster
      – Larger capacity and low power
      – Reduced complexity and high reliability

    Optional: Memory Design Goals (Cont.)

    • As memory capacity increases, access becomes slower. To mitigate this, architectural techniques such as memory partitioning (divided word lines, divided bit lines, etc.) are used
    • Similarly, higher-capacity and denser designs result in higher power consumption (more specifically, leakage); to alleviate this, architectures such as 6T are used to reduce the power
    • Last but not least, lower-voltage operation results in reliability issues, which are addressed by adding more transistors, by architectural-level techniques, and by error correcting codes (ECC) such as parity bits

    Optional: Feature Comparison Between Memory Types

    * FN Tunneling: Fowler-Nordheim tunneling
    * HCI: Hot Carrier Injection

    Optional: Memory Feature Comparison (Cont.)

    • Flash memories are the slowest, but compared to SRAMs, even DRAMs are considered slow
    • Also, due to their technology, flash memories support only a limited number of reads and writes
    • However, flash memories do not have the refresh circuitry and some other overheads of DRAM, so they are denser
    • DRAM has the most volatile data retention because of leakage and possibly destructive reads
    • In addition, DRAM has poor scalability: as the array grows and the bit lines get longer, the issue of charge sharing becomes more prominent

    Optional: Memory Hierarchy of a Modern Computer System

    • Memory hierarchy has been a very successful concept in computer architecture design. It exploits the principles of (temporal and spatial) locality
    • Present the user with as much memory as is available in the cheapest technology
    • Provide access at the speed offered by the fastest technology

    Figure: memory hierarchy, from the processor (control, datapath, registers, on-chip cache) outward

    Level                                    Speed (ns)                    Size (bytes)
    Registers                                ones                          100’s
    On-chip and second-level cache (SRAM)    tens                          K’s
    Main memory (DRAM)                       hundreds                      M’s
    Secondary storage (disk)                 10,000,000’s (10s ms)         G’s
    Tertiary storage (tape)                  10,000,000,000’s (10s sec)    T’s

    Optional: How is the hierarchy managed?

    • Registers ↔ Memory
      – by the compiler (programmer?)
    • Cache ↔ Memory
      – by the hardware
    • Memory ↔ Disks
      – by the hardware and operating system (virtual memory)
      – by the programmer (files)

    Static Read-Write Memory (SRAM)

    SRAM vs. DRAM Summary

    • SRAM
      • Faster because bit lines are actively driven by the D-latch
      • Faster, simpler interface due to lack of refresh
      • Larger area for each cell, which means less memory per chip
      • Used for cache memories (and also register files), where speed/latency is key
    • DRAM
      • Slower because a passive value (charge on a capacitor) drives the bit line
      • Slower due to refresh cycles
      • Small area means much greater density of cells and thus large memories
      • Used for main memories, where density is key

    Typical SRAM Array

    Some Issues in Determining the Memory Array Organization

    • Typically we want an aspect ratio that is nearly one
    • How should the address decoding be divided between rows and columns?
    • Consider an 8K x 32 SRAM = 256 Kb = 2^18 bits, organized as 2^9 rows x 2^9 columns, as an example
      – The row decoder is a 9-to-512 decoder. Every 32 (2^5) columns form a ‘word’, and we only need to decode words. So the column decoder needs to select one of 512/32 = 16 words per row, that is, we only need a 4-to-16 column decoder
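The decoder sizes in the 8K x 32 example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `array_organization` and the rule of splitting the capacity exponent evenly between rows and columns are assumptions made for this example, not something the slides prescribe.

```python
def array_organization(num_words, word_bits):
    """Split a num_words x word_bits memory into a near-square cell array.

    Assumes (for illustration) that both arguments are powers of two.
    """
    total_bits = num_words * word_bits
    exponent = total_bits.bit_length() - 1            # log2 of the capacity
    assert 1 << exponent == total_bits, "capacity must be a power of two"

    row_addr_bits = exponent // 2                     # half of the exponent selects a row
    rows = 1 << row_addr_bits
    columns = total_bits // rows
    words_per_row = columns // word_bits              # words that share one row
    assert words_per_row >= 1, "a word must fit in one row"
    col_addr_bits = words_per_row.bit_length() - 1    # log2(words_per_row)

    return {
        "rows": rows,
        "columns": columns,
        "row_decoder": (row_addr_bits, rows),          # e.g. 9-to-512
        "column_decoder": (col_addr_bits, words_per_row),  # e.g. 4-to-16
    }


org = array_organization(8 * 1024, 32)
print(org)
```

For the 8K x 32 case the row and column address bits add up to 13, which matches the 2^13 addressable words.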

    Some Issues in Determining the Memory Array Organization (Cont.)

    • Assertion of a word line accesses all cells in a row
      – Not all bits that are read from a row may be used
      – Loading on the word line is high!
    • Bit lines connect all cells in a column; only one cell in a column can ever be ON at a time
    • We would like to keep the bit-line swing low to save power

    SRAM Cell

    Full CMOS (6-T) SRAM Cell

    • Very low standby power consumption, large noise margin, low supply voltage
    • Basic requirements for setting the (W/L) ratios:
      – The data-write operation must be capable of modifying the stored data in the SRAM cell
      – The data-read operation must not modify the stored data

    Figure: 6-T SRAM cell schematic with transistors M1 to M6

    Layout of the CMOS SRAM Cell

    Figure: a 6T SRAM cell layout and a different layout, each with transistors M1 to M6 labeled

    Static Bit Line Biasing

    Static Bit Line Biasing with Clamps

    SRAM Cell w/ Static Bitline Pull-ups

    • When the word line is not selected, RS = 0, and M3 and M4 are OFF
    • If RS = 0 for ALL rows, the bit line capacitances C and NOT-C are charged up to VDD by the pull-ups MP1 and MP2
    • Depending on the application, MP1 and MP2 are turned OFF or are kept ON during the read operation

    CMOS SRAM Cell Design Strategy

    • Consider the data-read operation with “0” stored in the cell

    Data-Read Operation

    • Conservative design constraint: V_{1,max} ≤ V_{T,2}, to keep M2 OFF during the read operation. M3 will be in saturation whereas M1 operates in the linear region:

    $$\frac{k_{n,3}}{2}\left(V_{DD}-V_1-V_{T,n}\right)^2 = \frac{k_{n,1}}{2}\left(2\left(V_{DD}-V_{T,n}\right)V_1 - V_1^2\right) \quad \text{at } V_1 = V_{T,n}$$

    $$\frac{k_{n,3}}{k_{n,1}} = \frac{(W/L)_3}{(W/L)_1} < \frac{2\left(V_{DD}-1.5\,V_{T,n}\right)V_{T,n}}{\left(V_{DD}-2\,V_{T,n}\right)^2}$$

    • With V_DD = 2.5 V and V_T,n = 0.4 V:

    $$\frac{(W/L)_3}{(W/L)_1} < \frac{2(1.9)(0.4)}{(1.7)^2} \approx 0.5 \;\Rightarrow\; \left(\frac{W}{L}\right)_3 < 0.5\left(\frac{W}{L}\right)_1$$

    • A symmetrical condition also dictates the aspect ratios of M2 and M4
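As a quick numerical check of the read constraint, the bound on (W/L)_3 / (W/L)_1 can be evaluated directly. This is a minimal sketch under the same long-channel assumptions as the slide (M3 saturated, M1 linear, V_1 = V_T,n at the limit); the function names are made up for this example.

```python
def max_access_to_pulldown_ratio(vdd, vtn):
    """Upper bound on (W/L)_3 / (W/L)_1 that keeps V1 <= V_T,n during a read."""
    return 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd - 2.0 * vtn) ** 2


def read_currents_at_limit(vdd, vtn, ratio):
    """Normalized currents of M3 (saturation) and M1 (linear) at V1 = V_T,n.

    Currents are expressed in units of k_n,1, so `ratio` is k_n,3 / k_n,1.
    """
    v1 = vtn
    i_m3 = 0.5 * ratio * (vdd - v1 - vtn) ** 2
    i_m1 = 0.5 * (2.0 * (vdd - vtn) * v1 - v1 ** 2)
    return i_m3, i_m1


bound = max_access_to_pulldown_ratio(vdd=2.5, vtn=0.4)
print(f"(W/L)_3 / (W/L)_1 < {bound:.2f}")   # prints 0.53, i.e. roughly 0.5
```

At exactly this ratio the two currents balance with V_1 sitting at V_T,n; any smaller ratio keeps V_1 below the threshold of M2.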

    Read Operation (Cont.)

    SRAM Column Read

    • Large signal sensing

    Data-Write Operation

    • Consider the write “0” operation, assuming a logic “1” is already stored in the SRAM cell

    Data-Write Operation (Cont.)

    • Design constraint: V_{1,max} ≤ V_{T,2}, so that M2 turns OFF when V_1 = V_{T,2}. M3 is in the linear region whereas M5 operates in saturation:

    $$\frac{k_{n,3}}{2}\left(2\left(V_{DD}-V_{T,n}\right)V_1 - V_1^2\right) = \frac{k_{p,5}}{2}\left(0 - V_{DD} - V_{T,p}\right)^2 \quad \text{at } V_1 = V_{T,n}$$

    $$\frac{k_{n,3}}{k_{p,5}} > \frac{\left(V_{DD}+V_{T,p}\right)^2}{2\left(V_{DD}-1.5\,V_{T,n}\right)V_{T,n}} \;\Rightarrow\; \frac{(W/L)_3}{(W/L)_5} > \frac{\mu_p}{\mu_n}\cdot\frac{\left(V_{DD}+V_{T,p}\right)^2}{2\left(V_{DD}-1.5\,V_{T,n}\right)V_{T,n}}$$

    • With V_DD = 2.5 V, V_T,n = 0.4 V, V_T,p = −0.4 V, and μ_n/μ_p = 2.25:

    $$\frac{(W/L)_3}{(W/L)_5} > \frac{1}{2.25}\cdot\frac{(2.1)^2}{2(1.9)(0.4)} \approx 1.3 \;\Rightarrow\; \left(\frac{W}{L}\right)_3 > 1.3\left(\frac{W}{L}\right)_5$$

    • A symmetrical condition also dictates the aspect ratios of M6 and M4
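The write constraint can be evaluated the same way. The sketch below assumes the long-channel model used on the slide (M3 linear, M5 saturated, V_1 = V_T,n at the limit) and treats V_T,p as a negative number; the function name is hypothetical. It also checks the 4:2 access and 4:3 pull-up sizes quoted later in these slides against the bound.

```python
def min_access_to_pullup_ratio(vdd, vtn, vtp, mu_n_over_mu_p):
    """Lower bound on (W/L)_3 / (W/L)_5 that lets a write pull V1 below V_T,n.

    vtp is the (negative) pMOS threshold voltage.
    """
    k_ratio = (vdd + vtp) ** 2 / (2.0 * (vdd - 1.5 * vtn) * vtn)  # k_n,3 / k_p,5
    return k_ratio / mu_n_over_mu_p


bound = min_access_to_pullup_ratio(vdd=2.5, vtn=0.4, vtp=-0.4, mu_n_over_mu_p=2.25)
print(f"(W/L)_3 / (W/L)_5 > {bound:.2f}")   # prints 1.29, i.e. roughly 1.3

# Sizes quoted later in the slides: access M3 = 4:2, pull-up M5 = 4:3 (W:L).
wl_m3 = 4 / 2
wl_m5 = 4 / 3
print("write constraint met:", wl_m3 / wl_m5 > bound)
```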

    Write Operation (Cont.)

    SRAM Column Write

    SRAM Sizing Summary

    • The bit lines are high during a read and should not overpower the inverters, so the nMOS pull-down transistors should be strong enough to pull them down
    • During a write, however, the bit lines need to overpower the cell, so we make the pMOS transistors weak

    Figure: 6T cell with storage nodes A and A_b, lines bit, bit_b, and word, and transistors marked weak, med, and strong

    Typical SRAM Transistor Sizes

    • Transistors may be sized as follows:
      • nMOS pull-down: M1, M2: 6:2
      • pMOS pull-up: M5, M6: 4:3
      • Access transistors: M3, M4: 4:2
    • All boundaries are shared
    • Reduces the write delay
    • One may also use equal-size transistors in the SRAM cell (e.g., 4:2 for all); however, this should be checked carefully, as this sizing is not conservative and may not work for all scenarios

    Figure: yet a different layout, showing WL, BitLine, Bit_barLine, and the cell transistors

    Example Design of a 256Kbit SRAM Array

    • 2 macro-blocks (a.k.a. 2 banks), each with 256 rows and 512 columns (total 2^18 bits)
    • Want to access a double-word (2^6 = 64 bits) at a time
    • Need 12 address lines A0,…,A11
      • The 4 LSBs (A0,…,A3) are used for column addressing, while the other 8 MSBs (A4,…,A11) are used for row addressing
    • Need an 8:256 row decoder, a 4:16 column decoder, and 16:1 Read and Write Multiplexers
    • Use 64 sense amplifiers

    Figure: block diagram with the decoder, SRAM cell macros #1 and #2, write circuit and write multiplexer, read multiplexer, sense amplifier, output buffer, DFF, column decoder, and control circuit (signals: Addr, clk, precharge, Read_write, Write_en, Sense_en, wl, wldummy, Output)
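The address split in this example can be sketched as simple bit slicing. The constant and function names below are hypothetical, chosen for this illustration; only the field widths (4 column bits in A0 to A3, 8 row bits in A4 to A11) come from the slide.

```python
ROWS = 256
COLUMNS_PER_MACRO = 512
MACROS = 2
WORD_BITS = 64

COLUMN_ADDR_BITS = 4   # A0..A3 drive the 4:16 column decoder
ROW_ADDR_BITS = 8      # A4..A11 drive the 8:256 row decoder


def split_address(addr):
    """Return (row, column_select) for a 12-bit double-word address."""
    if not 0 <= addr < 1 << (ROW_ADDR_BITS + COLUMN_ADDR_BITS):
        raise ValueError("address must fit in 12 bits")
    column_select = addr & ((1 << COLUMN_ADDR_BITS) - 1)   # 4 LSBs
    row = addr >> COLUMN_ADDR_BITS                          # 8 MSBs
    return row, column_select


total_bits = ROWS * COLUMNS_PER_MACRO * MACROS      # capacity of both macros
words = total_bits // WORD_BITS                     # number of 64-bit words
words_per_row = COLUMNS_PER_MACRO * MACROS // WORD_BITS

print(total_bits, words, words_per_row)
print(split_address(0xABC))
```

The capacity works out to 2^18 bits, 2^12 double-words, and 16 double-words per row, which is why a 16:1 column multiplexer with 64 sense amplifiers is enough.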

    Sense Amplifier

    • The bit line capacitance is significant for large arrays
    • If each cell contributes 2fF, with 256 cells per column we get 512fF plus wire capacitance. The pull-down resistance is about 15kΩ, so the RC delay will be 5.3ns (with ΔV = VDD)
    • We cannot easily change R, C, or VDD, but we can change ΔV!
    • It is possible to reliably sense ΔV’s as small as 50mV
    • With margin for noise, most SRAMs sense bit-line swings between 100 and 300mV
    • For writes, we still need to drive the bit line to full swing
    • Only one driver needs to be this big
    • Use the SPICE sweep function to optimize transistor sizes (typically, Q0, Q5, Q6 are minimum-size transistors)

    Figure: sense amplifier schematic with isolation transistors between bit / bit_bar and a regenerative amplifier (transistors Q0 to Q6, controlled by sense_en), with outputs out and out_bar
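The bit-line numbers on this slide can be checked with a short calculation. The first function reproduces the 0.69·R·C estimate. The second is an assumption added for illustration: it models the bit line as a capacitor discharging exponentially from VDD through a fixed resistance, and takes VDD = 2.5 V from the other examples in these slides, to show how much a small sensing swing shortens the wait.

```python
import math


def rc_delay(r_ohm, c_farad):
    """Classic 0.69*R*C delay estimate."""
    return 0.69 * r_ohm * c_farad


def time_to_swing(r_ohm, c_farad, delta_v, vdd):
    """Time for an RC discharge from vdd toward 0 V to drop by delta_v.

    Assumes a simple exponential V(t) = vdd * exp(-t / RC).
    """
    return -r_ohm * c_farad * math.log(1.0 - delta_v / vdd)


R_PULLDOWN = 15e3            # ohms, from the slide
C_BITLINE = 256 * 2e-15      # 256 cells x 2 fF, wire capacitance ignored
VDD = 2.5                    # volts (assumed)

print(f"0.69*R*C     = {rc_delay(R_PULLDOWN, C_BITLINE) * 1e9:.1f} ns")
for swing in (0.05, 0.1, 0.3):
    t = time_to_swing(R_PULLDOWN, C_BITLINE, swing, VDD)
    print(f"dV = {swing * 1e3:.0f} mV -> {t * 1e9:.2f} ns")
```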

    Sense Amplifier Waveforms

    Figure: waveforms of bit, bit_bar, out, out_bar, and sense_en

    Comments about the Sense Amp Design

    • Isolation transistors must be pMOS
      • Bit lines are within 0.2V of VDD (not enough to turn an nMOS transistor ON)
    • Load on the outputs of the regenerative amplifier must be equal
    • Need to precharge the sense amp before opening the isolation transistors to avoid discharging the bit lines
    • Both outputs go high during precharge
      – Usually follow the regenerative amplifier by a cross-coupled NAND latch
    • Requires 3 timing phases
      – Typically self-timed

    Figure: cross-coupled NAND latch taking Out and Out_bar and producing Data and Data_bar

    Precharge and Write Circuitry

    • Recall that M3 and M5 denote the nMOS access transistor and the pMOS pull-up transistor inside the SRAM cell (on the bit line side)
    • For a successful write operation, R3+R9+R7 should be < ½R5
    • Let R* denote the resistance of a 2:2 nMOS transistor, and let μn/μp = 2
    • If M3 = 4:2 and M5 = 4:3, then ½R5 = ¾R* and R3 = ½R*; therefore, R9+R7 should be ¼R*
    • M9 and M7 should be 16:2 each
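The resistance budget above can be verified with exact fractions. This is a rough sketch that assumes on-resistance scales as L/W, with pMOS devices a factor of μn/μp = 2 more resistive; the helper name `resistance` is made up for this example and all values are in units of R*.

```python
from fractions import Fraction


def resistance(width, length, is_pmos=False, mobility_ratio=2):
    """On-resistance in units of R* (a 2:2 nMOS), assuming R scales as L/W."""
    r = Fraction(length, width)
    return r * mobility_ratio if is_pmos else r


r3 = resistance(4, 2)                  # nMOS access transistor M3, 4:2
r5 = resistance(4, 3, is_pmos=True)    # pMOS pull-up M5, 4:3
budget = r5 / 2 - r3                   # what is left for the write path R9 + R7

r7 = resistance(16, 2)                 # write driver devices, 16:2 each
r9 = resistance(16, 2)

print("R3 =", r3, " R5/2 =", r5 / 2, " budget for R9+R7 =", budget)
print("R9+R7 =", r7 + r9)
```

With 16:2 devices the sum R9+R7 lands exactly on the ¼R* budget, so any wider driver satisfies the strict inequality.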

    Row Decoder

    • Another example of pre-decoding addresses: decode in octal addresses
    • One-level decoding of a 9-bit address (A8 A7 … A0) requires 512 nine-input AND gates
    • Predecode (A2 A1 A0), (A5 A4 A3), and (A8 A7 A6) by using 3*2^3 = 24 three-input NAND gates, followed by 8^3 = 512 three-input NOR gates

    Figure: two implementations of a 4:16 decoder
    • Requires 16 four-input AND gates
    • Requires 8+16 = 24 two-input NAND/NOR gates
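The predecoding idea can be modeled behaviorally: each 3-bit address group is first decoded to a one-hot octal digit, and each word line is the AND of one line from every group. The sketch below is a functional model only (it does not model the NAND/NOR polarity), and the function names are assumptions made for this example.

```python
def one_hot(value, width):
    """One-hot decode of a `width`-bit value into 2**width lines."""
    return [1 if i == value else 0 for i in range(1 << width)]


def predecoded_word_lines(addr, group_bits=3, groups=3):
    """Word-line vector for an address, built from predecoded groups."""
    mask = (1 << group_bits) - 1
    # Stage 1: predecode each address group (3 groups x 8 lines = 24 gates).
    predecoded = [one_hot((addr >> (g * group_bits)) & mask, group_bits)
                  for g in range(groups)]
    # Stage 2: each row combines one predecoded line per group (512 gates).
    word_lines = []
    for row in range(1 << (group_bits * groups)):
        select = 1
        for g in range(groups):
            select &= predecoded[g][(row >> (g * group_bits)) & mask]
        word_lines.append(select)
    return word_lines


def gate_counts(group_bits=3, groups=3):
    """(predecode gates, final-stage gates) for the two-level decoder."""
    return groups * (1 << group_bits), 1 << (group_bits * groups)


wl = predecoded_word_lines(0o725)
print(wl.index(1) == 0o725, sum(wl), gate_counts())
```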

    4-to-1 Tree Multiplexer for Read Circuitry

    Figure: pMOS tree multiplexer selecting among bit lines BL0 to BL3

    Column Mux

    • We have 16 read_mux and 16 write_mux transistors in parallel
    • During a read operation, one of the read MUX’s is selected according to the values of A0,…,A3; the selected column is connected to the sense amplifier, and the value of the corresponding SRAM cell is read at the sense amplifier output
    • During a write operation, the desired SRAM cell is selected and the data is written into the corresponding SRAM cell
    • Need to replicate the drawing for the bit_bar side
    • Need a total of 64 similar structures, which makes a 16:1, 64-bit wide column MUX

    Figure: one bit-line column path showing precharge, the SRAM cell, the write mux (Write-en, data), the read mux and sense amplifier (Sense-en), the 8:256 row decoder (A4 to A11), and the 4:16 column decoder (A0 to A3), with transistor widths annotated in λ

    SRAM Array Floor plan

    Figure: floor plan showing the decoder, SRAM cell macros 1 and 2 (r rows x c columns each), precharge, the 16:1 column mux, sense amplifiers, output buffers 1 and 2, output flip-flops, and control, with block dimensions annotated

    Example Read Delay Calculation for an SRAM Array

    • Consider a 256×512 SRAM core. Bit lines are precharged to V_DD = 2.5V before each read operation. A read operation is complete when the bit line has discharged by 0.25V. A memory cell can provide 0.25mA of pull-down current to discharge the bit line. Assume the word line resistance is 2Ω per memory cell, the word line capacitance is 20fF per memory cell, and the bit line capacitance is 12fF per cell. Ignore the bit-line resistance and the read-mux transistor. Calculate the worst-case read delay for this SRAM. Assume row decoding takes 3ns while the sense amplifier and output buffer take 1ns.
    • Solution: Each word line drives 512 SRAM cells. The RC delay for driving the furthest cell (N = 512) is:

    $$t_{row} = 0.69\sum_{k=1}^{N}\sum_{j=1}^{k} R_j C_k = 0.69\,R_{cell}\,C_{cell}\,\frac{N(N+1)}{2} = 0.69 \times 256 \times 513 \times 20\,\text{fF} \times 2\,\Omega \approx 3.6\,\text{ns}$$

    • The time needed to discharge the bit or bit_bar line by 250mV is:

    $$t_{col} = \frac{C_{col}\,\Delta V}{I_{dis}} = \frac{256 \times 12\,\text{fF} \times 0.25\,\text{V}}{0.25\,\text{mA}} \approx 3.1\,\text{ns}$$

    $$t_{access} = t_{dec} + t_{row} + t_{col} + t_{sen/buf} \approx 3 + 3.6 + 3.1 + 1 = 10.7\,\text{ns}$$
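The read-delay example can be evaluated directly in a few lines. The sketch below simply plugs the stated per-cell values into the distributed-RC word-line expression and the constant-current bit-line discharge expression; the function names are hypothetical, and results may differ in the last digit from hand-rounded figures.

```python
def wordline_delay(n_cells, r_cell, c_cell):
    """0.69 * sum over a distributed RC line: 0.69 * R * C * N * (N + 1) / 2."""
    return 0.69 * r_cell * c_cell * n_cells * (n_cells + 1) / 2


def bitline_discharge_time(n_cells, c_cell, delta_v, i_cell):
    """Time for a constant cell current to move the bit line by delta_v."""
    return n_cells * c_cell * delta_v / i_cell


t_dec = 3e-9          # row decoding, given
t_sense_buf = 1e-9    # sense amplifier plus output buffer, given

t_row = wordline_delay(n_cells=512, r_cell=2.0, c_cell=20e-15)
t_col = bitline_discharge_time(n_cells=256, c_cell=12e-15,
                               delta_v=0.25, i_cell=0.25e-3)
t_access = t_dec + t_row + t_col + t_sense_buf

for name, value in (("t_row", t_row), ("t_col", t_col), ("t_access", t_access)):
    print(f"{name} = {value * 1e9:.2f} ns")
```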

    SRAM Scaling Challenges

    • For cell stability, separate power rails for the cell array vs. the word line driver may be needed (bad for leakage)
    • Reduced read and write margins as we scale voltages
    • Increased transistor leakage (high-k gate dielectric)
    • Introduction of various power management modes:
      • Reduced VDD
      • Raised VSS
    • Soft error immunity
    • Low standby power

    Optional: Leakage Currents in the SRAM Cell

    Figure: 6T cell (M1 to M6) storing “0” and “Vdd” on its internal nodes, with the leakage paths I_sub2, I_sub3, I_sub5, and I_gate1 marked

    $$I_{leak} = I_{gate1} + I_{sub3} + I_{sub2} + I_{sub5} \approx I_{sub3} + \left(I_{sub2} + I_{sub5}\right) = I_{leak,bitline} + I_{leak,cell}$$

    • Note that I_leak is dominated by the drain-source leakage in 90nm CMOS technology (i.e., we may ignore gate leakage and other leakage mechanisms, which are small compared to the sub-threshold conduction currents)

    Optional: Bitcell Stability Failures

    • Read Access Failure
      • The WL activation period is too short for a pre-specified ΔV to develop between the bit line and the bit_bar line in order to trigger the sense amplifier correctly during a read
        – This may occur due to an increase in Vt for the pass-gate or pull-down transistors
    • Read Stability Failure
      • The cell may flip when the “0” storage node rises above the trip voltage of the other inverter during a read
        – To quantify the bitcell's robustness against this failure, SNM is the most commonly used metric
        – Notice that read stability failure can occur any time the WL is enabled, even if the bitcell is not accessed for read or write operations
        – SNM-related failures are the limiter for VDD scaling, especially after accounting for device degradation due to hot electron effects and negative bias temperature instability

    Optional: Read Stability Failure in the SRAM Cell

    • A read stability (a.k.a. “hold”) failure occurs when stored data flips during the memory standby mode (while WL is enabled)
    • A cell’s VTC is composed of the two inverters’ VTCs, which enclose two regions
    • The cell’s hold stability is characterized by the static noise margin (SNM), which is measured by the diagonal length of the largest square fitted in the enclosed region (the derivation is omitted)
    • The SNM butterfly curves must be analyzed for different process corners (FS: fast NMOS, slow PMOS; SF: slow NMOS, fast PMOS) and different temperatures

    Figure: SNM butterfly plot showing the ideal VTC, the actual VTC, and the SNM squares; axes Vout,L / Vin,R and Vout,R / Vin,L, each from 0 to 1

    Optional: Bitcell Stability Failures (Cont.)

    • Write Stability (or write-ability) Failure
      • The internal “1” storage node may not be reduced below the trip point of the other inverter during the WL activation period
      • One way to quantify a cell's write stability is to use the write trip voltage or write margin (WM), which is the maximum bit line voltage at which the bitcell flips state (assuming that the bit line is pulled to GND by the line driver)
    • Data Retention Failure
      • When VDD is reduced to the Data Retention Voltage, all six transistors in the SRAM cell operate in the subthreshold region; hence, they show strong sensitivity to variations
      • The PMOS transistor must provide enough current to compensate for leakage in the NMOS pull-down and access transistors
      • Due to L and VT variations, the data retention current may not be sufficient to compensate for the leakage current

    Optional: Minimum Voltage Needed to Preserve Data

    • The Data Retention Voltage (DRV) is defined as the minimum VDD under which the data in an SRAM cell is still preserved
    • When VDD is reduced to DRV, all transistors are in the sub-threshold region, thus SRAM data retention strongly depends on the sub-VT current conduction behavior (i.e., leakage)
    • Cell leakage is greatly reduced at DRV
    • This provides a highly effective leakage suppression scheme for standby mode
      – Maximum leakage saving and minimum design overhead

    Figure: distribution of DRV in a 0.13µm CMOS process with 3σ variations in VT and L
    Figure: measured SRAM leakage current

    Optional: DRV of SRAM (Cont.)

    • When VDD scales down to DRV, the Voltage Transfer Curves (VTC) of the internal inverters degrade to such a level that the Static Noise Margin (SNM) of the SRAM cell is reduced to zero
    • DRV condition:

    $$\left.\frac{\partial V_1}{\partial V_2}\right|_{\text{left inverter}} = \left.\frac{\partial V_1}{\partial V_2}\right|_{\text{right inverter}}, \quad \text{when } V_{DD} = \text{DRV}$$

    • The temperature coefficient of DRV is 0.169mV/°C, which implies an increase of 12.3mV in DRV when the temperature rises from 27°C to 100°C

    Figure: VTCs of the SRAM cell inverters (VTC1, VTC2) versus V1 (V), for VDD = 0.4V and VDD = 0.18V
    Figure: 6T cell (M1 to M6) with internal nodes V1 and V2 and the standby leakage current paths

    Optional: Soft Error Rate for the SRAM Cell

    • A high-energy alpha particle or an atmospheric neutron striking a capacitive node
      • Deposits charge, leading to a time-varying current injection at the node
    • In the case of atmospheric neutrons:

    $$I(Q,t) = \frac{2}{\sqrt{\pi}}\,\frac{Q}{T_s}\sqrt{\frac{t}{T_s}}\,\exp\!\left(-\frac{t}{T_s}\right)$$

    • If the collected charge Q_s exceeds some critical charge level Q_crit, it will upset the bit value and cause a soft error
    • Soft Error Rate (SER) in SRAM:

    $$SER \propto \exp\!\left(-\frac{Q_{crit}}{Q_s}\right)$$

    • Q_crit is 10fC in a 65nm CMOS process

    Figure: injected current I(Q,t) in µA (0 to 140) versus time in ps (0 to 200)
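The injected-current pulse for a neutron strike can be explored numerically. The sketch below assumes the pulse has the form I(Q,t) = (2/√π)(Q/T_s)√(t/T_s)·exp(−t/T_s), which integrates to the deposited charge Q. The values Q = 10 fC and T_s = 20 ps are placeholders picked for illustration, not data from the slides.

```python
import math


def strike_current(q, t, t_s):
    """Injected current at time t for deposited charge q and time constant t_s."""
    if t <= 0.0:
        return 0.0
    x = t / t_s
    return (2.0 / math.sqrt(math.pi)) * (q / t_s) * math.sqrt(x) * math.exp(-x)


def collected_charge(q, t_s, t_end, steps=40000):
    """Trapezoidal integral of the current pulse from 0 to t_end."""
    dt = t_end / steps
    total = 0.0
    for i in range(steps):
        total += 0.5 * (strike_current(q, i * dt, t_s)
                        + strike_current(q, (i + 1) * dt, t_s)) * dt
    return total


Q = 10e-15       # deposited charge in coulombs (hypothetical)
T_S = 20e-12     # pulse time constant in seconds (hypothetical)

charge = collected_charge(Q, T_S, t_end=40 * T_S)
peak = strike_current(Q, T_S / 2, T_S)   # the pulse peaks at t = T_s / 2
print(f"integrated charge = {charge * 1e15:.3f} fC, peak = {peak * 1e6:.0f} uA")
```

Integrating the pulse recovers the deposited charge, which is the quantity compared against Q_crit when deciding whether the strike flips the cell.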

    Optional: How to Mitigate the SER Fail Rate

    • To mitigate soft errors, several radiation-hardening techniques can be implemented
      • Process technology changes (e.g., SOI technology)
      • Circuit design (e.g., adding capacitance, using larger transistors, interleaving memory words)
      • Architecture (e.g., parity, error correction codes)
    • Good news: the SER per bit tends to decrease with scaling
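As a small illustration of the architecture-level option, a single even-parity bit per word is enough to detect (but not correct) any single-bit upset. The helper names below are hypothetical and the 8-bit word is an arbitrary example.

```python
def even_parity_bit(word):
    """Parity bit that makes the total number of 1s (word + parity) even."""
    return bin(word).count("1") & 1


def store(word):
    """Return the (word, parity) pair that would be written to memory."""
    return word, even_parity_bit(word)


def is_consistent(word, parity):
    """True if the stored parity still matches the stored word."""
    return even_parity_bit(word) == parity


word, parity = store(0b1011_0010)
upset = word ^ (1 << 5)          # a soft error flips bit 5

print(is_consistent(word, parity))    # True: no error
print(is_consistent(upset, parity))   # False: single-bit upset detected
```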