Lec4a Supp


  • 7/29/2019 Lec4a Supp

    1/43

    Chapter 4A: The Processor

    Mary Jane Irwin ( www.cse.psu.edu/~mji )

    [Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy]

    CSE431 Chapter 4A.1 Irwin, PSU, 2008

    Review: MIPS (RISC) Design Principles

    Simplicity favors regularity
    - fixed size instructions
    - opcode always the first 6 bits

    Smaller is faster
    - limited instruction set
    - limited number of registers in register file
    - limited number of addressing modes

    Make the common case fast
    - allow instructions to contain immediate operands

    Good design demands good compromises
    - three instruction formats

    The Processor: Datapath & Control

    - memory-reference instructions: lw, sw
    - arithmetic-logical instructions: add, sub, and, or, slt
    - control flow instructions: beq, j

    Generic implementation
    - use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
    - decode the instruction (and read registers)
    - execute the instruction

    [Figure: Fetch (PC = PC+4) -> Decode -> Exec loop]

    All instructions (except j) use the ALU after reading the registers

    How? memory-reference? arithmetic? control flow?

    Aside: Clocking Methodologies

    The clocking methodology defines when data in a state element is valid and stable relative to the clock
    - State elements: a memory element such as a register
    - Edge-triggered: all state changes occur on a clock edge

    Typical execution
    - read contents of state elements -> send values through combinational logic -> write results to one or more state elements

    [Figure: State element -> Combinational logic -> State element, both state elements driven by the clock; the path takes one clock cycle]

    Assumes state elements are written on every clock cycle; if not, need explicit write control signal
    - write occurs only when both the write control is asserted and the clock edge occurs
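The write rule above can be sketched in a few lines. This is a minimal illustration only, assuming a hypothetical `Register` class (not from the slides) that models one edge-triggered state element with an explicit write control signal.

```python
class Register:
    """Edge-triggered state element with an explicit write control input."""

    def __init__(self, value=0):
        self.value = value    # stored value, stable between clock edges
        self._next = value    # value presented at the data input
        self._write = False   # write control signal for the coming edge

    def set_input(self, data, write_enable=True):
        # Combinational logic drives the data input during the cycle.
        self._next = data
        self._write = write_enable

    def clock_edge(self):
        # State changes only on the clock edge, and only if write is asserted.
        if self._write:
            self.value = self._next
        self._write = False


pc = Register(0)
pc.set_input(pc.value + 4)              # written every cycle
pc.clock_edge()

reg = Register(7)
reg.set_input(99, write_enable=False)   # write control not asserted
reg.clock_edge()                        # value is left unchanged
```

An element that is written every cycle, such as the PC, can simply leave `write_enable` at its default.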

    Fetching Instructions

    - reading the instruction from the Instruction Memory
    - updating the PC value to be the address of the next instruction (PC = PC+4)

    [Figure: PC supplies the Read Address of the Instruction Memory, which outputs the Instruction; an adder computes PC + 4]

    - PC is updated every clock cycle, so it does not need an explicit write control signal, just a clock signal
    - Reading from the Instruction Memory is a combinational activity, so it doesn't need an explicit read control signal
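As a rough sketch of the fetch step, assume the Instruction Memory is modelled as a Python list of 32-bit words indexed by `pc // 4`. The function name `fetch` and the sample memory contents are illustrative assumptions, not part of the slides.

```python
def fetch(instruction_memory, pc):
    """Return (instruction, next_pc) for a byte-addressed PC."""
    instruction = instruction_memory[pc // 4]   # combinational read, no read signal
    next_pc = (pc + 4) & 0xFFFFFFFF             # PC = PC + 4, kept to 32 bits
    return instruction, next_pc


# Two sample words: lw $1, 0($0) and add $3, $1, $2
imem = [0x8C010000, 0x00221820]
instr, next_pc = fetch(imem, 4)
```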

    Decoding Instructions

    - sending the fetched instruction's opcode and function field bits to the control unit
    and
    - reading two values from the Register File
      - Register File addresses are contained in the instruction

    [Figure: the Instruction feeds the Control Unit and the Register File (inputs Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2)]

    Executing R Format Operations

    R format operations (add, sub, and, or, slt)

    R-type:
    bits    31-26   25-21   20-16   15-11   10-6    5-0
    field   op      rs      rt      rd      shamt   funct

    - perform operation (op and funct) on values in rs and rt
    - store the result back into the Register File (into location rd)

    [Figure: Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data; Read Data 1, Read Data 2) feeding the ALU, with ALU control and RegWrite signals and ALU outputs overflow and zero]

    - Note that the Register File is not written every cycle (e.g., sw), so we need an explicit write control signal for the Register File

    Executing Load and Store Operations

    - compute memory address by adding the base register (read from the Register File during decode) to the 16-bit sign-extended offset field in the instruction
    - store value (read from the Register File during decode) written to the Data Memory
    - load value, read from the Data Memory, written to the Register File

    [Figure: Register File, Sign Extend (16 to 32 bits), ALU, and Data Memory (Address, Write Data, Read Data), with control signals RegWrite, ALU control, MemWrite, and MemRead]
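The address calculation for lw and sw can be sketched as follows. The helper names `sign_extend16` and `effective_address` are illustrative assumptions; the arithmetic is the base-plus-sign-extended-offset rule described for loads and stores.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(base, imm16):
    """Base register value plus sign-extended offset, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


addr_neg = effective_address(0x1000, 0xFFFC)   # offset field encodes -4
addr_pos = effective_address(0x1000, 0x0008)   # offset field encodes +8
```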

    Executing Branch Operations

    - compare the operands read from the Register File during decode for equality (zero ALU output)
    - compute the branch target address by adding the updated PC to the 16-bit sign-extended offset field in the instruction, shifted left by 2 bits

    [Figure: an adder combines PC + 4 with the sign-extended (16 to 32 bits), shifted-left-2 offset to form the branch target address; the ALU compares Read Data 1 and Read Data 2 and sends its zero output to the branch control logic]
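A small sketch of the beq next-PC decision, under the assumption that the "updated PC" is PC + 4 and that the equality test is a 32-bit subtraction checked for zero. The function names are hypothetical.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """Updated PC (PC + 4) plus the sign-extended offset shifted left by 2."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Take the branch only when the ALU subtraction yields zero."""
    zero = ((rs_val - rt_val) & 0xFFFFFFFF) == 0
    return branch_target(pc, imm16) if zero else (pc + 4) & 0xFFFFFFFF


taken = next_pc_beq(0x100, 5, 5, 3)        # equal operands, offset +3 words
not_taken = next_pc_beq(0x100, 5, 6, 3)    # unequal operands
```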

    Executing Jump Operations

    - replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits

    [Figure: the low 26 bits of the instruction read from the Instruction Memory are shifted left 2 to form the 28-bit jump address]
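The jump address formation can be sketched like this. It assumes the upper four bits come from PC + 4, as the later datapath figure labels them (PC+4[31-28]); the name `jump_target` is illustrative.

```python
def jump_target(pc, instr):
    """Upper 4 bits of PC + 4 joined with the 26-bit target shifted left by 2."""
    target26 = instr & 0x03FFFFFF
    return ((pc + 4) & 0xF0000000) | (target26 << 2)


# j with a 26-bit target field of 0x10, fetched from address 0x10000000
new_pc = jump_target(0x10000000, 0x08000010)
```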

    Creating a Single Datapath from the Parts

    Assemble the datapath segments and add control lines and multiplexors as needed

    Single cycle design: fetch, decode, and execute each instruction in one clock cycle
    - no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., separate Instruction Memory and Data Memory, several adders)
    - multiplexors needed at the input of shared elements with control lines to do the selection
    - write signals to control writing to the Register File and Data Memory

    Cycle time is determined by length of the longest path

    Fetch, R, and Memory Access Portions

    [Figure: combined datapath with PC, Instruction Memory, PC + 4 adder, Register File, Sign Extend, ALU (ovf and zero outputs), and Data Memory; control signals RegWrite, ALUSrc, ALU control, MemWrite, MemRead, and MemtoReg]

    Adding the Control

    Selecting the operations to perform (ALU, Register File, and Memory read/write)

    Controlling the flow of data (multiplexor inputs)

    R-type:  op [31-26] | rs [25-21] | rt [20-16] | rd [15-11] | shamt [10-6] | funct [5-0]
    I-type:  op [31-26] | rs [25-21] | rt [20-16] | address offset [15-0]
    J-type:  op [31-26] | target address [25-0]

    Observations
    - op field always in bits 31-26
    - addr of registers to be read are always specified by the rs field (bits 25-21) and rt field (bits 20-16); for lw and sw rs is the base register
    - addr. of register to be written is in one of two places: in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions
    - offset for beq, lw, and sw always in bits 15-0
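Because the field positions are fixed, decoding is just shifting and masking. The sketch below extracts every field from a 32-bit instruction word; the function name `decode_fields` and the dictionary keys are assumptions made for illustration.

```python
def decode_fields(instr):
    """Split a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "op": (instr >> 26) & 0x3F,        # bits 31-26
        "rs": (instr >> 21) & 0x1F,        # bits 25-21
        "rt": (instr >> 16) & 0x1F,        # bits 20-16
        "rd": (instr >> 11) & 0x1F,        # bits 15-11 (R-type)
        "shamt": (instr >> 6) & 0x1F,      # bits 10-6  (R-type)
        "funct": instr & 0x3F,             # bits 5-0   (R-type)
        "offset": instr & 0xFFFF,          # bits 15-0  (I-type)
        "target": instr & 0x03FFFFFF,      # bits 25-0  (J-type)
    }


add_fields = decode_fields(0x00221820)   # add $3, $1, $2
lw_fields = decode_fields(0x8C010000)    # lw $1, 0($0)
```

All fields are extracted unconditionally, which mirrors the hardware: the wires are always there, and the control signals decide which ones are used.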

    Single Cycle Datapath with Control Unit

    [Figure: single-cycle datapath. The Control Unit decodes Instr[31-26] and drives RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; PCSrc selects between PC + 4 and the branch target. Instr[25-21] and Instr[20-16] address the Register File reads, Instr[20-16] or Instr[15-11] selects the write register, Instr[15-0] feeds Sign Extend (16 to 32 bits), and Instr[5-0] feeds the ALU control.]

    R-type Instruction Data/Control Flow

    [Figure: the single-cycle datapath with control unit, showing the data and control flow for an R-type instruction]

    Load Word Instruction Data/Control Flow

    [Figure: the single-cycle datapath with control unit, showing the data and control flow for a load word instruction]


    Branch Instruction Data/Control Flow

    [Figure: the single-cycle datapath with control unit, showing the data and control flow for a branch instruction]


    Adding the Jump Operation

    [Figure: single-cycle datapath extended for jump. Instr[25-0] is shifted left 2 (26 to 28 bits) and joined with PC+4[31-28] to form the 32-bit jump address; a Jump control signal from the Control Unit selects it as the next PC.]


    Instruction Critical Paths

    Assume negligible delays for muxes, control unit, sign extend, PC access, shift left 2, wires, setup and hold times, except:
    - Instruction and Data Memory (200 ps)
    - ALU and adders (200 ps)
    - Register File access, reads or writes (100 ps)

    Delays in ps:

    Instr.    I Mem   Reg Rd   ALU Op   D Mem   Reg Wr   Total
    R-type    200     100      200              100      600
    load      200     100      200      200     100      800
    store     200     100      200      200              700
    beq       200     100      200                       500
    jump      200                                        200
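The totals in the table are sums of the unit delays along each instruction's path, and the single-cycle clock must cover the largest of them. A short sketch, with dictionary names chosen here for illustration:

```python
# Unit delays in picoseconds, taken from the table above.
DELAY_PS = {"I Mem": 200, "Reg Rd": 100, "ALU Op": 200, "D Mem": 200, "Reg Wr": 100}

# Functional units on the critical path of each instruction class.
PATHS = {
    "R-type": ["I Mem", "Reg Rd", "ALU Op", "Reg Wr"],
    "load": ["I Mem", "Reg Rd", "ALU Op", "D Mem", "Reg Wr"],
    "store": ["I Mem", "Reg Rd", "ALU Op", "D Mem"],
    "beq": ["I Mem", "Reg Rd", "ALU Op"],
    "jump": ["I Mem"],
}

totals = {name: sum(DELAY_PS[unit] for unit in units) for name, units in PATHS.items()}

# The single-cycle clock period must accommodate the slowest instruction.
cycle_time_ps = max(totals.values())
```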

    Single Cycle Disadvantages & Advantages

    Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate the slowest instruction
    - especially problematic for more complex instructions like floating point multiply

    [Figure: lw fills Cycle 1; sw occupies only part of Cycle 2, and the rest of that cycle is waste]

    May be wasteful of area since some functional units (e.g., adders) must be duplicated since they cannot be shared during a clock cycle

    but

    Is simple and easy to understand

    How Can We Make It Faster?

    Start fetching and executing the next instruction before the current one has completed
    - Pipelining: (all?) modern processors are pipelined for performance
    - Remember the performance equation: CPU time = CPI * CC * IC

    Under ideal conditions and with a large number of instructions, the speedup from pipelining is approximately equal to the number of pipe stages
    - A five stage pipeline is nearly five times faster because the CC is nearly five times faster

    Fetch (and execute) more than one instruction at a time
    - Superscalar processing: stay tuned

    The Five Stages of Load Instruction

            Cycle 1   Cycle 2   Cycle 3   Cycle 4   Cycle 5
    lw      IFetch    Dec       Exec      Mem       WB

    IFetch: Instruction Fetch and Update PC
    Dec: Registers Fetch and Instruction Decode
    Exec: Execute R-type; calculate memory address
    Mem: Read/write the data from/to the Data Memory
    WB: Write the result data into the register file

    A Pipelined MIPS Processor

    Start the next instruction before the current one has completed
    - improves throughput: total amount of work done in a given time
    - instruction latency (execution time, delay time, response time: time from the start of an instruction to its completion) is not reduced

              Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5  Cycle 6  Cycle 7
    lw        IFetch   Dec      Exec     Mem      WB
    sw                 IFetch   Dec      Exec     Mem      WB
    R-type                      IFetch   Dec      Exec     Mem      WB

    - clock cycle (pipeline stage time) is limited by the slowest stage
    - some stages don't need the whole clock cycle (e.g., WB)
    - for some instructions, some stages are wasted cycles (i.e., nothing is done during that cycle for that instruction)
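The staggered timing of a stall-free pipeline follows a simple rule: instruction i (counting from 0) occupies stage s in cycle i + s + 1. The sketch below builds that schedule and renders it as text; `schedule` and `render` are hypothetical helper names, and hazards are ignored.

```python
STAGES = ["IFetch", "Dec", "Exec", "Mem", "WB"]


def schedule(instructions):
    """Map each instruction to the 1-based cycle in which it occupies each stage."""
    return [
        (name, {stage: i + s + 1 for s, stage in enumerate(STAGES)})
        for i, name in enumerate(instructions)
    ]


def render(instructions):
    """Draw one text row per instruction, one column per clock cycle."""
    total_cycles = len(instructions) + len(STAGES) - 1
    lines = []
    for name, cycles in schedule(instructions):
        by_cycle = {cycle: stage for stage, cycle in cycles.items()}
        cells = [by_cycle.get(c, "").ljust(7) for c in range(1, total_cycles + 1)]
        lines.append(name.ljust(7) + " ".join(cells).rstrip())
    return "\n".join(lines)


plan = schedule(["lw", "sw", "R-type"])
print(render(["lw", "sw", "R-type"]))
```

Three instructions finish in 3 + 5 - 1 = 7 cycles, even though each one still spends five cycles in the pipeline.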

    Single Cycle versus Pipeline

    Single Cycle Implementation (CC = 800 ps):
    [Figure: lw fills Cycle 1; sw occupies part of Cycle 2, and the rest of that cycle is waste]

    Pipeline Implementation (CC = 200 ps):
    lw        IFetch   Dec      Exec     Mem      WB
    sw                 IFetch   Dec      Exec     Mem      WB
    R-type                      IFetch   Dec      Exec     Mem      WB

    To complete an entire instruction in the pipelined case takes 1000 ps (as compared to 800 ps for the single cycle case). Why?

    How long does each take to complete 1,000,000 adds?
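One way to work through both questions, assuming an ideal five-stage pipeline with no stalls: every stage takes a full 200 ps cycle, so one instruction needs 5 × 200 = 1000 ps, but after the pipeline fills, one instruction completes per cycle. The function names below are illustrative.

```python
def single_cycle_time_ps(n_instructions, cc_ps=800):
    """Each instruction takes one long clock cycle."""
    return n_instructions * cc_ps


def pipelined_time_ps(n_instructions, stages=5, cc_ps=200):
    """Fill the pipeline once, then complete one instruction per cycle."""
    return (stages + n_instructions - 1) * cc_ps


one_instr_latency = pipelined_time_ps(1)      # 5 stages x 200 ps
single = single_cycle_time_ps(1_000_000)      # 800 ps per add
pipelined = pipelined_time_ps(1_000_000)      # 1,000,004 cycles of 200 ps
speedup = single / pipelined
```

Under these assumptions the speedup is just under 4, not 5, because the pipelined clock is set by the slowest stage (200 ps) rather than by one fifth of the 800 ps single-cycle period.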

    MIPS Pipeline Datapath Additions/Mods

    [Figure: pipelined datapath with stages IF:IFetch, ID:Dec, EX:Execute, MEM:MemAccess, and WB:WriteBack, separated by the state registers IF/ID, ID/EX, EX/MEM, and MEM/WB, all driven by the System Clock]

    MIPS Pipeline Control Path Modifications

    All control signals can be determined during Decode
    - and held in the state registers between pipeline stages

    [Figure: pipelined datapath with a Control unit in the decode stage; control signals RegDst, ALUOp, ALUSrc, Branch, MemRead, MemWrite, MemtoReg, RegWrite, and PCSrc are carried through the pipeline state registers to the stage that uses them]

    Pipeline Control

    IF Stage: read Instr Memory (always asserted) and write PC (on System Clock)

              EX Stage                           MEM Stage                    WB Stage
    Instr.    RegDst  ALUOp1  ALUOp0  ALUSrc     Brch  MemRead  MemWrite     RegWrite  MemtoReg
    R         1       1       0       0          0     0        0            1         0
    lw        0       0       0       1          0     1        0            1         1
    sw
    beq       X       0       1       0          1     0        0            0         X
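A rough sketch of how the control values can be grouped by the stage that consumes them, so that each group travels through the pipeline state registers until needed. Only the rows with values in the table (R, lw, beq) are encoded; the sw row is left out because the source gives no values for it. The names `SIGNALS`, `CONTROL`, and `control_for` are assumptions for illustration, and `None` stands for a don't-care (X).

```python
SIGNALS = ("RegDst", "ALUOp1", "ALUOp0", "ALUSrc",
           "Brch", "MemRead", "MemWrite",
           "RegWrite", "MemtoReg")

X = None  # don't-care

CONTROL = {
    "R":   (1, 1, 0, 0, 0, 0, 0, 1, 0),
    "lw":  (0, 0, 0, 1, 0, 1, 0, 1, 1),
    "beq": (X, 0, 1, 0, 1, 0, 0, 0, X),
}


def control_for(kind):
    """Return the control signals for one instruction, grouped by pipeline stage."""
    values = dict(zip(SIGNALS, CONTROL[kind]))
    return {
        "EX": {name: values[name] for name in SIGNALS[:4]},
        "MEM": {name: values[name] for name in SIGNALS[4:7]},
        "WB": {name: values[name] for name in SIGNALS[7:]},
    }


lw_control = control_for("lw")
```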

    Graphically Representing MIPS Pipeline

    [Figure: one instruction drawn as the sequence IM, Reg, ALU, DM, Reg]

    - How many cycles does it take to execute this code?
    - What is the ALU doing during cycle 4?
    - Is there a hazard, why does it occur, and how can it be fixed?

    Why Pipeline? For Performance!

    [Figure: pipeline diagram with time (clock cycles) across and instruction order down; successive instructions (Inst 1, Inst 2, Inst 3, Inst 4) each pass through IM, Reg, ALU, DM, Reg, offset by one cycle]

    Once the pipeline is full, one instruction is completed every cycle, so CPI = 1

    Can Pipelining Get Us Into Trouble?

    Yes: Pipeline Hazards
    - structural hazards: attempt to use the same resource by two different instructions at the same time
    - data hazards: attempt to use data before it is ready
      - An instruction's source operand(s) are produced by a prior instruction still in the pipeline
    - control hazards: attempt to make a decision about program control flow before the condition has been evaluated and the new PC target address calculated
      - branch and jump instructions, exceptions

    Can usually resolve hazards by waiting
    - pipeline control must detect the hazard
    - and take action to resolve hazards

    A Single Memory Would Be a Structural Hazard

    [Figure: pipeline diagram with time (clock cycles) across and instruction order down; each instruction passes through Mem, Reg, ALU, Mem, Reg. In one cycle an earlier instruction is reading data from memory while a later instruction is reading its instruction from memory.]

    Fix with separate instr and data memories (I$ and D$)

    How About Register File Access?

    [Figure: pipeline diagram with time (clock cycles) across and instruction order down: an add that writes $1, then Inst 1, Inst 2, and an add that reads $1; each passes through IM, Reg, ALU, DM, Reg. One clock edge controls register writing; the other clock edge controls loading of the pipeline state registers.]

    Fix register file access hazard by doing reads in the second half of the cycle and writes in the first half

    Register Usage Can Cause Data Hazards

    [Figure: pipeline diagram, instruction order down, each instruction passing through IM, Reg, ALU, DM, Reg]

    (an instruction that writes $1)
    sub $4,$1,$5
    and $6,$1,$7
    or  $8,$1,$9
    xor $4,$1,$5

    Read before write data hazard


    Loads Can Cause Data Hazards

    [Figure: pipeline diagram, instruction order down, each instruction passing through IM, Reg, ALU, DM, Reg]

    (a load that writes $1)
    sub $4,$1,$5
    and $6,$1,$7
    or  $8,$1,$9
    xor $4,$1,$5

    Load-use data hazard
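Detecting these hazards amounts to comparing each instruction's destination register with the source registers of the instructions close behind it. The sketch below assumes no forwarding and a register file that writes in the first half of the cycle and reads in the second, so only consumers one or two instructions behind the producer are at risk. The tuple layout and the name `find_data_hazards` are illustrative, and the first instruction's sources are left empty because they are not given.

```python
def find_data_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) for each read-before-write.

    program is a list of (text, dest_register, source_registers) tuples.
    """
    hazards = []
    for i, (_, dest, _) in enumerate(program):
        if dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            _, later_dest, later_sources = program[j]
            if dest in later_sources:
                hazards.append((i, j, dest))
            if later_dest == dest:
                break  # a newer write to the same register takes over
    return hazards


program = [
    ("writes $1",    "$1", []),            # producer; sources not given
    ("sub $4,$1,$5", "$4", ["$1", "$5"]),
    ("and $6,$1,$7", "$6", ["$1", "$7"]),
    ("or $8,$1,$9",  "$8", ["$1", "$9"]),
    ("xor $4,$1,$5", "$4", ["$1", "$5"]),
]
hazards = find_data_hazards(program)
```

With these assumptions, `sub` and `and` read $1 too early, while `or` reads it in the same cycle it is written and `xor` reads it afterwards.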

    Branch Instructions Cause Control Hazards

    [Figure: pipeline diagram, instruction order down, showing a branch and the instructions that follow it (including Inst 3 and Inst 4), each passing through IM, Reg, ALU, DM, Reg]

    Other Pipeline Structures Are Possible

    What about the (slow) multiply operation?
    - make the clock twice as slow, or
    - let it take two cycles (since it doesn't use the DM stage)

    [Figure: pipeline diagram with a MUL unit alongside the ALU]

    What if the data memory access is twice as slow as the instruction memory?
    - make the clock twice as slow, or
    - let the data memory access take two cycles (and keep the same clock rate)

    [Figure: IM, Reg, ALU, DM1, DM2, Reg]

    Other Sample Pipeline Alternatives

    [Figure: the ARM7 three-stage pipeline (IM, Reg, EX) compared with a deeper pipeline (IM1, IM2, Reg, SHFT, ALU, DM1, DM2, Reg). Stage activities labelled in the figure: PC update, BTB access, start IM access, IM access, decode, reg 1 access, reg access, shift/rotate, reg 2 access, ALU op, start DM access, exception, DM access, DM write, commit result.]

    Summary

    All modern day processors use pipelining

    Pipelining doesn't help latency of a single task, it helps throughput of the entire workload

    Potential speedup: a CPI of 1 and a fast CC

    Pipeline rate is limited by the slowest pipeline stage
    - Unbalanced pipe stages make for inefficiencies
    - The time to fill the pipeline and the time to drain it can impact speedup for deep pipelines and short code runs

    Must detect and resolve hazards
    - Stalling negatively affects CPI (makes CPI greater than the ideal of 1)
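The effect of stalls on CPI can be put into numbers with the performance equation CPU time = CPI * CC * IC. A minimal sketch, assuming every stall adds exactly one extra cycle and using hypothetical function names:

```python
def effective_cpi(instruction_count, stall_cycles, ideal_cpi=1.0):
    """Ideal CPI plus the average number of stall cycles per instruction."""
    return ideal_cpi + stall_cycles / instruction_count


def cpu_time_ps(instruction_count, cpi, cc_ps):
    """CPU time = IC * CPI * CC."""
    return instruction_count * cpi * cc_ps


cpi = effective_cpi(1000, 250)            # 250 stall cycles over 1000 instructions
time_ps = cpu_time_ps(1000, cpi, 200)     # 200 ps pipelined clock cycle
```

Any stall pushes the CPI above 1, so the same program takes proportionally longer at the same clock rate.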