Upload
jonas-weaver
View
232
Download
0
Embed Size (px)
Citation preview
순천향대학교 정보기술공학부 이 상 정 1
5. The Processor: Datapath and Control
순천향대학교 정보기술공학부 이 상 정 2
Computer Architecture
5. The Processor: Datapath and Control
We're ready to look at an implementation of the MIPS
Simplified to contain only:• memory-reference instructions: lw, sw • arithmetic-logical instructions: add, sub, and, or, slt• control flow instructions: beq, j
Generic Implementation:• use the program counter (PC) to supply instruction address• get the instruction from memory• read registers• use the instruction to decide exactly what to do
All instructions use the ALU after reading the registersWhy? memory-reference? arithmetic? control
flow?
The Processor: Datapath & Control
순천향대학교 정보기술공학부 이 상 정 3
Computer Architecture
5. The Processor: Datapath and Control
Abstract / Simplified View:
Two types of functional units:• elements that operate on data values (combinational)• elements that contain state (sequential)
More Implementation Details
순천향대학교 정보기술공학부 이 상 정 4
Computer Architecture
5. The Processor: Datapath and Control
Unclocked vs. Clocked Clocks used in synchronous logic
• when should an element that contains state be updated?
State Elements
Clock period Rising edge
Falling edge
cycle time
순천향대학교 정보기술공학부 이 상 정 5
Computer Architecture
5. The Processor: Datapath and Control
The set-reset latch• output depends on present inputs and also on past inputs
An unclocked state element
R
S
Q
Q
순천향대학교 정보기술공학부 이 상 정 6
Computer Architecture
5. The Processor: Datapath and Control
Output is equal to the stored value inside the element
(don't need to ask for permission to look at the value)
Change of state (value) is based on the clock Latches: whenever the inputs change, and the clock
is asserted Flip-flop: state changes only on a clock edge
(edge-triggered methodology)"logically true", — could mean electrically low
A clocking methodology defines when signals can be read and written— wouldn't want to read a signal at the same time it was being written
Latches and Flip-flops
순천향대학교 정보기술공학부 이 상 정 7
Computer Architecture
5. The Processor: Datapath and Control
Two inputs:• the data value to be stored (D)• the clock signal (C) indicating when to read & store D
Two outputs:• the value of the internal state (Q) and it's complement
D-latch
Q
C
D
_Q
D
C
Q
순천향대학교 정보기술공학부 이 상 정 8
Computer Architecture
5. The Processor: Datapath and Control
D flip-flop
Output changes only on the clock edge
D
C
Q
D
C
Dlatch
D
C
QD
latch
D
C
Q Q
순천향대학교 정보기술공학부 이 상 정 9
Computer Architecture
5. The Processor: Datapath and Control
Our Implementation
An edge triggered methodology Typical execution:
• read contents of some state elements, • send values through some combinational logic• write results to one or more state elements
Stateelement
1
Stateelement
2Combinational logic
Clock cycle
순천향대학교 정보기술공학부 이 상 정 10
Computer Architecture
5. The Processor: Datapath and Control
Built using D flip-flops
Register File
Read registernumber 1 Read
data 1Read registernumber 2
Readdata 2
Writeregister
WriteWritedata
Register file
Read registernumber 1
Register 0
Register 1
. . .
Register n – 2
Register n – 1
M
u
x
Read registernumber 2
M
u
x
Read data 1
Read data 2
Do you understand? What is the “Mux” above?
순천향대학교 정보기술공학부 이 상 정 11
Computer Architecture
5. The Processor: Datapath and Control
Abstraction
Make sure you understand the abstractions! Sometimes it is easy to think you do, when you
don’t
Mux
C
Select
32
32
32
B
A
Mux
Select
B31
A31
C31
Mux
B30
A30
C30
Mux
B0
A0
C0
...
...
순천향대학교 정보기술공학부 이 상 정 12
Computer Architecture
5. The Processor: Datapath and Control
Register File
Note: we still use the real clock to determine when to write
Write
01
n-to-2n
decoder
n – 1
n
Register 0
C
D
Register 1
C
D
Register n – 2
C
D
Register n – 1
C
D
...
Register number...
Register data
순천향대학교 정보기술공학부 이 상 정 13
Computer Architecture
5. The Processor: Datapath and Control
Elements for Instruction Fetch
Basic Instruction Fetch Unit
순천향대학교 정보기술공학부 이 상 정 14
Computer Architecture
5. The Processor: Datapath and Control
Elements for R-format Instructions /Load/Store Instuctions
Two elements for R-format• R[rd] <- R[rs] op R[rt]
Additional elements for Load/Store• R[rt] <- Mem[R[rs] + SignExt[imm16]]
순천향대학교 정보기술공학부 이 상 정 15
Computer Architecture
5. The Processor: Datapath and Control
Datapath for a Branch
If (R[rs] == R[rt]) PC <- PC + 4 + ( SignExt(imm16) x 4 ) Else PC <- PC + 4
순천향대학교 정보기술공학부 이 상 정 16
Computer Architecture
5. The Processor: Datapath and Control
Datapath for Lw/Sw and R-type
순천향대학교 정보기술공학부 이 상 정 17
Computer Architecture
5. The Processor: Datapath and Control
Putting it Altogether
순천향대학교 정보기술공학부 이 상 정 18
Computer Architecture
5. The Processor: Datapath and Control
Control
Selecting the operations to perform (ALU, read/write, etc.)
Controlling the flow of data (multiplexor inputs)
Information comes from the 32 bits of the instruction
Example:
add $8, $17, $18
000000 10001 10010 01000 00000 100000
op rs rt rd shamt funct
ALU's operation based on instruction type and function code
순천향대학교 정보기술공학부 이 상 정 19
Computer Architecture
5. The Processor: Datapath and Control
Three Instruction Classes
op. 6 bits and funct. 6 bits can be used to generate control signals I-type an J-type instructions only have op. 6 bits R-type instructions have both op. 6 bits and funct. 6 bits
op(6) rs(5) rt(5) rd(5) shamt(5)func(6)
op(6) rs(5) rt(5) 16 bit address
op(6) 26 bit address
R
I
J
순천향대학교 정보기술공학부 이 상 정 20
Computer Architecture
5. The Processor: Datapath and Control
e.g., what should the ALU do with this instruction Example: lw $1, 100($2)
35 2 1 100
op rs rt 16 bit offset
ALU control input
0000 AND0001 OR0010 add0110 subtract0111 set-on-less-than1100 NOR
ALU Control
순천향대학교 정보기술공학부 이 상 정 21
Computer Architecture
5. The Processor: Datapath and Control
Control with Local Decoding
SmallControl 1
(fast)
op
6
SmallControl 2
(fast)
func
2
6ALUop
ALUctr
4
Big Single Control Logic(becomes slow)
op
6
func
6
ALUctr
MemRead
RegWrite
MemtoReg
ALUSrc
MemWrite
RegDst
Branch
4
MemRead
RegWriteMemtoReg
ALUSrcMemWrite
RegDstBranch
Main Control
ALU Control
순천향대학교 정보기술공학부 이 상 정 22
Computer Architecture
5. The Processor: Datapath and Control
ALU Control (ALUctr)
InstructionOpcode
ALUOp Instructionoperation
Functionfield
DesiredALU action
ALU control input
LW 00 Load word xxxxxx Add 0010
SW 00 Store word xxxxxx Add 0010
Branch equal
01 Branch equal xxxxxx Subtract 0110
R-type 10 Add 100000 Add 0010
R-type 10 Subtract 100010 Subtract 0110
R-type 10 AND 100100 And 0000
R-type 10 OR 100101 Or 0001
R-type 10 Set on less than
101010 Set on less than
0111
순천향대학교 정보기술공학부 이 상 정 23
Computer Architecture
5. The Processor: Datapath and Control
Can turn into gates:
Truth Table for ALU Control
순천향대학교 정보기술공학부 이 상 정 24
Computer Architecture
5. The Processor: Datapath and Control
Control for R-type Instruction
Data path and active control signals are highlighted
순천향대학교 정보기술공학부 이 상 정 25
Computer Architecture
5. The Processor: Datapath and Control
Control for “load” instruction
순천향대학교 정보기술공학부 이 상 정 26
Computer Architecture
5. The Processor: Datapath and Control
Control for “branch equal”
순천향대학교 정보기술공학부 이 상 정 27
Computer Architecture
5. The Processor: Datapath and Control
Instruction RegDst ALUSrcMemto-
RegReg
WriteMemRead
MemWrite Branch ALUOp1 ALUp0
R-format 1 0 0 1 0 0 0 1 0lw 0 1 1 1 1 0 0 0 0sw X 1 X 0 0 1 0 0 0beq X 0 X 0 0 0 1 0 1
Readregister 1
Readregister 2
Writeregister
Writedata
Writedata
Registers
ALU
Add
Zero
Readdata 1
Readdata 2
Signextend
16 32
Instruction[31–0] ALU
result
Add
ALUresult
Mux
Mux
Mux
Address
Datamemory
Readdata
Shiftleft 2
4
Readaddress
Instructionmemory
PC
1
0
0
1
0
1
Mux
0
1
ALUcontrol
Instruction [5–0]
Instruction [25–21]
Instruction [31–26]
Instruction [15–11]
Instruction [20–16]
Instruction [15–0]
RegDstBranchMemReadMemtoRegALUOpMemWriteALUSrcRegWrite
Control
순천향대학교 정보기술공학부 이 상 정 28
Computer Architecture
5. The Processor: Datapath and Control
Truth Table for Main Control
Input or output Signal name R-format lw sw beq
Inputs Op5 0 1 1 0
Op4 0 0 0 0
Op3 0 0 1 0
Op2 0 0 0 1
Op1 0 1 1 0
Op0 0 1 1 0
Outputs RegDst 1 0 X X
ALUSrc 0 1 1 0
MemtoReg 0 1 X X
RegWrite 1 1 0 0
MemRead 0 1 0 0
MemWrite 0 0 1 0
Branch 0 0 0 1
ALUOp1 1 0 0 0
ALUOp0 0 0 0 1
순천향대학교 정보기술공학부 이 상 정 29
Computer Architecture
5. The Processor: Datapath and Control
Control
Simple combinational logic (truth tables)
Operation2
Operation1
Operation0
Operation
ALUOp1
F3
F2
F1
F0
F (5– 0)
ALUOp0
ALUOp
ALU control block
R-format Iw sw beq
Op0
Op1
Op2
Op3
Op4
Op5
Inputs
Outputs
RegDst
ALUSrc
MemtoReg
RegWrite
MemRead
MemWrite
Branch
ALUOp1
ALUOpO
순천향대학교 정보기술공학부 이 상 정 30
Computer Architecture
5. The Processor: Datapath and Control
Implementing Jump
op Exit (26 bit address)
xxxx 00
j Exit
$pc
jump addr. Exit (26 bit address)
xxxx
순천향대학교 정보기술공학부 이 상 정 31
Computer Architecture
5. The Processor: Datapath and Control
All of the logic is combinational
We wait for everything to settle down, and the right thing to be done• ALU might not produce “right answer” right away
• we use write signals along with clock to determine when to write
Cycle time determined by length of the longest path
Our Simple Control Structure
We are ignoring some details like setup and hold times
Stateelement
1
Stateelement
2Combinational logic
Clock cycle
순천향대학교 정보기술공학부 이 상 정 32
Computer Architecture
5. The Processor: Datapath and Control
Performance of Single Cycle Datapath
Calculate cycle time assuming negligible delays except:• memory (200ps),
ALU and adders (100ps), register file access (50ps)
Minimum Cycle Time > 600ps!Instruction class Functional units used by the instruction class Required Time
R-type Instruction fetch
Register access
ALU Register access
400ps
lw Instruction fetch
Register access
ALU Memory access
Register access
600ps
sw Instruction fetch
Register access
ALU Memory access
550ps
beq Instruction fetch
Register access
ALU 350ps
순천향대학교 정보기술공학부 이 상 정 33
Computer Architecture
5. The Processor: Datapath and Control
Where we are headed
Single Cycle Problems:• what if we had a more complicated instruction like floating
point?• wasteful of area
One Solution:• use a “smaller” cycle time• have different instructions take different numbers of cycles• a “multicycle” datapath:
Data
Register #
Register #
Register #
PC Address
Instructionor dataMemory Registers ALU
Instructionregister
Memorydata
register
ALUOut
A
BData
순천향대학교 정보기술공학부 이 상 정 34
Computer Architecture
5. The Processor: Datapath and Control
What’s wrong with our CPI=1 processor?
Long Cycle Time All instructions take as much time as the slowest Real memory is not so nice as our idealized memory
• cannot always get the job done in one (short) cycle
PC Inst Memory mux ALU Data Mem mux
PC Reg FileInst Memory mux ALU mux
PC Inst Memory mux ALU Data Mem
PC Inst Memory cmp mux
Reg File
Reg File
Reg File
Arithmetic & Logical
Load
Store
Branch
Critical Path
setup
setup
순천향대학교 정보기술공학부 이 상 정 35
Computer Architecture
5. The Processor: Datapath and Control
Reducing Cycle Time
Cut combinational dependency graph and insert register / latch
Do same work in two fast cycles, rather than one slow one storage element
CombinationalLogic (A)
storage element
storage element
CombinationalLogic (B)
storage element
CombinationalLogic
storage element
=>
순천향대학교 정보기술공학부 이 상 정 36
Computer Architecture
5. The Processor: Datapath and Control
Partitioning the CPI=1 Datapath
Add registers between smallest stepsP
C
Nex
t P
C
Ope
rand
Fet
ch Exec Reg
. F
ile
Mem
Acc
ess
Inst
ruct
ion
Fet
ch
AL
Uct
r
Reg
Dst
AL
USr
c
Ext
Op
nPC
_sel
Reg
Wr
Mem
Wr
Mem
Rd
순천향대학교 정보기술공학부 이 상 정 37
Computer Architecture
5. The Processor: Datapath and Control
Example Multicycle Datapath
PC
Nex
t P
C
Ope
rand
Fet
ch
Ext
ALU Reg
. F
ile
Mem
Acc
es s
Inst
ruct
ion
Fet
ch
Res
ult
Sto
re
AL
Uct
r
Reg
Dst
AL
USr
c
Ext
Op
nPC
_sel
Reg
Wr
Mem
Wr
Mem
Rd
IRA
B
S
M
RegFile
Mem
ToR
eg
순천향대학교 정보기술공학부 이 상 정 38
Computer Architecture
5. The Processor: Datapath and Control
R-rtype (add, sub, . . .)
Logical Register Transfer Physical Register Transfers
inst Logical Register Transfers
ADD R[rd] <– R[rs] + R[rt]; PC <– PC + 4
inst Physical Register Transfers
IR <– MEM[pc]
ADD A<– R[rs]; B <– R[rt]
S <– A + B
R[rd] <– S; PC <– PC + 4
Exe
c
Reg
. F
ile
Mem
Acc
ess
A
B
S
M
Reg
File
PC
Nex
t P
C
IR
Inst
. M
em
순천향대학교 정보기술공학부 이 상 정 39
Computer Architecture
5. The Processor: Datapath and Control
Load
Logical Register Transfer
inst Logical Register Transfers
LW R[rt] <– MEM(R[rs] + sx(Im16);
PC <– PC + 4
inst Physical Register Transfers
IR <– MEM[pc]
LW A<– R[rs]; B <– R[rt]
S <– A + SignEx(Im16)
M <– MEM[S]
R[rt] <– M; PC <– PC + 4
Exe
c
Reg
. F
ile
Mem
Acc
es s
A
B
S
M
Reg
File
PC
Nex
t P
C
IR
Inst
. M
em
Physical Register Transfers
순천향대학교 정보기술공학부 이 상 정 40
Computer Architecture
5. The Processor: Datapath and Control
Store
Logical Register Transfer
inst Logical Register Transfers
SW MEM(R[rs] + sx(Im16) <– R[rt];
PC <– PC + 4
inst Physical Register Transfers
IR <– MEM[pc]
SW A<– R[rs]; B <– R[rt]
S <– A + SignEx(Im16);
MEM[S] <– B PC <– PC + 4
Exe
c
Reg
. F
ile
Mem
Acc
ess
A
B
S
M
Reg
File
PC
Nex
t P
C
IR
Inst
. M
em
Physical Register Transfers
순천향대학교 정보기술공학부 이 상 정 41
Computer Architecture
5. The Processor: Datapath and Control
Branch
Logical Register Transfer
inst Logical Register Transfers
BEQ if R[rs] == R[rt]
then PC <= PC + 4 + sx(Im16) || 00
else PC <= PC + 4
Exe
c
Reg
. F
ile
Mem
Acc
es s
A
B
S
M
Reg
File
Equ
al
PC
Nex
t P
C
IR
Inst
. M
eminst Physical Register Transfers
IR <– MEM[pc]
BEQ A<– R[rs]; B <– R[rt]
S <– A - B
Eq:PC<-PC+4+sx(Im16)||00, ~Eq:PC <– PC + 4
Physical Register Transfers
순천향대학교 정보기술공학부 이 상 정 42
Computer Architecture
5. The Processor: Datapath and Control
Summary:
순천향대학교 정보기술공학부 이 상 정 43
Computer Architecture
5. The Processor: Datapath and Control
Multiple Cycle Datapath
Miminizes Hardware: 1 memory, 1 adder
IdealMemoryWrAdrDin
RAdr
32
32
32Dout
MemWr
32
AL
U
3232
ALUOp
ALUControl
Instru
ction R
eg
32
IRWr
32
Reg File
Ra
Rw
busW
Rb5
5 busA
busB
RegWr
Rs
Rt
Mux
0
1Rd
PCWr
ALUSelA
Mux 01
RegDst
Mux
0
1
32
PC
MemtoReg
Extend
ExtOp
Mux
0
132
0
1
23
4
16Imm 32
<< 2
ALUSelB
Mux
1
0
32
Zero
ZeroPCWrCond PCSrc
32
IorD
S
A
B5
M
순천향대학교 정보기술공학부 이 상 정 44
Computer Architecture
5. The Processor: Datapath and Control
Multi-Cycle Datapath & Control Signals
Figure 5.28 in page 323
Shiftleft 2
PCMux
0
1
RegistersWriteregister
Writedata
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Instruction[15– 11]
Mux
0
1
Mux
0
1
4
Instruction[15– 0]
Signextend
3216
Instruction[25– 21]
Instruction[20– 16]
Instruction[15– 0]
Instructionregister
ALUcontrol
ALUresult
ALUZero
Memorydata
register
A
B
IorD
MemRead
MemWrite
MemtoReg
PCWriteCond
PCWrite
IRWrite
ALUOp
ALUSrcB
ALUSrcA
RegDst
PCSource
RegWrite
Control
Outputs
Op[5– 0]
Instruction[31-26]
Instruction [5– 0]
Mux
0
2
Jumpaddress [31-0]Instruction [25– 0] 26 28
Shiftleft 2
PC [31-28]
1
1 Mux
0
3
2
Mux
0
1ALUOut
Memory
MemData
Writedata
Address
순천향대학교 정보기술공학부 이 상 정 45
Computer Architecture
5. The Processor: Datapath and Control
High-Level Control Flow
Instruction fetch / decode and register fetchInstruction fetch / decode and register fetch
Memory accessMemory accessinstructionsinstructions
R-type instructionsR-type instructions Branch instructionsBranch instructions Jump instructionJump instruction
start
• Common 2-clock sequence to fetch/decode any instruction• Separate sequences of 1 to 3 clocks to execute specific types of instruction• Two techniques for control
• finite state machine• microprogramming
순천향대학교 정보기술공학부 이 상 정 46
Computer Architecture
5. The Processor: Datapath and Control
Review of Finite State Machine(FSM)
Finite state machine• a set of states and• next state function (determined by current state and input)• output function (determined by current state and input)
We will use a Moore machine • output based only on current state• cf. Mealy machine: output based on both current state and input
Current stateCurrent state Next-stateNext-statefunctionfunction
OutputOutputfunctionfunction
Inputs
Outputs
clock
순천향대학교 정보기술공학부 이 상 정 47
Computer Architecture
5. The Processor: Datapath and Control
Control Finite State Machine(FSM) Diagram
PCWritePCSource = 10
ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond
PCSource = 01
ALUSrcA =1ALUSrcB = 00ALUOp= 10
RegDst = 1RegWrite
MemtoReg = 0
MemWriteIorD = 1
MemReadIorD = 1
ALUSrcA = 1ALUSrcB = 10ALUOp = 00
RegDst = 0RegWrite
MemtoReg =1
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
MemReadALUSrcA = 0
IorD = 0IRWrite
ALUSrcB = 01ALUOp = 00
PCWritePCSource = 00
Instruction fetchInstruction decode/
register fetch
Jumpcompletion
BranchcompletionExecution
Memory addresscomputation
Memoryaccess
Memoryaccess R-type completion
Write-back step
(Op = 'LW') or (Op = 'SW') (Op = R-type)
(Op
= 'B
EQ')
(Op
= 'J
')
(Op = 'SW
')
(Op
= 'L
W')
4
01
9862
753
Start
순천향대학교 정보기술공학부 이 상 정 48
Computer Architecture
5. The Processor: Datapath and Control
Control Finite State Machine Structure
Outputs to datapath control determined by current state Next state determined by current state and input from instruction register
CombinationalCombinationalControl LogicControl Logic
(random logic or PLA)(random logic or PLA)
State registerState register
inputs
outputs
Next state
Datapath control outputs
inputs from instructionregister opcode field
순천향대학교 정보기술공학부 이 상 정 49
Computer Architecture
5. The Processor: Datapath and Control
PLA Implementation
Op5
Op4
Op3
Op2
Op1
Op0
S3
S2
S1
S0
IorD
IRWrite
MemReadMemWrite
PCWritePCWriteCond
MemtoRegPCSource1
ALUOp1
ALUSrcB0ALUSrcARegWriteRegDstNS3NS2NS1NS0
ALUSrcB1ALUOp0
PCSource0
OP code
current state
ANDplane
ORplane
순천향대학교 정보기술공학부 이 상 정 50
Computer Architecture
5. The Processor: Datapath and Control
Microprogramming
Microprogramming is an alternate structure for implementing the control finite state machine• represent outputs and next state selection as microinstructions in a memory (the c
ontrol store) addressed by the current state (often called the microprogram counter)
• the control store can be implemented in ROM or RAM (writable control store)• provides level of abstraction between software and datapath hardware: firmware
순천향대학교 정보기술공학부 이 상 정 51
Computer Architecture
5. The Processor: Datapath and Control
“Macroprogram” Interpretation
MainMemory
execution
unit
controlmemory
CPU
User program plus Data
AND microsequence
e.g., Fetch Calc Operand Addr Fetch Operand(s) Calculate Save Answer(s)
one of these ismapped into oneof these
ANDSUBADD…
DATA
…
순천향대학교 정보기술공학부 이 상 정 52
Computer Architecture
5. The Processor: Datapath and Control
Exceptions
Normal Control Flow• sequential, jumps, branches, calls, returns
Exception = unprogrammed control transfer• system takes action to handle the exception
• must record the address of the offending instruction• returns control to user• must save & restore user state
user programSystemExceptionHandlerException:
return fromexception
순천향대학교 정보기술공학부 이 상 정 53
Computer Architecture
5. The Processor: Datapath and Control
Exceptions vs. Interrupts
MIPS convention:• exception means any unexpected change in control flow, without
distinguishing internal or external
• use the term interrupt only when the event is externally caused.
Type of event From where? MIPS terminologyI/O device request External InterruptInvoke OS from user program Internal ExceptionArithmetic overflow Internal ExceptionUsing an undefined instruction Internal ExceptionHardware malfunctions Either Exception or Interrupt
순천향대학교 정보기술공학부 이 상 정 54
Computer Architecture
5. The Processor: Datapath and Control
Exception Handling in MIPS
Save PC in the dedicated register EPC• like $ra but can’t assume that $ra is unused
Records the reason for the exception in the Cause register• used by operating system to figure out what went wrong
Load special value into PC, the exception handler’s address• transfer execution to exception handler
The instruction RFE restores the PC from EPC• like jr $ra for subroutines but a special register just for exceptions
순천향대학교 정보기술공학부 이 상 정 55
Computer Architecture
5. The Processor: Datapath and Control
Updated Datapath to Implement Exceptions
Shiftleft 2
Memory
MemData
Writedata
Mux
0
1
Instruction[15– 11]
Mux
0
1
4
Instruction[15– 0]
Signextend
3216
Instruction[25– 21]
Instruction[20– 16]
Instruction[15– 0]
Instructionregister
ALUcontrol
ALUresult
ALUZero
Memorydata
register
A
B
IorD
MemRead
MemWrite
MemtoReg
PCWriteCond
PCWrite
IRWrite
Control
Outputs
Op[5– 0]
Instruction[31-26]
Instruction [5– 0]
Mux
0
2
Jumpaddress [31-0]Instruction [25– 0] 26 28
Shiftleft 2
PC [31-28]
1
Address
EPC
CO 00 00 00 3
Cause
ALUOp
ALUSrcB
ALUSrcA
RegDst
PCSource
RegWrite
EPCWriteIntCauseCauseWrite
1
0
1 Mux
0
3
2
Mux
0
1
Mux
0
1
PC
Mux
0
1
RegistersWriteregister
Writedata
Readdata 1
Readdata 2
Readregister 1
Readregister 2
ALUOut
순천향대학교 정보기술공학부 이 상 정 56
Computer Architecture
5. The Processor: Datapath and Control
Updated Finite State Machine
ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond
PCSource = 01
ALUSrcA = 1ALUSrcB = 00ALUOp = 10
RegDst = 1RegWrite
MemtoReg = 0
MemWriteIorD = 1
MemReadIorD = 1
ALUSrcA = 1ALUSrcB = 00ALUOp = 00
RegWriteMemtoReg = 1
RegDst = 0
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
MemReadALUSrcA = 0
IorD = 0IRWrite
ALUSrcB = 01ALUOp = 00
PCWritePCSource = 00
Instruction fetchInstruction decode/
Register fetch
Jumpcompletion
BranchcompletionExecution
Memory addresscomputation
Memoryaccess
Memoryaccess R-type completion
Write-back step
(Op = 'LW') or (Op = 'SW') (Op = R-type)
(Op
= 'B
EQ')
(Op
= 'J
')
(Op = 'SW
')
(Op
= 'L
W')
4
01
9862
7 11 1053
Start
(Op = other)
Overflow
Overflow
ALUSrcA = 0ALUSrcB = 01ALUOp = 01
EPCWritePCWrite
PCSource = 11
IntCause = 0CauseWrite
ALUSrcA = 0ALUSrcB = 01ALUOp = 01
EPCWritePCWrite
PCSource = 11
IntCause = 1CauseWrite
PCWritePCSource = 10
순천향대학교 정보기술공학부 이 상 정 57
Computer Architecture
5. The Processor: Datapath and Control
Pentium 4
Pipelining is important (last IA-32 without it was 80386 in 1985)
Pipelining is used for the simple instructions favored by compilers
“Simply put, a high performance implementation needs to ensure that the simple instructions execute quickly, and that the burden of the complexities of the instruction set penalize the complex, less frequently used, instructions”
Control
Control
Control
Enhancedfloating pointand multimedia
Control
I/Ointerface
Instruction cache
Integerdatapath
Datacache
Secondarycacheandmemoryinterface
Advanced pipelininghyperthreading support
Chapter 6
Chapter 7
순천향대학교 정보기술공학부 이 상 정 58
Computer Architecture
5. The Processor: Datapath and Control
Pentium 4
Somewhere in all that “control we must handle complex instructions
Processor executes simple microinstructions, 70 bits wide (hardwired) 120 control lines for integer datapath (400 for floating point) If an instruction requires more than 4 microinstructions to implement,
control from microcode ROM (8000 microinstructions) Its complicated!
Control
Control
Control
Enhancedfloating pointand multimedia
Control
I/Ointerface
Instruction cache
Integerdatapath
Datacache
Secondarycacheandmemoryinterface
Advanced pipelininghyperthreading support