Upload
ngohanh
View
219
Download
1
Embed Size (px)
Citation preview
1CSE 45432 SUNY New Paltz
The ProcessorDatapath and Control
Chapter Five
Shift�left 2
PC
Memory
�
MemData
Write�data
M�u�x
0
1
RegistersWrite�register
Write�data
Read�data 1
Read�data 2
Read�register 1
Read�register 2
M�u�x
0
1
M�u�x
0
1
4
Instruction�[15– 0]
Sign�extend
3216
Instruction�[25– 21]
Instruction�[20– 16]
Instruction�[15– 0]
Instruction�register
1 M�u�x
0
32
M�u�x
ALU�result
ALUZero
Memory�data�
register
Instruction�[15– 11]
�A
B
ALUOut
0
1
Address
2CSE 45432 SUNY New Paltz
• We're ready to look at an implementation of the MIPS• Simplified to contain only:
– memory-reference instructions: lw, sw– arithmetic-logical instructions: add, sub, and, or, slt– control flow instructions: beq, j
• Generic Implementation:
– use the program counter (PC) to supply instruction address– get the instruction from memory– read (write) from (to) registers– the op-code determines exactly what to do
The Processor: Datapath & Control
RegistersRegister #
Data
Register #
Data�memory
Address
Data
Register #
PC Instruction ALU
Instruction�memory
Address
3CSE 45432 SUNY New Paltz
• Abstract / Simplified View:
Two types of functional units:– elements that operate on data values (combinational): ALU– elements that contain state (sequential): registers and memory
More Implementation Details
RegistersRegister #
Data
Register #
Data�memory
Address
Data
Register #
PC Instruction ALU
Instruction�memory
Address
cycle timerising edge
falling edge
4CSE 45432 SUNY New Paltz
Clocking Methodology
• A typical methodology of edge triggered execution:– read contents of some state elements, – send values through some combinational logic– write results to one or more state elements
– All storage elements are clocked by the same clock edge– Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew– (CLK-to-Q + Shortest Delay Path - Clock Skew) > Hold Time
Clock cycle
State�element�
1Combinational logic
State�element�
2
Clk
Don’t Care
Setup Hold
.
.
.
.
.
.
.
.
.
.
.
.
Setup Hold
5CSE 45432 SUNY New Paltz
• Built using D flip-flops
Register File
M�u�x
Register 0Register 1
Register n – 1Register n
M�u�x
Read data 1
Read data 2
Read register�number 1
Read register�number 2
Read register�number 1 Read�
data 1
Read�data 2
Read register�number 2
Register fileWrite�register
Write�data Write
n-to-1�decoder
Register 0
Register 1
Register n – 1C
C
D
DRegister n
C
C
D
D
Register number
Write
Register data
01
n – 1n
Figures B.18 and B.20 from the textbook
6CSE 45432 SUNY New Paltz
Components of a CPU Implementation
• Include the functional units we need for each instruction
ALU Control:000 AND001 OR010 add110 subtract111 set-on-less-than
PC
Instructionmemory
Instructionaddress
Instruction
a. Instruction memory b. Program counter
Add Sum
c. Adder
ALU control
RegWrite
RegistersWriteregister
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Writedata
ALUresult
ALU
Data
Data
Registernumbers
a. Registers b. ALU
Zero5
5
5 3
16 32Sign�
extend
b. Sign-extension unit
MemRead
MemWrite
Data�memory
Write�data
Read�data
a. Data memory unit
Address
7CSE 45432 SUNY New Paltz
Datapath for Fetch and R-Instructions
• Portion of datapath for fetching instructions and updating PC
• R-format instructions: read two registers and write one register
PC
Instructionmemory
Readaddress
Instruction
4
Add
ALU control
RegWrite
RegistersWriteregister
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Writedata
ALUresult
ALUZero
3
Instruction
ALU Control:000 AND001 OR010 add110 subtract111 set-on-less-than
8CSE 45432 SUNY New Paltz
Datapath for lw and sw insturctions
• lw $t1, offset_value($t2) or sw $t1, offset_value ($t2)– Compute memory address– Sign extend 16 bit to 32 bit
1 6 3 2
R e g is te rs
W r i te
r e g is t e r
W r i te
d a ta
R e a d
d a ta 1
R e a d
d a ta 2
R e a d
r e g is t e r 1
R e a d
r e g is t e r 2
A L U o p e ra t io n3
R e g W r it e
M e m R e a d
M e m W r i t e
A L U
r e s u lt
Z e ro
A L U
D a ta
m e m o r y
A d d r e s s
W r i te
d a ta
R e a
d a ta
S ig n
e x te n d
Instruction
9CSE 45432 SUNY New Paltz
Building beq instruction
• beq $t1, $t2, offset– Compare registers, use ALU to affect Z flag– If not equal, next PC = PC +4– If equal, sign extend the offset and shift by two
Instruction
16 32
Add ALUresult
Registers
WriteregisterWritedata
Readdata 1
Readdata 2
Readregister 1Readregister 2
Shiftleft 2
ALU operation3
RegWrite
ZeroALU
Signextend
PC + 4
Branch control
Next instruction address
10CSE 45432 SUNY New Paltz
A Simple Implementation of a Datapath• Covers: lw, sw, beq, add, sub, and, or, set-on-less-than• jump instruction: later• Use multiplexors to stitch them together
Instruction
16 32
Registers
Writeregister
Writedata
Readdata 1
Readdata 2
Readregister 1
Readregister 2
Mux
ALU operation3
RegWrite
MemRead
MemWriteALUSrc
MemtoReg
ALUresult
ZeroALU
Datamemory
Address
Writedata
Readdata M
ux
Sign
extend
11CSE 45432 SUNY New Paltz
A Simple Implementation of a Datapath
PC
Instructionmemory
Readaddress
Instruction
16 32
Add ALUresult
Mux
Registers
WriteregisterWritedata
Readdata1
Readdata2
Readregister1Readregister2
Shiftleft 2
4
Mux
ALUoperation3
RegWrite
MemRead
MemWrite
PCSrc
ALUSrcMemtoReg
ALUresult
ZeroALU
Datamemory
Address
Writedata
Readdata M
ux
Signextend
Add
12CSE 45432 SUNY New Paltz
Control
• Using the op-code from the instruction, the control issues signals to:
– Selecting the operations to perform (ALU, read/write, etc.)
– Controlling the flow of data (multiplexer inputs)
• ALU's operation based on instruction type and function code
• Example: What should the ALU do with the instruction
add $8, $17, $18000000 10001 10010 01000 00000 100000
op rs rt rd shamt funct
• or the instruction lw $1, 100($2)
35 2 1 100
op rs rt 16 bit offset
13CSE 45432 SUNY New Paltz
• Must describe hardware to compute 3-bit ALU control input– given instruction type
00 = lw, sw01 = beq, 11 = arithmetic
– function code for arithmetic
• Describe it using a truth table (can turn into gates):
• Muli-level decoding can reduce size of control unit and increase its speed.
ALUOpcomputed from instruction type
ALU Control
ALUOp Funct field OperationALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0
0 0 X X X X X X 010 add0 1 X X X X X X 110 sub1 1 (X) X X 0 0 0 0 010 add1 1 (X) X X 0 0 1 0 110 sub1 1 (X) X X 0 1 0 0 000 and1 1 (X) X X 0 1 0 1 001 or1 1 (X) X X 1 0 1 0 111 slt
14CSE 45432 SUNY New Paltz
ALU Control
ALUOp Funct field OperationALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0
0 0 X X X X X X 010 add0 1 X X X X X X 110 sub1 0 X X 0 0 0 0 010 add1 0 X X 0 0 1 0 110 sub1 0 X X 0 1 0 0 000 and1 0 X X 0 1 0 1 001 or1 0 X X 1 0 1 0 111 slt
Operation2
Operation1
Operation0
Operation
ALUOp1
F3
F2
F1
F0
F (5–0)
ALUOp0
ALUOp
ALU control blockFigures C.1 and C.3
15CSE 45432 SUNY New Paltz
• R-Type Instruction
op rs rt rd shamt funct
rd: destination
•Load and Store Instructions
op rs rt 16 bit offset
rt: destination
•Branch Instruction
op rs rt 16 bit offset
rt: destination
Three Instruction Classes
16CSE 45432 SUNY New Paltz
The Simple Control
P C
Instru ctionme mory
Readaddress
Ins truc tion[31– 0 ]
Instruction [2 0– 16]
Instruction [2 5– 21]
A dd
In struction [5– 0]
4
16 3 2Instruction [1 5– 0]
0
0Mux
0
1
Add ALUresu lt
Mux
0
1
Regis tersW ritereg ister
W rited a ta
Readda ta 1
Readda ta 2
R ea dreg ister 1
R ea dreg ister 2
Signextend
S hiftle ft 2
Mux
1
ALUresult
Zero
Da tam em o ry
W ritedata
Readdata
Mux
1
Instruction [1 5– 11]
A LUco ntrol
A LUAd dress
RegWrite
ALUSrc
ALUop
MemRead
MemtoReg
RegDst
MemWrite
PCSrc
17CSE 45432 SUNY New Paltz
Control
Instruction Opcode RegDst ALUSrcMemto-
RegReg
WriteMem Read
Mem Write Branch ALUOp1 ALUp0
R-format 00 0000 1 0 0 1 0 0 0 1 0lw 10 0011 0 1 1 1 1 0 0 0 0sw 10 1011 X 1 X 0 0 1 0 0 0beq 00 0100 X 0 X 0 0 0 1 0 1
PC
Instruction�memory
Read�address
Instruction�[31– 0]
Instruction [20– 16]
Instruction [25– 21]
Add
Instruction [5– 0]
MemtoRegALUOpMemWrite
RegWrite
MemReadBranchRegDst
ALUSrc
Instruction [31– 26]
4
16 32Instruction [15– 0]
0
0M�u�x
0
1
Control
Add ALU�result
M�u�x
0
1
RegistersWrite�register
Write�data
Read�data 1
Read�data 2
Read�register 1
Read�register 2
Sign�extend
Shift�left 2
M�u�x
1
ALU�result
Zero
Data�memory
Write�data
Read�data
M�u�x
1
Instruction [15– 11]
ALU�control
ALUAddress
18CSE 45432 SUNY New Paltz
Control
• Simple combinational logic (See Appendix C for Details)
Operation2
Operation1
Operation0
Operation
ALUOp1
F3
F2
F1
F0
F (5–0)
ALUOp0
ALUOp
ALUcontrol block
R-format Iw sw beq
Op0Op1Op2Op3Op4Op5
Inputs
Outputs
RegDst
ALUSrc
MemtoReg
RegWrite
MemRead
MemWrite
Branch
ALUOp1
ALUOpO
Figure C.3
Figure C.5
19CSE 45432 SUNY New Paltz
• All of the logic is combinational
• We wait for everything to settle down, and the right thing to be done
– ALU might not produce “right answer” right away
– we use write signals along with clock to determine when to write
• Cycle time determined by length of the longest path
Our Simple Control Structure
Clock
Com binational logicContent of PC Content
of PC
20CSE 45432 SUNY New Paltz
Single Cycle Implementation
• Calculate cycle time assuming negligible delays except:– memory (2ns), ALU and adders (2ns), register file access (1ns)
MemtoReg
MemRead
MemWriteALUSrc
RegDst
PC
Instructionmemory
Readaddress
Instruction[31–0]
Instruction [20–16]
Instruction [25–21]
Add
Instruction [5–0]
RegWrite
4
16 32Instruction [15–0]
0Registers
WriteregisterWritedata
Writedata
Readdata 1
Readdata 2
Readregister 1Readregister 2
Signextend
ALUresult
Zero
Datamemory
Address Readdata M
ux
1
0
Mux
1
Mux
1
0
Mux
1
Instruction [15–11]
Shiftleft 2
PCSrc
ALU
Add ALUresult
ALUOp
0
ALUcontrol
21CSE 45432 SUNY New Paltz
Single Cycle Implementation
• Calculate cycle time assuming negligible delays except:– memory (2ns), ALU and adders (2ns), register file access (1ns)
M e m to R e g
M e m R e a d
M e m W r i te
A L U O p
A L U S rc
R e g D s t
P C
I n s t ru c t io n �m e m o ry
R e a d �a d d re s s
In s tru c t io n �[3 1 – 0 ]
In s tr u c t io n [2 0 – 1 6 ]
In s tr u c t io n [2 5 – 2 1 ]
A d d
In s tru c t io n [5 – 0 ]
R e g W r i te
4
1 6 3 2In s tr u c t io n [1 5 – 0 ]
0R e g is t e rs
W r ite �re g is te rW r ite �d a ta
W r i te �d a ta
R e a d �d a ta 1
R e a d �d a ta 2
R e a d �re g is te r 1R e a d �re g is te r 2
S ig n �e x te n d
A L U �re s u l t
Z e r o
D a ta �m e m o ry
A d d re s s R e a d �d a ta M �
u �x
1
0
M �u �x
1
0
M �u �x
1
0
M �u �x
1
In s tr u c t io n [1 5 – 1 1 ]
A L U �c o n t ro l
S h i f t�le f t 2
P C S rc
A L U
A d d A L U �re s u lt
InstructionInstr.
MemoryRegister
Read ALU Op.Data
MemoryReg. Write Total
R-format 2 1 2 0 1 6 nslw 2 1 2 2 1 8 nssw 2 1 2 2 7 nsbeq 2 1 2 5 ns
22CSE 45432 SUNY New Paltz
Where we are headed
• Single Cycle Problems:– cycle time should accommodate the longest instruction. What if we had more
complicated instructions like floating point?– can use a unit only once during a cycle -- may need multiple copies of some
functional units (wasteful of area)• One Solution:
– use a “smaller” cycle time– have different instructions take different numbers of cycles– a “multi-cycle” datapath:
PC
Memory
Address
Instruction�or data
Data
Instruction�register
RegistersRegister #
Data
Register #
Register #
ALU
Memory�data �
register
A
B
ALUOut
23CSE 45432 SUNY New Paltz
• We will be reusing functional units– ALU used to compute address and the new PC value– same memory used for instruction and data
• Break up the instructions into steps, each step takes a cycle– balance the amount of work to be done– restrict each cycle to use only one major functional unit
• Our control signals will not be determined solely by instruction, but also by the current step– e.g., what should the ALU do for an “add” instruction?
• At the end of a cycle (step)– store values for use in later steps -- introduce additional “internal” registers
Multi-cycle Approach
24CSE 45432 SUNY New Paltz
• Added components– IR and MDR both needed during the same cycle– A and B registers to hold operand values– ALUOut register
Multi-cycle Approach
PC
Memory
Address
Instruction�or data
Data
Instruction�register
RegistersRegister #
Data
Register #
Register #
ALU
Memory�data �
register
A
B
ALUOut
25CSE 45432 SUNY New Paltz
Multicycle Approach
Shift�left 2
PC
Memory
�
MemData
Write�data
M�u�x
0
1
RegistersWrite�register
Write�data
Read�data 1
Read�data 2
Read�register 1
Read�register 2
M�u�x
0
1
M�u�x
0
1
4
Instruction�[15–0]
Sign�extend
3216
Instruction�[25–21]
Instruction�[20–16]
Instruction�[15–0]
Instruction�register
1 M�u�x
0
32
M�u�x
ALU�result
ALUZero
Memory�data�
register
Instruction�[15–11]
�A
B
ALUOut
0
1
Address
• Internal registers except IR are updated every clock cycle; no write control
•Need to add new MUX’s and expand existing MUX’s
26CSE 45432 SUNY New Paltz
Shiftleft2
PC
MemoryMemData
Writedata
Mux
0
1
RegistersWriteregister
Writedata
Readdata1
Readdata2
Readregister1
Readregister2
Mux
0
1
Mux
0
1
4
Instruction[15–0]
Signextend
3216
Instruction[25–21]
Instruction[20–16]
Instruction[15–0]
Instructionregister
1 Mux
0
32
Mux
ALUresult
ALUZero
Memorydata
register
Instruction[15–11]
A
B
ALUOut
0
1
Address
Shiftleft2
01
2
3226
ALUcontrol
ALUOp
ALUSrcB
ALUSrcA
PCSource
MemtoReg
REGWriteIRWriteMemWriteMemRead
IorD
PCWrite
REGDst
PCWriteCondZero (ALU)
27CSE 45432 SUNY New Paltz
1. Instruction Fetch
2. Instruction Decode and Register Fetch
3. Execution, Memory Address Computation, or Branch Completion
4. Memory Access or R-type instruction completion
5. Write-back step
INSTRUCTIONS TAKE FROM 3 INSTRUCTIONS TAKE FROM 3 -- 5 CYCLES!5 CYCLES!
Five Execution Steps
1
2
3
4
5
28CSE 45432 SUNY New Paltz
• Use PC to get instruction and put it in the Instruction Register.• Increment the PC by 4 and put the result back in the PC.
IR = Memory[PC];PC = PC + 4;
MemReadALUSrcA =0IorD = 0IRWriteALUSrcB = 01ALUOp = 00PCWritePCSource = 00
Step 1: Instruction Fetch
1
Shiftleft2
PC
MemoryMemData
Writedata
Mux
0
1RegistersWriteregister
Writedata
Readdata1Readdata2
Readregister1Readregister2
Mux
0
1
Mux
0
1
4Instruction[15–0]
Signextend3216
Instruction[25–21]Instruction[20–16]Instruction[15–0]
Instructionregister 1Mux
0
32
MuxALUresult
ALUZero
Memorydataregister
Instruction[15–11]
A
BALUOut
0
1Address
Shiftleft2012
3226
ALUcontrol
ALUOpALUSrcB
ALUSrcA
PCSource
MemtoReg
REGWriteIRWriteMemWriteMemRead
IorD
PCWrite
REGDst
PCWriteCondZero (ALU)
29CSE 45432 SUNY New Paltz
• Read registers rs and rt (in case we need them) and put them in A & B• Compute the branch address in case the instruction is a branch; put it in ALUOut
A = Reg[IR[25-21]];B = Reg[IR[20-16]];ALUOut = PC + (sign-extend(IR[15-0]) << 2);
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
• We are Looking at the instruction and determine what to do in the next cycle• We aren't setting any control lines based on the instruction type
Step 2: Instruction Decode and Register Fetch
Shiftleft2
PC
MemoryMemData
Writedata
Mux
0
1RegistersWriteregister
Writedata
Readdata1Readdata2
Readregister1Readregister2
Mux
0
1
Mux
0
1
4Instruction[15–0]
Signextend3216
Instruction[25–21]Instruction[20–16]Instruction[15–0]
Instructionregister 1Mux
0
32
MuxALUresult
ALUZero
Memorydataregister
Instruction[15–11]
A
BALUOut
0
1Address
Shiftleft2012
3226
ALUcontrol
ALUOpALUSrcB
ALUSrcAPCSource
MemtoReg
REGWriteIRWriteMemWriteMemRead
IorD
PCWrite
REGDst
PCWriteCondZero (ALU)
1
23
3 33
jlw or sw
beq
R-type
30CSE 45432 SUNY New Paltz
• ALU is performing one of three functions, based on instruction type• (ignore j instruction for now)
• Memory Reference: (lw or sw) R-type: Branch: beqALUOut = A + sign-extend(IR[15-0]); ALUOut = A op B; if (A==B) PC = ALUOut
(compute address) (execute) (complete branch)
ALUSrcA = 1 ALUSrcA = 1 ALUSrcA = 1ALUSrcB = 10 ALUSrcB = 00 ALUSrcB = 00ALUOp = 00 ALUOp = 10 ALUOp = 01
go to step 4 PCWriteCondPCSource = 01
lw swend of execution fetch next instruction
Step 3 (instruction dependent)
31CSE 45432 SUNY New Paltz
Shiftleft2
PC
MemoryMemData
Writedata
Mux
0
1
RegistersWriteregister
Writedata
Readdata1
Readdata2
Readregister1
Readregister2
Mux
0
1
Mux
0
1
4
Instruction[15–0]
Signextend
3216
Instruction[25–21]
Instruction[20–16]
Instruction[15–0]
Instructionregister
1 Mux
0
32
Mux
ALUresult
ALUZero
Memorydata
register
Instruction[15–11]
A
B
ALUOut
0
1
Address
Shiftleft2
01
2
3226
ALUcontrol
ALUOp
ALUSrcB
ALUSrcA
PCSource
MemtoReg
REGWriteIRWriteMemWriteMemRead
IorD
PCWrite
REGDst
PCWriteCondZero (ALU)
For lw and sw
For beq
32CSE 45432 SUNY New Paltz
Lw sw R-type
MDR = Memory[ALUOut]; Memory[ALUOut] = B; Reg[IR[15-11]] = ALUOut;
(read from memory) (write to memory) (store result)MemRead MemWrite RegDst = 1IorD = 1 IorD = 1 REGWrite
MemtoReg = 0
end of execution end of execution fetch next instruction fetch next instruction
Step 4 (R-type or memory-access)
33CSE 45432 SUNY New Paltz
Shiftleft2
PC
MemoryMemData
Writedata
Mux
0
1
RegistersWriteregister
Writedata
Readdata1
Readdata2
Readregister1
Readregister2
Mux
0
1
Mux
0
1
4
Instruction[15–0]
Signextend
3216
Instruction[25–21]
Instruction[20–16]
Instruction[15–0]
Instructionregister
1 Mux
0
32
Mux
ALUresult
ALUZero
Memorydata
register
Instruction[15–11]
A
B
ALUOut
0
1
Address
Shiftleft2
01
2
2826
ALUcontrol
ALUOp
ALUSrcB
ALUSrcA
PCSource
MemtoReg
REGWriteIRWriteMemWriteMemRead
IorD
PCWrite
REGDst
PCWriteCondZero (ALU)
lw sw
R-type
32
34CSE 45432 SUNY New Paltz
• Only for lw instructions to store data read from memory into a register
Reg[IR[20-16]]= MDR;
RegDest = 0RegWriteMemtoReg = 1
Step 5 - Write-back
Shiftleft2
PC
MemoryMemData
Writedata
Mux
0
1RegistersWriteregister
Writedata
Readdata1Readdata2
Readregister1Readregister2
Mux
0
1
Mux
0
1
4Instruction[15–0]
Signextend3216
Instruction[25–21]Instruction[20–16]Instruction[15–0]
Instructionregister 1Mux
0
32
MuxALUresult
ALUZero
Memorydataregister
Instruction[15–11]
A
BALUOut
0
1Address
Shiftleft2012
3226
ALUcontrol
ALUOpALUSrcB
ALUSrcAPCSource
MemtoReg
REGWriteIRWriteMemWriteMemRead
IorD
PCWrite
REGDst
PCWriteCondZero (ALU)
35CSE 45432 SUNY New Paltz
Summary:
Step nameAction for R-type
instructionsAction for memory-reference
instructionsAction for branches
Action for jumps
Instruction fetch IR = Memory[PC]PC = PC + 4
Instruction A = Reg [IR[25-21]]decode/register fetch B = Reg [IR[20-16]]
ALUOut = PC + (sign-extend (IR[15-0]) << 2)Execution, address ALUOut = A op B ALUOut = A + sign-extend if (A ==B) then PC = PC [31-28] IIcomputation, branch/ (IR[15-0]) PC = ALUOut (IR[25-0]<<2)jump completionMemory access or R-type Reg [IR[15-11]] = Load: MDR = Memory[ALUOut]completion ALUOut or
Store: Memory [ALUOut] = B
Memory read completion Load: Reg[IR[20-16]] = MDR
• How many state bits will we need?
Graphical Specification of FSM
PCWritePCSource = 10
ALUSrcA = 1ALUSrcB = 00ALUOp = 01PCWriteCond
PCSource = 01
ALUSrcA =1ALUSrcB = 00ALUOp= 10
RegDst = 1RegWrite
MemtoReg = 0MemWriteIorD = 1
MemReadIorD = 1
ALUSrcA = 1ALUSrcB = 10ALUOp = 00
RegDst =0RegWrite
MemtoReg=1
ALUSrcA = 0ALUSrcB = 11ALUOp = 00
MemReadALUSrcA = 0
IorD = 0IRWrite
ALUSrcB = 01ALUOp = 00
PCWritePCSource = 00
Instruction fetchInstruction decode/
register fetch
Jumpcompletion
BranchcompletionExecution
Memory addresscomputation
Memoryaccess
Memoryaccess R-type completion
Write-back step
(Op = 'LW') or (Op = 'SW') (Op = R-type)
(Op = 'B
EQ')
(Op
='J
' )
(Op = 'SW')
(Op
='L
W' )
4
01
9862
753
Start
37CSE 45432 SUNY New Paltz
• How many cycles will it take to execute this code?
lw $t2, 0($t3)lw $t3, 4($t3)beq $t2, $t3, Label #assume notadd $t5, $t2, $t3sw $t5, 8($t3)
Label: ...
• What is going on during the 8th cycle of execution?• In what cycle does the actual addition of $t2 and $t3 takes place?
Simple Questions
38CSE 45432 SUNY New Paltz
• Implementation:
Finite State Machine for Control
PCWrite
PCWriteCondIorD
MemtoRegPCS ourceALUOpALUSrcBALUSrcARegWriteRegDs t
NS3NS2NS1NS0
Op5
Op4
Op3
Op2
Op1
Op0
S3 S2 S1 S0
S ta te re gis te r
IRWrite
MemReadMemWrite
Ins truction re gis teropcode field
Outputs
Control logic
Inputs
See appendix C for details
39CSE 45432 SUNY New Paltz
PLA ImplementationOp5
Op4
Op3
Op2
Op1
Op0
S3
S2
S1
S0
IorD
IRWrite
MemReadMemWrite
PCWritePCWriteCond
MemtoRegPCSource1
ALUOp1
ALUSrcB0ALUSrcARegWriteRegDstNS3NS2NS1NS0
ALUSrcB1ALUOp0
PCSource0
See appendix C for details
D Q
D Q
Op
code
40CSE 45432 SUNY New Paltz
Microprogramming
• Control is the hard part of processor design° Datapath is fairly regular and well-organized° Memory is highly regular° Control is irregular and global
Microprogramming:
-- A Particular Strategy for Implementing the Control Unit of aprocessor by "programming" at the level of register transferoperations
Microarchitecture:
-- Logical structure and functional capabilities of the hardware asseen by the microprogrammer
Historical Note:
IBM 360 Series first to distinguish between architecture & organizationSame instruction set across wide range of implementations, each withdifferent cost/performance
41CSE 45432 SUNY New Paltz
Macroinstruction VS Microinstruction
MainMemory
executionunit
controlmemory
CPU
ADDSUBAND
DATA
.
.
.
User program plus Data
this can change!
AND microsequence
e.g., FetchFetch Operand(s)CalculateSave Answer(s)
one of these ismapped into oneof these
42CSE 45432 SUNY New Paltz
Controller Design• The state diagrams that arise define the controller for an instruction set
processor are highly structured• Use this structure to construct a simple “microsequencer” • Control reduces to programming this very simple device
– microprogramming
sequencercontrol
datapath control
micro-PCsequencer
microinstruction
43CSE 45432 SUNY New Paltz
Microprogramming Implementation of the Control
P C WriteP C Write C on dIo rD
Me m to R e gP C S o urceALUOpALUS rcBALUS rcAR e g Write
Ad drC tl
O utpu ts
Microcod e m e m ory
IR W rite
Me m R e a dMe m Write
R e g Ds t
C o n trol u nit
Inp ut
Mic ro pro gra m c ou nte r
Ad dre s s s e le ct lo gic
Op[
5–0]
Add e r
1
Da ta pa th
Ins tru c tio n re g is te rop co de fie ld
BWrite
44CSE 45432 SUNY New Paltz
• microinstruction: low level control instruction which defines a set of datapathcontrol signal.
• A specification methodology– appropriate if hundreds of opcodes, modes, cycles, etc.– signals specified symbolically using microinstructions
Microprogramming
LabelALU
control SRC1 SRC2Register control Memory
PCWrite control Sequencing
Fetch Add PC 4 Read PC ALU SeqAdd PC Extshft Read Dispatch 1
Mem1 Add A Extend Dispatch 2LW2 Read ALU Seq
Write MDR FetchSW2 Write ALU FetchRformat1 Func code A B Seq
Write ALU FetchBEQ1 Subt A B ALUOut-cond FetchJUMP1 Jump address Fetch
Microinstruction formatField name Value Signals active Comment
Add ALUOp = 00 Cause the ALU to add.ALU control Subt ALUOp = 01 Cause the ALU to subtract; this implements the compare for
branches.Func code ALUOp = 10 Use the instruction's function code to determine ALU control.
SRC1 PC ALUSrcA = 0 Use the PC as the first ALU input.A ALUSrcA = 1 Register A is the first ALU input.B ALUSrcB = 00 Register B is the second ALU input.
SRC2 4 ALUSrcB = 01 Use 4 as the second ALU input.Extend ALUSrcB = 10 Use output of the sign extension unit as the second ALU input.Extshft ALUSrcB = 11 Use the output of the shift-by-two unit as the second ALU input.Read Read two registers using the rs and rt fields of the IR as the register
numbers and putting the data into registers A and B.Write ALU RegWrite, Write a register using the rd field of the IR as the register number and
Register RegDst = 1, the contents of the ALUOut as the data.control MemtoReg = 0
Write MDR RegWrite, Write a register using the rt field of the IR as the register number andRegDst = 0, the contents of the MDR as the data.MemtoReg = 1
Read PC MemRead, Read memory using the PC as address; write result into IR (and lorD = 0 the MDR).
Memory Read ALU MemRead, Read memory using the ALUOut as address; write result into MDR.lorD = 1
Write ALU MemWrite, Write memory using the ALUOut as address, contents of B as thelorD = 1 data.
ALU PCSource = 00 Write the output of the ALU into the PC.PCWrite
PC write control ALUOut-cond PCSource = 01, If the Zero output of the ALU is active, write the PC with the contentsPCWriteCond of the register ALUOut.
jump address PCSource = 10, Write the PC with the jump address from the instruction.PCWrite
Seq AddrCtl = 11 Choose the next microinstruction sequentially.Sequencing Fetch AddrCtl = 00 Go to the first microinstruction to begin a new instruction.
Dispatch 1 AddrCtl = 01 Dispatch using the ROM 1.Dispatch 2 AddrCtl = 10 Dispatch using the ROM 2.
46CSE 45432 SUNY New Paltz
Overview of Control• Control may be designed using one of several initial representations. The choice of
sequence control, and how logic is represented, can then be determined independently; the control can then be implemented with one of several methods using a structured logic technique.
Initial Representation Finite State Diagram Microprogram
Sequencing Control Explicit Next State Microprogram counterFunction + Dispatch ROMs
Logic Representation Logic Equations Truth Tables
Implementation PLA ROM Technique
“hardwired control” “microprogrammed control”
47CSE 45432 SUNY New Paltz
Designing a Microinstruction Set
1) Start with list of control signals
2) Group signals together that make sense (vs. random): called “fields”
3) Places fields in some logical order (e.g., ALU operation & ALU operands first and microinstruction sequencing last)
4) Create a symbolic legend for the microinstruction format, showing name of field values and how they set the control signals
– Use computers to design computers
5) To minimize the width, encode operations that will never be used at the same time
48CSE 45432 SUNY New Paltz
Legacy Software and Microprogramming
• IBM bet company on 360 Instruction Set Architecture (ISA): single instruction set for many classes of machines
– (8-bit to 64-bit)
• If microprogramming could easily do same instruction set on many different microarchitectures, then why couldn’t multiple microprograms do multiple instruction sets on the same microarchitecture?
• Coined term “emulation”: instruction set interpreter in microcode for non-native instruction set
• Very successful: in early years of IBM 360 it was hard to know whether old instruction set or new instruction set was more frequently used
49CSE 45432 SUNY New Paltz
Microprogramming Pros and Cons
• Ease of design• Flexibility
– Easy to adapt to changes in organization, timing, technology– Can make changes late in design cycle, or even in the field
• Can implement very powerful instruction sets (just more control memory)
• Generality– Can implement multiple instruction sets on same machine.– Can tailor instruction set to application.
• Compatibility– Many organizations, same instruction set
• Costly to implement• Slow
50CSE 45432 SUNY New Paltz
Multiple Cycle Datapath
IdealMemoryWrAdrDin
RAdr
32
32
32Dout
MemWr32
AL
U
3232
ALUOp
ALUControl
InstructionR
eg
32
IRWr
32
Reg File
Ra
Rw
busW
Rb5
5
32busA
32busB
RegWr
Rs
Rt
Mux
0
1
Rt
Rd
PCWr
ALUSelA
Mux 01
RegDst
Mux
0
1
32
PC
MemtoReg
Extend
ExtOp
Mux
0
132
0
1
23
4
16Imm 32
<< 2
ALUSelB
Mux
1
0
Target32
Zero
ZeroPCWrCond PCSrc BrWr
32
IorD
AL
U O
ut