7/28/2019 cs252-8
http://slidepdf.com/reader/full/cs252-8 1/7
Copyright M. Baltrush (CS252-8) 4
Data Hazard
[Pipeline timing diagram over clock cycles 1 to 9]
F1 D1 E1 W1
F2 D2 E2 W2
F3 D3 E3 W3
F4 D4 E4 W4
Execute requires more than one clock cycle
Stall (Bubble)
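As a rough illustration of the stall above, the following Python sketch schedules a four-stage pipeline in which one Execute step takes three cycles. The model is an assumption and is not taken from the slide: each stage holds one instruction at a time, and an instruction cannot advance until the stage ahead of it is free. The names `schedule` and `uniform` are hypothetical.

```python
STAGES = ["F", "D", "E", "W"]


def uniform(count):
    """Build `count` instructions that need one cycle in every stage."""
    return [{stage: 1 for stage in STAGES} for _ in range(count)]


def schedule(latencies):
    """Return, per instruction, the first clock cycle it spends in each stage.

    Assumed model: in-order pipeline, one instruction per stage, and an
    instruction waits in its current stage until the next stage is free.
    """
    starts = []
    for i, lat in enumerate(latencies):
        entry = {}
        for s, stage in enumerate(STAGES):
            if s == 0:
                ready = 1
            else:
                prev_stage = STAGES[s - 1]
                # Work in the previous stage must be finished first.
                ready = entry[prev_stage] + lat[prev_stage]
            if i > 0:
                older = starts[i - 1]
                if s + 1 < len(STAGES):
                    # The stage is free once the older instruction moved on.
                    free = older[STAGES[s + 1]]
                else:
                    # Last stage: free once the older instruction is done.
                    free = older[stage] + latencies[i - 1][stage]
                ready = max(ready, free)
            entry[stage] = ready
        starts.append(entry)
    return starts


program = uniform(4)
program[1]["E"] = 3  # assumption: the second instruction needs 3 Execute cycles
for number, row in enumerate(schedule(program), start=1):
    print(f"I{number}: {row}")
```

Under these assumptions the fourth instruction reaches Write in cycle 9 instead of cycle 7, so the long Execute step costs two bubble cycles.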
Instruction (Control) Hazard
[Pipeline timing diagram over clock cycles 1 to 9]
F1 D1 E1 W1
F2 D2 E2 W2
F3 D3 E3 W3
Stall (Bubble): Instruction is not in cache.
Decode idle: 3, 4, 5
Execute idle: 4, 5, 6
Write idle: 5, 6, 7
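The idle cycles listed above can be reproduced with a small Python sketch. It assumes the cache miss stretches the second fetch to four cycles (clock cycles 2 to 5), which is inferred from the idle lists rather than stated on the slide. The function names are hypothetical.

```python
def stage_cycles(fetch_latencies):
    """Return the cycle each instruction spends in F (start, end), D, E and W.

    Assumption: only Fetch can take more than one cycle; Decode, Execute
    and Write each take exactly one cycle and follow back to back.
    """
    rows = []
    fetch_start = 1
    for latency in fetch_latencies:
        fetch_end = fetch_start + latency - 1
        rows.append({
            "F": (fetch_start, fetch_end),
            "D": fetch_end + 1,
            "E": fetch_end + 2,
            "W": fetch_end + 3,
        })
        fetch_start = fetch_end + 1  # the next fetch starts when this one ends
    return rows


def idle_cycles(rows, stage):
    """List the cycles in which `stage` is empty between its first and last use."""
    busy = [row[stage] for row in rows]
    return [c for c in range(min(busy), max(busy) + 1) if c not in busy]


rows = stage_cycles([1, 4, 1])  # assumed: the second fetch misses in the cache
for stage in ("D", "E", "W"):
    print(stage, "idle:", idle_cycles(rows, stage))
```

With these inputs the sketch reports Decode idle in cycles 3 to 5, Execute idle in 4 to 6, and Write idle in 5 to 7.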
Data Hazard
• Data is not available when needed
• A ← 3 + A
• B ← 4 × A
• Sequentially: B = 32 (these results imply A = 5 initially)
• Without regard to data hazard: B = 20
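The two results in the A and B example on this slide can be checked with a short Python sketch. The starting value A = 5 is an assumption inferred from the numbers 32 and 20, and the function names are hypothetical.

```python
def run_sequentially(a):
    """Execute A <- 3 + A, then B <- 4 * A, one after the other."""
    a = 3 + a
    b = 4 * a
    return a, b


def run_ignoring_hazard(a):
    """Execute both operations as if the second did not wait for the first.

    The multiply reads the old value of A because the add has not yet
    written its result.
    """
    stale_a = a
    a = 3 + a
    b = 4 * stale_a
    return a, b


print(run_sequentially(5))     # (8, 32)
print(run_ignoring_hazard(5))  # (8, 20)
```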
Source is previous Destination
Mul R2, R3, R4
Add R5, R4, R6
[Pipeline timing diagram over clock cycles 1 to 9]
F1 D1 E1 W1
F2 D2 (Stall/Bubble) D2a E2 W2
F3 D3 E3 W3
Operand Forwarding
• Send data from source operation to destination
operation without passing through register file.
[Figure labels: Src1, Src2 (source operands); E: execute (ALU); Result; W: write (Register File); Forwarding Path]
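A minimal Python sketch of the idea, using the Mul and Add pair from the data hazard example. It assumes the last operand is the destination (R4 for Mul, R6 for Add) and that the two instructions run back to back with no stall. The register values and the function name are made up for illustration.

```python
def execute_pair(regs, forwarding):
    """Run Mul R2, R3, R4 followed immediately by Add R5, R4, R6.

    Assumption: no stall is inserted, so Add executes while Mul is still
    in its Write stage and R4 in the register file is not yet updated.
    """
    regs = dict(regs)

    # Cycle n: Mul is in Execute; the result sits at the ALU output.
    mul_result = regs["R2"] * regs["R3"]

    # Cycle n+1: Add is in Execute. With forwarding it takes the ALU
    # result directly; without it, it reads the stale register file value.
    r4_operand = mul_result if forwarding else regs["R4"]
    add_result = regs["R5"] + r4_operand
    regs["R4"] = mul_result  # Mul completes its Write stage

    # Cycle n+2: Add completes its Write stage.
    regs["R6"] = add_result
    return regs


start = {"R2": 2, "R3": 3, "R4": 100, "R5": 10, "R6": 0}
print(execute_pair(start, forwarding=True)["R6"])   # 16
print(execute_pair(start, forwarding=False)["R6"])  # 110
```

Without the forwarding path, real hardware would have to stall instead of computing the wrong value; the second call only shows what the stale read would produce.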
Software Solution
• Mul R2, R3, R4
• Add R5, R4, R6
Compiler at work:
• Mul R2, R3, R4
• NOP
• NOP
• Add R5, R4, R6
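The NOP insertion could be sketched in Python roughly as follows. It assumes the last operand of each instruction is its destination (as with R4 in the Mul and Add pair) and that a dependent instruction must trail its producer by two slots, matching the two NOPs on the slide. The helpers `parse` and `insert_nops` are hypothetical names.

```python
def parse(instruction):
    """Split 'Op A, B, C' into (op, sources, destination).

    Assumption: the last operand is the destination register.
    """
    op, rest = instruction.split(None, 1)
    operands = [part.strip() for part in rest.split(",")]
    return op, operands[:-1], operands[-1]


def insert_nops(program, gap=2):
    """Insert NOPs so no instruction reads a register written by one of
    the `gap` instruction slots immediately before it."""
    out = []
    for instruction in program:
        if instruction != "NOP":
            _, sources, _ = parse(instruction)
            # Scan the most recent slots, nearest first.
            recent = reversed(out[-gap:])
            for distance, earlier in enumerate(recent, start=1):
                if earlier == "NOP":
                    continue
                _, _, destination = parse(earlier)
                if destination in sources:
                    # Pad until the producer is more than `gap` slots back.
                    out.extend(["NOP"] * (gap - distance + 1))
                    break
        out.append(instruction)
    return out


for line in insert_nops(["Mul R2, R3, R4", "Add R5, R4, R6"]):
    print(line)
```

If an independent instruction already sits between the producer and the consumer, the sketch inserts only one NOP, since that instruction fills one of the two slots.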
Instruction Queue Pre-fetch
• Queue instructions for dispatch
• Fetch unit recognizes branch and obtains
instruction – branch folding
• Masks cache misses
Hardware Change
[Figure: Fetch → Instruction Queue → Dispatch/Decode → Execute → Write]
Conditional Branch
Delayed Branch
Original code
• loop shift_left R1
       decrement R2
       branch = 0
  next add R1, R3
Rearranged by compiler
• loop decrement R2
       branch = 0
       shift_left R1
  next add R1, R3
Assumes a pipelined architecture
Conditional Branch
Static Branch Prediction
• Speculative execution – hardware assumes
branch not taken/taken
• Conditional branches not random
• Compiler sets/resets branch prediction bit in
instruction
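To illustrate why a fixed prediction bit can work, here is a small Python sketch that scores a static prediction against a branch history. The history (a loop-closing branch taken nine times, then not taken once) is an invented example, and `static_accuracy` is a hypothetical name.

```python
def static_accuracy(predict_taken, outcomes):
    """Fraction of branches predicted correctly when the prediction is
    fixed in advance (for example by a compiler-set prediction bit)."""
    correct = sum(1 for taken in outcomes if taken == predict_taken)
    return correct / len(outcomes)


# Assumed history: a loop branch taken 9 times, then falling through once.
loop_branch = [True] * 9 + [False]

print(static_accuracy(True, loop_branch))   # 0.9
print(static_accuracy(False, loop_branch))  # 0.1
```

Because the outcomes are heavily skewed rather than random, choosing the right fixed bit for this branch is correct nine times out of ten.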
Conditional Branch
Dynamic Branch Prediction
[State diagram: four states SNT, LNT, LT, ST, with transitions labeled BT (branch taken) and BNT (branch not taken)]
ST: Strongly Likely Taken
LT: Likely Taken
LNT: Likely Not Taken
SNT: Strongly Not Taken
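A four-state predictor of this kind can be sketched in Python as a saturating counter. The exact transitions of the slide's diagram are not recoverable from the text, so the move-one-step-per-outcome rule below is an assumption and may differ from the original figure. The function names are hypothetical.

```python
STATES = ["SNT", "LNT", "LT", "ST"]


def next_state(state, taken):
    """Move one step toward ST on BT and one step toward SNT on BNT.

    Assumption: saturating-counter transitions, not necessarily those
    drawn on the slide.
    """
    index = STATES.index(state)
    index = min(index + 1, len(STATES) - 1) if taken else max(index - 1, 0)
    return STATES[index]


def predict_taken(state):
    """LT and ST predict taken; LNT and SNT predict not taken."""
    return state in ("LT", "ST")


def run(outcomes, state="LNT"):
    """Return (number of correct predictions, final state)."""
    correct = 0
    for taken in outcomes:
        if predict_taken(state) == taken:
            correct += 1
        state = next_state(state, taken)
    return correct, state


print(run([True] * 9 + [False]))  # (8, 'LT')
```

Starting from LNT, the sketch mispredicts the first taken branch and the final not-taken branch, and ends in LT, so a single loop exit does not flip the prediction for the next run of the loop.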
Instruction Set Influence: Addressing
• Load (X(R1)), R2
• Add #X, R1, R2
  Load (R2), R2
  Load (R2), R2
Both require 7 cycles to finish execution
Requires 3 memory accesses to obtain operand – stalls pipeline
Fewer instructions required
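The equivalence of the two code sequences can be checked with a small Python sketch. It assumes `Load (X(R1)), R2` means R2 ← [[X + [R1]]], that is, an indexed address followed by one level of indirection, and that the last operand is the destination. The memory contents, register values, and function names are invented for illustration.

```python
def complex_load(mem, regs, x):
    """Load (X(R1)), R2 under the assumed reading R2 <- [[X + [R1]]]."""
    address = x + regs["R1"]   # indexed address X + [R1]
    pointer = mem[address]     # first data access
    regs["R2"] = mem[pointer]  # second data access yields the operand
    return regs


def simple_sequence(mem, regs, x):
    """The three-instruction version using only simple addressing modes."""
    regs["R2"] = x + regs["R1"]    # Add  #X, R1, R2
    regs["R2"] = mem[regs["R2"]]   # Load (R2), R2
    regs["R2"] = mem[regs["R2"]]   # Load (R2), R2
    return regs


memory = {108: 200, 200: 42}  # made-up contents: [108] = 200, [200] = 42
a = complex_load(memory, {"R1": 8, "R2": 0}, 100)
b = simple_sequence(memory, {"R1": 8, "R2": 0}, 100)
print(a["R2"], b["R2"])  # 42 42
```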
Instruction Set Influence
Condition Codes
• Flexibility in reordering – as few instructions
as possible change condition codes
• Compiler knows which instructions can
change condition codes
Superscalar Operation
• Issue two instructions at once
– Requires multiple resources
– reorder buffer
– commitment unit
• Deadlocks