64
Διασωλήνωση - Pipelining Pedro Trancoso Appendix B (και διαφάνειες από Prof. David Culler, Berkeley)

Δ ιασωλήνωση - Pipelining

  • Upload
    chaz

  • View
    47

  • Download
    7

Embed Size (px)

DESCRIPTION

Δ ιασωλήνωση - Pipelining. Pedro Trancoso Appendix B (και διαφάνειες από Prof. David Culler, Berkeley). Βιομηχανία Αυτοκινήτων. Βιομηχανία Αυτοκινήτων. Βιομηχανία Αυτοκινήτων. Βιομηχανία Αυτοκινήτων. Βιομηχανία Αυτοκινήτων. 1. Βιομηχανία Αυτοκινήτων. 2. 1. Βιομηχανία Αυτοκινήτων. 3. - PowerPoint PPT Presentation

Citation preview

Page 1: Δ ιασωλήνωση -  Pipelining

Διασωλήνωση - Pipelining

Pedro TrancosoAppendix B

(και διαφάνειες από Prof. David Culler, Berkeley)

Page 2: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

Page 3: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

Page 4: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

Page 5: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

Page 6: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

1

Page 7: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

2 1

Page 8: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

3 2 1

Page 9: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

4 3 2 1

Page 10: Δ ιασωλήνωση -  Pipelining

Βιομηχανία Αυτοκινήτων

5 4 3 2

Page 11: Δ ιασωλήνωση -  Pipelining

Ας πλύνουμε τα ρούχα! Σειριακή Μεθόδος

Sequential laundry takes 6 hours for 4 loads If they learned pipelining, how long would laundry take?

A

B

C

D

30 40 2030 40 2030 40 2030 40 20

6 PM 7 8 9 10 11 Midnight

Task

Order

Time

Page 12: Δ ιασωλήνωση -  Pipelining

Χρησιμοποιώντας Διασωλήνωση...

Pipelined laundry takes 3.5 hours for 4 loads

A

B

C

D

6 PM 7 8 9 10 11 Midnight

Task

Order

Time

30 40 40 40 40 20

Page 13: Δ ιασωλήνωση -  Pipelining

Μάθαμε ότι... Pipelining doesn’t help latency (χρόνος αναμονής) of single task, it helps throughput (ρυθμοαπόδοση) of entire workload

Pipeline rate limited by slowest pipeline stage

Multiple tasks operating simultaneously

Potential speedup = Number pipe stages

Unbalanced lengths of pipe stages reduces speedup

Time to “fill” pipeline and time to “drain” it reduces speedup

A

B

C

D

6 PM 7 8 9

Task

Order

Time

30 40 40 40 40 20

Page 14: Δ ιασωλήνωση -  Pipelining

Διασωλήνωση Εντολών Execute billions of instructions, so

throughput is what matters What is desirable in the ISA (ΑΣΕ) for

pipelining? Variable length instructions vs.

all instructions same length? Memory operands part of any operation vs.

memory operands only in loads or stores? Register operand (τελεστέος) many places in

instruction format vs. registers located in same place?

Page 15: Δ ιασωλήνωση -  Pipelining

Κύκλος Εκτέλεσης Instruction

Fetch

Instruction

Decode

Operand

Fetch

Execute

Result

Store

Next

Instruction

Obtain instruction from program storage

Determine required actions and instruction size

Locate and obtain operand data

Compute result value or status

Deposit results in storage for later use

Determine successor instruction

Page 16: Δ ιασωλήνωση -  Pipelining

Στάδια Διασωλήνωσης του DLX (1)

1. Instruction Fetch (IF)2. Instruction Decode / Register Fetch

(ID)3. Execution / Effective Address (EX)4. Memory Access / Branch Completion

(MEM)5. Write-back (WB)(go to 1!)

Page 17: Δ ιασωλήνωση -  Pipelining

Στάδια Διασωλήνωσης του DLX (2)

1. Instruction Fetch (IF)IR Mem[PC]NPC PC+4

2. Instruction Decode / Register Fetch (ID)A Regs[IR6..10]B Regs[IR11..15]Imm ((IR16)^16##IR16..31)

3. Execution / Effective Address (EX)Mem ref: ALU output A+ImmReg-reg (ALUop): ALUoutput A op BReg-imm (ALU op): ALU output A op ImmBranch: ALU output NPC+Imm

cond (A op 0)

Page 18: Δ ιασωλήνωση -  Pipelining

Στάδια Διασωλήνωσης του DLX (3)

4. Memory Access / Branch Completion (MEM)

Mem access: LMD Mem[ALU output] Mem[ALU output] B

Branch: if (cond) PC ALU output else PC NPC

5. Write-back (WB)Reg-reg ALU instr: Regs[IR16..20] ALU outputReg-imm ALU instr: Regs[IR11..15] ALU outputLoad instr: Regs[IR11..15] LMD

(go to 1!)

Page 19: Δ ιασωλήνωση -  Pipelining

Διασωλήνωση του DLX

Cycle number 1

Instruction I IF

Instruction I+1

Instruction I+2

Instruction I+3

Instruction I+4

Page 20: Δ ιασωλήνωση -  Pipelining

DLX Pipeline

Cycle number 1 2

Instruction I IF ID

Instruction I+1 IF

Instruction I+2

Instruction I+3

Instruction I+4

Page 21: Δ ιασωλήνωση -  Pipelining

DLX Pipeline

Cycle number 1 2 3

Instruction I IF ID EX

Instruction I+1 IF ID

Instruction I+2 IF

Instruction I+3

Instruction I+4

Page 22: Δ ιασωλήνωση -  Pipelining

DLX Pipeline

Cycle number 1 2 3 4

Instruction I IF ID EX MEM

Instruction I+1 IF ID EX

Instruction I+2 IF ID

Instruction I+3 IF

Instruction I+4

Page 23: Δ ιασωλήνωση -  Pipelining

DLX Pipeline

Cycle number 1 2 3 4 5

Instruction I IF ID EX MEM WB

Instruction I+1 IF ID EX MEM

Instruction I+2 IF ID EX

Instruction I+3 IF ID

Instruction I+4 IF

Page 24: Δ ιασωλήνωση -  Pipelining

DLX Pipeline

Cycle number 1 2 3 4 5 6

Instruction I IF ID EX MEM WB

Instruction I+1 IF ID EX MEM WB

Instruction I+2 IF ID EX MEM

Instruction I+3 IF ID EX

Instruction I+4 IF ID

Page 25: Δ ιασωλήνωση -  Pipelining

DLX Pipeline

Cycle number 1 2 3 4 5 6 7 8 9

Instruction I IF ID EX MEM WB

Instruction I+1 IF ID EX MEM WB

Instruction I+2 IF ID EX MEM WB

Instruction I+3 IF ID EX MEM WB

Instruction I+4 IF ID EX MEM WB

Page 26: Δ ιασωλήνωση -  Pipelining

Παράδειγμα: MIPS

Op

31 26 01516202125

Rs1 Rd immediate

Op

31 26 025

Op

31 26 01516202125

Rs1 Rs2

target

Rd Opx

Register-Register

561011

Register-Immediate

Op

31 26 01516202125

Rs1 Rs2/Opx immediate

Branch

Jump / Call

Page 27: Δ ιασωλήνωση -  Pipelining

5 Βήματα της Διόδου Δεδομένων (Datapath) του MIPS

MemoryAccess

Write

Back

InstructionFetch

Instr. DecodeReg. Fetch

ExecuteAddr. Calc

LMD

ALU

MU

X

Mem

ory

Reg File

MU

XM

UX

Data

Mem

ory

MU

X

SignExtend

4

Ad

der Zero?

Next SEQ PC

Addre

ss

Next PC

WB Data

Inst

RD

RS1

RS2

Imm

Page 28: Δ ιασωλήνωση -  Pipelining

MemoryAccess

Write

Back

InstructionFetch

Instr. DecodeReg. Fetch

ExecuteAddr. Calc

ALU

Mem

ory

Reg File

MU

XM

UX

Mem

ory

MU

X

SignExtend

Zero?

IF/ID

ID/E

X

MEM

/WB

EX

/MEM

4

Ad

der

Next SEQ PC Next SEQ PC

RD RD RD WB

Data

Next PC

Addre

ss

RS1

RS2

Imm

MU

X

Datapath

Control Path

5 Βήματα της Διόδου Δεδομένων (Datapath) του MIPS

Page 29: Δ ιασωλήνωση -  Pipelining

MemoryAccess

Write

Back

InstructionFetch

Instr. DecodeReg. Fetch

ExecuteAddr. Calc

ALU

Mem

ory

Reg File

MU

XM

UX

Mem

ory

MU

X

SignExtend

Zero?

IF/ID

ID/E

X

MEM

/WB

EX

/MEM

4

Ad

der

Next SEQ PC Next SEQ PC

RD RD RD WB

Data

Next PC

Addre

ss

RS1

RS2

Imm

MU

X

Datapath

Control Path

Inst

1

Inst

1

Inst

2

Inst

1

Inst

2

Inst

3

5 Βήματα της Διόδου Δεδομένων (Datapath) του MIPS

Page 30: Δ ιασωλήνωση -  Pipelining

Εκτέλεση με Διασωλήνωση

Instr.

Order

Time (clock cycles)

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Cycle 1Cycle 2 Cycle 3Cycle 4 Cycle 6Cycle 7Cycle 5

Page 31: Δ ιασωλήνωση -  Pipelining

Όρια της Διασωλήνωσης

Hazards (Κίνδυνοι): circumstances that would cause incorrect execution if next instruction were launched Structural hazards (κίνδυνος δομής):

Attempting to use the same hardware to do two different things at the same time

Data hazards (κίνδυνος δεδομένων): Instruction depends on result of prior instruction still in the pipeline

Control hazards (κίνδυνος ελέγχου): Caused by delay between the fetching of instructions and decisions about changes in control flow (branches and jumps).

Page 32: Δ ιασωλήνωση -  Pipelining

Παράδειγμα: μια πόρτα μνήμης

Instr.

Order

Time (clock cycles)

Load

Instr 1

Instr 2

Instr 3

Instr 4

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Cycle 1Cycle 2 Cycle 3Cycle 4 Cycle 6Cycle 7Cycle 5

DMem

Structural Hazard

Page 33: Δ ιασωλήνωση -  Pipelining

Λύσεις για κίνδυνους δομής Defn: attempt to use same

hardware for two different things at the same time

Solution 1: Wait must detect the hazard must have mechanism to stall

Solution 2: Throw more hardware at the problem

Page 34: Δ ιασωλήνωση -  Pipelining

Εύρεση και λύση ενός κινδύνου δομής

Instr.

Order

Time (clock cycles)

Load

Instr 1

Instr 2

Stall

Instr 3

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Cycle 1Cycle 2 Cycle 3Cycle 4 Cycle 6Cycle 7Cycle 5

Reg

ALU

DMemIfetch Reg

Bubble Bubble Bubble BubbleBubble

Page 35: Δ ιασωλήνωση -  Pipelining

Λύση των Κίνδυνων Δομών στο σχεδιασμό

ALU

Instr

Cach

e

Reg File

MU

XM

UX

Data

Cach

e

MU

X

SignExtend

Zero?

IF/ID

ID/E

X

MEM

/WB

EX

/MEM

4

Ad

der

Next SEQ PC Next SEQ PC

RD RD RD WB

Data

Next PC

Addre

ss

RS1

RS2

Imm

MU

X

Datapath

Control Path

Page 36: Δ ιασωλήνωση -  Pipelining

Σημασία του ΑΣΕ στην λύση των Κινδύνων Δομής Simple to determine the sequence

of resources used by an instruction opcode tells it all

Uniformity in the resource usage Compare MIPS to IA32? MIPS approach => all instructions

flow through same 5-stage pipeling

Page 37: Δ ιασωλήνωση -  Pipelining

Instr.

Order

add r1,r2,r3

sub r4,r1,r3

and r6,r1,r7

or r8,r1,r9

xor r10,r1,r11

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Κίνδυνοι ΔεδομένωνTime (clock cycles)

IF ID/RF EX MEM WB

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Page 38: Δ ιασωλήνωση -  Pipelining

Read After Write (RAW) (Διάβασμα μετά από γράψιμο) InstrJ tries to read operand before InstrI writes it

Caused by a “Data Dependence” (εξάρτηση δεδομένων). This hazard results from an actual need for communication.

Τρεις Είδους Κίνδυνοι Δεδομένων

I: add r1,r2,r3J: sub r4,r1,r3

Page 39: Δ ιασωλήνωση -  Pipelining

Write After Read (WAR) (Γράψιμο μετά από διάβασμα) InstrJ writes operand before InstrI reads it

Called an “anti-dependence” by compiler writers. This results from reuse of the name “r1”.

Can it happen in the MIPS 5 stage pipeline? Can’t happen in MIPS 5 stage pipeline because:

All instructions take 5 stages, and Reads are always in stage 2, and Writes are always in stage 5

I: sub r4,r1,r3 J: add r1,r2,r3K: mul r6,r1,r7

Τρεις Είδους Κίνδυνοι Δεδομένων

Page 40: Δ ιασωλήνωση -  Pipelining

Write After Write (WAW) (Γράψιμο μετά από γράψιμο) InstrJ writes operand before InstrI writes it.

Called an “output dependence” (εξάρτηση εξόδου) by compiler writers. This also results from the reuse of name “r1”.

Can it happen in the MIPS 5 stage pipeline? Can’t happen in MIPS 5 stage pipeline because:

All instructions take 5 stages, and Writes are always in stage 5

Will see WAR and WAW in later more complicated pipes

I: sub r1,r4,r3 J: add r1,r2,r3K: mul r6,r1,r7

Τρεις Είδους Κίνδυνοι Δεδομένων

Page 41: Δ ιασωλήνωση -  Pipelining

Time (clock cycles)

Μεταβίβαση (Forwarding) για αποφυγή Κινδύνου Δεδομένων

Inst

r.

Order

add r1,r2,r3

sub r4,r1,r3

and r6,r1,r7

or r8,r1,r9

xor r10,r1,r11

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Page 42: Δ ιασωλήνωση -  Pipelining

Αλλαγές στο υλικό για Forwarding

MEM

/WR

ID/E

X

EX

/MEM

DataMemory

ALU

mux

mux

Registe

rs

NextPC

Immediate

mux

Page 43: Δ ιασωλήνωση -  Pipelining

Time (clock cycles)

Instr.

Order

lw r1,0(r2)

sub r4,r1,r6

and r6,r1,r7

or r8,r1,r9

Κίνδυνοι Δεδομένων και με Forwarding

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Page 44: Δ ιασωλήνωση -  Pipelining

Λύσεις…

Adding hardware? ... not Detection? Compilation techniques?

What is the cost of load delays?

Page 45: Δ ιασωλήνωση -  Pipelining

Time (clock cycles)

or r8,r1,r9

Instr.

Order

lw r1, 0(r2)

sub r4,r1,r6

and r6,r1,r7

Reg

ALU

DMemIfetch Reg

RegIfetch

ALU

DMem RegBubble

Ifetch

ALU

DMem RegBubble Reg

Ifetch

ALU

DMemBubble Reg

How is this different from the instruction issue stall?

Λύσεις του κινδύνου της εντολής φόρτωσης

Page 46: Δ ιασωλήνωση -  Pipelining

Χρονοδρομολόγηση Λογισμικού για αποφυγή κίνδυνου δεδομένωνSoftware Scheduling to Avoid Load Hazards

Fast code:LW Rb,bLW Rc,cLW Re,e ADD Ra,Rb,RcLW Rf,fSW a,Ra SUB Rd,Re,RfSW d,Rd

Try producing fast code for

a = b + c;

d = e – f;

assuming a, b, c, d ,e, and f in memory. Slow code:

LW Rb,b

LW Rc,c

ADD Ra,Rb,Rc

SW a,Ra

LW Re,e

LW Rf,f

SUB Rd,Re,Rf

SW d,Rd

STALL

STALL

Page 47: Δ ιασωλήνωση -  Pipelining

Σχέση με τη ΕΣΑ What is exposed about this organizational

hazard in the instruction set? k cycle delay?

bad, CPI is not part of ISA k instruction slot delay

load should not be followed by use of the value in the next k instructions

Nothing, but code can reduce run-time delays MIPS did the transformation in the assembler

Page 48: Δ ιασωλήνωση -  Pipelining

Κίνδυνος Ελέγχου από εντολές διακλάδωσης (Branches) => Three Stage Stall

10: beq r1,r3,36

14: and r2,r3,r5

18: or r6,r1,r7

22: add r8,r1,r9

36: xor r10,r1,r11

Reg ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Reg

ALU

DMemIfetch Reg

Page 49: Δ ιασωλήνωση -  Pipelining

Παράδειγμα: Branch Stall Impact If 30% branch, Stall 3 cycles significant

(CPI=?) Two part solution:

Determine branch taken or not sooner, AND Compute taken branch address earlier

MIPS branch tests if register = 0 or 0 MIPS Solution:

Move Zero test to ID/RF stage Adder to calculate new PC in ID/RF stage 1 clock cycle penalty for branch versus 3

Page 50: Δ ιασωλήνωση -  Pipelining

Ad

der

IF/ID

Διασωλήνωση της διόδου δεδομένων του MIPS

MemoryAccess

Write

Back

InstructionFetch

Instr. DecodeReg. Fetch

ExecuteAddr. Calc

ALU

Mem

ory

Reg File

MU

X

Data

Mem

ory

MU

X

SignExtend

Zero?

MEM

/WB

EX

/MEM

4

Ad

der

Next SEQ PC

RD RD RD WB

Data

• Data stationary control– local decode for each instruction phase / pipeline stage

Next PC

Addre

ss

RS1

RS2

ImmM

UX

ID/E

X

Page 51: Δ ιασωλήνωση -  Pipelining

Τέσσερις Επιλογές για Κίνδυνους Ελέγχου

#1: Stall until branch direction is clear#2: Predict Branch Not Taken

Execute successor instructions in sequence “Squash” instructions in pipeline if branch actually taken Advantage of late pipeline state update 47% MIPS branches not taken on average (CPI=?) PC+4 already calculated, so use it to get next instruction

#3: Predict Branch Taken 53% MIPS branches taken on average But haven’t calculated branch target address in MIPS

MIPS still incurs 1 cycle branch penalty Other machines: branch target known before outcome

Page 52: Δ ιασωλήνωση -  Pipelining

#4: Delayed Branch Define branch to take place AFTER a

following instruction

branch instructionsequential successor1

sequential successor2........sequential successorn

........branch target if taken

1 slot delay allows proper decision and branch target address in 5 stage pipeline

MIPS uses this but…

Branch delay of length n

Τέσσερις Επιλογές για Κίνδυνους Ελέγχου

Page 53: Δ ιασωλήνωση -  Pipelining

Καθυστερημένη Εντολή Διακλάδωσης (Delayed Branch)

Where to get instructions to fill branch delay slot? Before branch instruction From the target address: only valuable when branch taken From fall through: only valuable when branch not taken Canceling branches allow more slots to be filled

Compiler effectiveness for single branch delay slot: Fills about 60% of branch delay slots About 80% of instructions executed in branch delay slots

useful in computation About 50% (60% x 80%) of slots usefully filled

Delayed Branch downside: 7-8 stage pipelines, multiple instructions issued per clock (superscalar)

Page 54: Δ ιασωλήνωση -  Pipelining

Καθυστερημένη Εντολή Διακλάδωσης (Delayed Branch)

Page 55: Δ ιασωλήνωση -  Pipelining

Recall: Speed Up συνάρτηση για διασωλήνωση

pipelined

dunpipeline

TimeCycle

TimeCycle

CPI stall Pipeline CPI Idealdepth Pipeline CPI Ideal

Speedup

pipelined

dunpipeline

TimeCycle

TimeCycle

CPI stall Pipeline 1depth Pipeline

Speedup

Instper cycles Stall Average CPI Ideal CPIpipelined

For simple RISC pipeline, Ideal CPI = 1:

Page 56: Δ ιασωλήνωση -  Pipelining

Παράδειγμα: Αξιολόγηση της Διαφορετικής Υλοποίησης της Εντολής Διακλάδωσης

Assume: Conditional & Unconditional = 14%, 65% change PC

Scheduling Branch CPI speedup v. scheme penalty stall

Stall pipeline 3 1.42 1.0Predict taken 1 1.14 1.26Predict not taken 1 1.09 1.29Delayed branch 0.5 1.07 1.31

Pipeline speedup = Pipeline depth1 +Branch frequencyBranch penalty

Page 57: Δ ιασωλήνωση -  Pipelining

Άλλες Δυσκολίες... Χρόνος αναμονής της κρυφής

μνήμης Διακοπές και Εξαιρέσεις

(Interrupts and Exceptions) ΑΣΕ (π.χ. ADD3 42(R1),56(R1)+,@(R1)) Λειτουργίες Πολύ-κύκλων

(Multicycle Operations)

Page 58: Δ ιασωλήνωση -  Pipelining

Σχέση μεταξύ Κρυφής Μνήμης και Διασωλήνωσης

WB

Data

Ad

der

IF/ID

ALU

Mem

ory

Reg File

MU

X

Data

Mem

ory

MU

X

SignExtend

Zero? MEM

/WB

EX

/MEM

4

Ad

der

Next SEQ PC

RD RD RD

Next PC

Addre

ss

RS1

RS2

Imm

MU

X

ID/E

X

I-$ D-$

Memory

Page 59: Δ ιασωλήνωση -  Pipelining

Λειτουργίες Πολύ-κύκλων

Page 60: Δ ιασωλήνωση -  Pipelining

Λειτουργίες Πολύ-κύκλων

Page 61: Δ ιασωλήνωση -  Pipelining

Παράδειγμα: MIPS R4000

Στάδια (8): IF – IS – RF – EX – DF – DS – TC - WB

Page 62: Δ ιασωλήνωση -  Pipelining

Άλλα Παραδείγματα PowerPC G4 (4):

PowerPC G4e (7):

Pentium 4 (20):

FETCH DECODE / DISPATCH

EXECUTE COMPLETE / WRITE-BACK

FETCH FETCH DECODE / DISP

ISSUE EXE COMPL WRITE - BACK

TCNIP

TCNIP

TCF

TCF

DRIVE

A&R

A&R

A&R

Q SCHE

SCHE

SCHE

DISP

DISP

REG

REG

EXE

FLAGS

BRAN

DRIVE

Page 63: Δ ιασωλήνωση -  Pipelining

Άρθρα ISCA 2002

The Optimum Pipeline Depth for a MicroprocessorA. Hartstein, T. R. Puzak

The Optimal Useful Logic Depth Per Pipeline Stage is 6-8 FO4

M.S. Hrishikesh , Norman P. Jouppi , Keith I. Farkas , Doug Burger , Stephen W. Keckler, Premkishore Shivakumar

Increasing Processor Performance by Implementing Deeper Pipelines

Eric Sprangle , Doug Carmean

Page 64: Δ ιασωλήνωση -  Pipelining

Ασκήσεις H&P:

(1) a, b, c (2) a, b, c