Slide #1 February 11, 1997
EECS 598 ---- Alpha Microprocessor
Jerry Huang
Alpha 21164 Microprocessor
The World’s Highest Performance Microprocessor
Zhihui Huang (Jerry)
University of Michigan
Slide #2 February 11, 1997
Historical Perspective
CISC and Digital VAX (~1980)
Serious exploration of RISC at Digital (1982)
Fragmented efforts on RISC (1983~1984)
– SAFE, HR-32, CASCADE projects
First draft of the PRISM architecture (1985)
Cancellation of PRISM (1988)
First RISC workstation based on MIPS R2000 (1989)
Rename PRISM to Alpha (1990)
First generation Alpha 21064 (1992)
Second generation Alpha 21164 (1994)
Slide #3 February 11, 1997
Alpha Microprocessor Roadmap
[Chart: Alpha parts plotted by year, 1992 to 1997; vertical axis ticks 5, 10, 15, 20; marker "Here we are"]
21064 - 150 MHz
21064 - 200 MHz
21064A - 275 MHz
21164 - 300 MHz
21164 - 333 MHz
21164 - 366 MHz
21164 - 400 MHz
21164 - 433 MHz
21164 - 500 MHz
Slide #4 February 11, 1997
The 21164 Architecture
64-bit load and store RISC architecture
Byte addressable
– 43-bit virtual address, 40-bit physical address
Integer types: Byte, Word, Longword, Quadword
Floating-point data types
– Longword integer format in floating-point unit
– Quadword integer format in floating-point unit
– IEEE and VAX floating-point formats
CALL_PAL instruction
Slide #5 February 11, 1997
21164 Characteristics
[feature size missing] µm CMOS technology
– 4 layers of metallization
– 9.66 million transistors
– 14.4 mm x 14.5 mm die size (209 mm²)
Package and Power
– 499-pin PGA, 291 signal pins
– 3.3 V external, 2.2 V internal
– 37 W at 433 MHz
Clock Frequency 300 MHz ~ 500 MHz
– SPECint95 11.3 ~ 15.4, respectively
– SPECfp95 14.5 ~ 21.1, respectively
Slide #6 February 11, 1997
On-chip Cache Organization
An on-chip, 8KB primary instruction cache
– direct mapped, 32-byte block (8 instructions)
– virtual, 7-bit ASN (MAX_ASN = 127), 1-bit PALcode
An on-chip, 8KB primary data cache
– dual-read-ported, single-write-ported
– virtually indexed, physically tagged
– write-through, read-allocate, direct mapped, 32-byte block
Large on-chip L2 cache
– 96 KB, 3-way set associative, physical
– write-back, write-allocate, byte-accessible
– 32-byte (256-bit) or 64-byte (512-bit) block
– mixed data and instruction cache
– pipelined (16 bytes per CPU cycle)
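The cache parameters above fix how an address splits into a block offset and a set index. The C sketch below works through that arithmetic. The struct and function names are hypothetical, and the simple "block number modulo number of sets" indexing is an assumption of the sketch, not a description of the 21164's actual index logic.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry of a set-associative cache; a direct-mapped cache has ways == 1. */
struct cache_geom {
    uint32_t size_bytes;
    uint32_t ways;
    uint32_t block_bytes;
};

/* log2 of a power of two; asserts that x really is a power of two. */
uint32_t log2_exact(uint32_t x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    uint32_t n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Number of sets = total size / (ways * block size). */
uint32_t cache_sets(struct cache_geom g)
{
    return g.size_bytes / (g.ways * g.block_bytes);
}

/* Address bits that select a byte within a block. */
uint32_t cache_offset_bits(struct cache_geom g)
{
    return log2_exact(g.block_bytes);
}

/* Address bits that select a set. */
uint32_t cache_index_bits(struct cache_geom g)
{
    return log2_exact(cache_sets(g));
}

/* Set index of an address (assumed indexing: block number modulo sets). */
uint32_t cache_set_index(struct cache_geom g, uint64_t addr)
{
    return (uint32_t)((addr / g.block_bytes) % cache_sets(g));
}
```

With the slide's numbers, each 8KB direct-mapped L1 cache has 256 blocks (5 offset bits, 8 index bits). The 96KB, 3-way L2 has 512 sets with 64-byte blocks, or 1024 sets with 32-byte blocks.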
Slide #7 February 11, 1997
TLB Organization
Instruction Translation Buffer
– 48-entry, fully associative
– not-last-used replacement algorithm
– 8KB to 4MB page
– 2 superpages only in privileged mode
Data Translation Buffer
– 64-entry, fully associative
– dual-read-ported
– not-last-used replacement algorithm
– superpage
Slide #8 February 11, 1997
External Interface
[Block diagram of a 21164-based system. Recoverable labels:]
– Alpha 21164 (L1 cache, L2 cache)
– Support: oscillator, serial ROM
– Bcache
– DECchip 21172 Core Logic Chipset: DECchip 21172-CIA, DECchip 21172-BA
– DRAM SIMM sockets (x8)
– Bus and signal labels: 256-bit, 128-bit data bus, 64-bit, 37-bit address, 19-bit index, 10-bit tag, address/control, control
– PCI bus, PCI slots, PCI-to-ISA bridge, ISA bus
– Peripherals: flash ROM, time-of-year clock, keyboard/mouse, 2 IDE devices, diskette, parallel port, 2 serial ports
Slide #9 February 11, 1997
Alpha 21164 Block Diagram
[Block diagram. Recoverable blocks:]
– System clock, clocks, SROM interface
– Instruction Fetch/Decode Unit (IBOX): prefetcher, instruction buffer, instruction slotter, instruction issue, branch prediction
– Integer Execution Unit (EBOX): multiplier, adder, shifter, logic box; adder, logic box, branch/jump
– Floating-Point Execution Unit (FBOX): adder, multiplier, divider
– Integer Register File (IRF), Floating-Point Register File (FRF)
– Memory Address Translation Unit (MBOX): Dcache access control, miss address file, write buffer, dual-read translation buffer
– Cache Control and Bus Interface Unit (CBOX): Scache access control, external Bcache control, bus interface unit
– Instruction Cache (ICACHE): branch history table, tag, data
– Data Cache (DCACHE): tag, data
– Scache (L2 cache): tag and data for Set 1, Set 2, Set 3
Slide #10 February 11, 1997
Microarchitecture Functional Units
Instruction fetch and decode unit (Ibox)
– Instruction prefetcher and instruction decoder
– Branch prediction
– Instruction translation buffer (ITB)
– Interrupt support
Integer execution unit (Ebox)
Floating-point execution unit (Fbox)
Memory address translation unit (Mbox)
– Data Translation Buffer (DTB)
– Miss Address File (MAF)
– Write Buffer
Cache control and bus interface unit (Cbox)
Slide #11 February 11, 1997
Instruction Issue Pipeline Organization
[Pipeline diagram, stages S0 to S3]
S0: Instruction Cache
S1: Instruction Buffer
S2: Instruction Slot
S3: Instruction Issue
Other diagram labels: Instruction Cache (8KB), Prefetch Buffer, Branch Predict, Next PC, Issue Conflict
Slide #12 February 11, 1997
Execution Pipeline Organization
[Pipeline diagram, stages S3 S4 S5 S6 S7 S8]
Integer Regs
– Integer pipeline 0: arith, logical, shift, load/store
– Integer pipeline 1: arith, logical, branch/jmp, load
– Int Mult
Floating-point Regs
– FP pipeline 0: add, subtract, compare, FP branch
– FP pipeline 1: multiply
– FP divide
Slide #13 February 11, 1997
Memory Access Pipeline
[Pipeline diagram. Stage rows shown: S4 to S12; S1 to S3; S0 to S4]
Stage labels: Dcache Access, Scache Tag Access, Scache Data Access, Dcache Fill, Write Regs
Register files shown: Int Regs, Int Regs, FP Regs
Slide #14 February 11, 1997
Instruction Latency
Instructions (latency in cycles)    21164   21064
Most Integer operations               1       2
CMOV rdes,rsrc,rtest                  2       3
Integer 32-bit Multiply               8      19
Integer 64-bit Multiply              16      23
Most FP operations                    4       6
FP single-precision divide           19      34
FP double-precision divide           31      63
Loads Hit in L1 cache                 2      ---
Loads Hit in L2 cache                 8      ---
Special Case: 0
Slide #15 February 11, 1997
Instruction Fetch/Issue Unit
Branch and Jump Prediction
– 2K-entry Branch History Table (BHT)
  • 2-bit saturating counter
  • built into Icache
  • not initialized on Icache fill
– Does not limit the number of branch predictions
– 12-entry subroutine return stack
  • stores Icache index
  • PALmode and user mode prediction
– Mispredict trap
  • 4 ~ 5 cycle penalty on branch mispredict
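A 2-bit saturating counter changes its prediction only after two consecutive mispredictions from a saturated state. The C sketch below models a 2K-entry table of such counters. The type and function names are hypothetical. Indexing by instruction slot is an assumption, chosen because an 8KB Icache holds 2048 four-byte instructions, which matches the 2K entries. The all-zero starting state is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BHT_ENTRIES 2048u /* 2K entries, per the slide */

/* Each counter holds 0..3; values 2 and 3 predict "taken". */
typedef struct {
    uint8_t counter[BHT_ENTRIES];
} bht_t;

/* Assumed index function: one counter per 4-byte instruction slot. */
unsigned bht_index(uint64_t pc)
{
    return (unsigned)((pc >> 2) & (BHT_ENTRIES - 1u));
}

/* Predict taken when the counter is in one of its two upper states. */
bool bht_predict(const bht_t *bht, uint64_t pc)
{
    return bht->counter[bht_index(pc)] >= 2;
}

/* Move the counter toward the actual outcome, saturating at 0 and 3. */
void bht_update(bht_t *bht, uint64_t pc, bool taken)
{
    uint8_t *c = &bht->counter[bht_index(pc)];
    assert(*c <= 3);
    if (taken) {
        if (*c < 3)
            (*c)++;
    } else {
        if (*c > 0)
            (*c)--;
    }
}
```

A loop branch that has been taken many times sits at counter value 3, so the single not-taken outcome at loop exit moves it to 2 and the next execution of the loop is still predicted taken.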
Slide #16 February 11, 1997
Instruction Prefetch
[Block diagram. Recoverable blocks:]
– Icache, Prefetch Buffer, 4-way Issue Unit
– Integer Pipeline 0, Integer Pipeline 1, FP adder, FP mult.
– MBOX, Dcache
– CBOX, L2 Cache (96 KB), L3 Cache
Slide #17 February 11, 1997
Instruction Decode/IssueInstruction Decode/Issue
Decode upto 4 instructions in parallel Check the structural hazard and data hazard Issue only the instructions without hazard Issue instructions IN ORDER Handle only NATURALLY ALLIGNED groups
of 4 instructions Does not advance until all 4 instructions are done No-op instruction is an important instruction
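Because issue works on naturally aligned groups of 4 instructions, the low address bits of a branch target decide how many slots of the target group are useful. The C sketch below computes the group base and slot for an instruction address. The function names are hypothetical, and it assumes 4-byte instructions and 16-byte groups.

```c
#include <assert.h>
#include <stdint.h>

#define INSTR_BYTES 4u
#define GROUP_INSTRS 4u
#define GROUP_BYTES (INSTR_BYTES * GROUP_INSTRS) /* 16 bytes */

/* Address of the naturally aligned 4-instruction group containing pc. */
uint64_t group_base(uint64_t pc)
{
    assert(pc % INSTR_BYTES == 0);
    return pc & ~(uint64_t)(GROUP_BYTES - 1u);
}

/* Position (0..3) of the instruction at pc within its group. */
unsigned group_slot(uint64_t pc)
{
    assert(pc % INSTR_BYTES == 0);
    return (unsigned)((pc / INSTR_BYTES) % GROUP_INSTRS);
}

/* Instructions from pc to the end of its group: the most that can issue
   together when execution enters the group at pc. */
unsigned useful_slots(uint64_t pc)
{
    return GROUP_INSTRS - group_slot(pc);
}
```

For example, a branch to address 0x4c lands in the last slot of the group at 0x40, so only one instruction of that group can issue. A branch to 0x40 could use all four slots.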
Slide #18 February 11, 1997
No-op Instructions
Integer no-op
– NOP (BIS R31,R31,R31)
Floating-point no-op
– FNOP (CPYS F31,F31,F31)
Universal no-op
– LDQ_U R31,...
Code example showing bad ordering
(a) LDL R2, 0(R1)
(b) ADDL R2,R3,R4
(c) ADDL R2,R5,R6
Code example showing good ordering
(a) LDL R2, 0(R1)
(b) NOP
(c) ADDL R2,R3,R4
(d) ADDL R2,R5,R6
Slide #19 February 11, 1997
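Code Analysis
Bubble Sort

#define N 10

int main(void)
{
    int i, j;
    float temp;   /* float, so the swap does not truncate the values */
    float a[N] = {1.0, 3.0, 5.0, 2.0, 9.0, 0.0, 4.0, 8.0, 7.0, 6.0};

    /* Exchange elements until a[] is in descending order. */
    for (i = 0; i < (N-1); i++)
        for (j = i+1; j <= (N-1); j++)
            if (a[i] < a[j]) {
                temp = a[i];
                a[i] = a[j];
                a[j] = temp;
            }
    return 0;
}

Compiler Option: cc -newc -O4 -c -o bubble.o bubble.c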
Slide #20 February 11, 1997
Assembly Code in Groups (1)
1st Group (0x0), cycles t1 t2 t3 t4 t5 t6 t7
– ldah gp, 1(t12)         | s3 s4 s5 s6
– lda gp, -32528(gp)      | s3 s4 s5 s6
– lda sp, -48(sp)         | s3 s4 s5 s6
– cpys $f31,$f31,$f31     | -- -- -- --
2nd Group (0x10), cycles t1 t2 t3 t4 t5 t6 t7
– ldq a2, -32752(gp)      | s3 s4 s5 s6
– bis zero, ra, t11       | s3 s4 s5 s6
– ldq t12, -32744(gp)     | s3 s4 s5 s6
– bis zero, sp, a0        | s3 s4 s5 s6
Slide #21 February 11, 1997
Assembly Code in Groups (2)
3rd Group (0x20), cycles t5 t6 t7 t8 t9
– stq zero, 0(sp)         | s3 s4 s5 s6
– bis zero, 0x28, a1      | s3 s4 s5 s6
– bis zero, zero, t0      | s3 s4 s5 s6
– jsr ra, (t12), _Ots     | s3 s4 s5 s6
4th Group (0x30), cycles t30 t31 t32 t33 t34 t35
– bis zero, 0x1, t1       | s3 s4 s5 s6
– subq t1, 0xa, t3        | s3 s4 s5 s6
– bge t3, 0x78            | s3 s4 s5 s6
– bis zero, t1, t2        | s3 s4 s5 s6
Slide #22 February 11, 1997
Assembly Code in Groups (3)
5th Group (0x40), cycles t33 t34 t35 t36 t37 t38
– s4addq t1, sp, t5       | s3 s4 s5 s6
– lda t6, 36(sp)          | s3 s4 s5 s6
– s4addq t0, sp, t4       | s3 s4 s5 s6
– lds $f0, 0(t4)          | s3 s4 s5 s6 CV
6th Group (0x50), cycles 35 36 37 38 39 40 41 42 43 44
– lds $f1, 0(t5)          | s3 s4 s5 s6
– cmptlt $f0, $f1, $f10   | s3 s4 s5 s6 s7 s8
– fbeq $f10, 0x64         | s3 s4
– sts $f1, 0(t4)          | s3 s4
Slide #23 February 11, 1997
Assembly Code in Groups (4)
7th Group (0x60), cycles 44 45 46 47 48
– sts $f0, 0(t5)          | s3 s4 s5 s6
– lda t5, 4(t5)           | s3 s4 s5 s6
– cmpule t5, t6, t9       | s3 s4 s5 s6
– cpys $f31,$f31,$f31     | -- -- -- --
8th Group (0x70), cycles 46 47 48 49 50 51 52
– addl t2, 0x1, t2        | s3 s4 s5 s6
– bne t9, 0x4c            | s3 s4 s5 s6
– addl t1, 0x1, t1        | s0 s1 s2 s3 s4 s5 s6
– subq t1, 0xa, t10       | s0 s1 s2 s3 s4 s5
Slide #24 February 11, 1997
Assembly Code in Groups (5)
9th Group (0x80), cycles 52 53 54 55 56 57 58
– addl t0, 0x1, t0        | s3 s4 s5 s6
– blt t10, 0x34           | s3 s4 s5 s6
– bis zero, t11, ra       | s0 s1 s2 s3 s4
– bis zero, zero, v0      | s0 s1 s2 s3 s4
– lda sp, 48(sp)          | s0 s1 s2 s3
– ret zero, (ra), 1       | s0 s1 s2 s3
Slide #25 February 11, 1997
I-box Good and Bad
Good
– instruction prefetch
– low latency and high clock rate
Bad
– high branch mispredict penalty
– in-order issue
– naturally aligned issue
– no stall after stage 4; a replay is needed every time a stall is required
Slide #26 February 11, 1997
E-box Good and Bad
Good
– low execution latency and high clock rate
– support for various floating-point formats
Bad
– LOAD/STORE multiplexed into integer unit
– one more stage for floating-point pipeline
What else?
Slide #27 February 11, 1997
Memory Unit Overview
Two-level Data Cache and a 64-entry DTB
Memory Unit (Mbox)
– Load instructions and Miss Address File (MAF)
  • LDB, LDW, LDL, LDQ, LDL_L, LDQ_L, LDS, LDT
– Store instructions and Write Buffer (WB)
  • STB, STW, STL, STQ, STL_C, STQ_C, STS, STT
– Memory Barrier (MB)
– Write Memory Barrier (WMB)
Data Hazard and Replay Traps
– Load After Store, Store After Store
– MAF full and WB full
Slide #28 February 11, 1997
Miss Address File
Holds Load Misses in 6 Entries
– physical address
– destination register
– instruction type
  • integer/floating-point
  • 4-byte/8-byte/IEEE S-type/VAX G-type, etc.
Holds Instruction Fetch Addresses in 4 Entries
– physical address
Slide #29 February 11, 1997
Miss Address File Details
One on One Mapping?
– LDL R2,0(R1) and LDL R3,0(R1)
Same Size?
– LDL R2,0(R1) and LDQ R3,8(R1)
Even with Even, Odd with Odd (LDL instruction only)?
– LDL R2,0(R1) and LDL R3,12(R1)
Integer with Integer, FP with FP?
– LDL R2,0(R1) and LDS FR2,8(R1)
[Diagram: six MAF load entries, 32 bytes per entry; byte offsets per register slot: 0,4 | 8,12 | 16,20 | 24,28]
Address1 Rn Rn Rn Rn Format
Address2 Rn Rn Rn Rn Format
Address3 Rn Rn Rn Rn Format
Address4 Rn Rn Rn Rn Format
Address5 Rn Rn Rn Rn Format
Address6 Rn Rn Rn Rn Format
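The entry layout in the diagram (one address, four register slots, one format field, covering a 32-byte block) can be modeled as a small data structure. The C sketch below is an illustration under stated assumptions: all names are hypothetical, the format encoding is invented, and the one-register-per-slot policy in `maf_record` is a simplification of this sketch, not a statement of the 21164's actual merge rules, which the slide poses as open questions.

```c
#include <assert.h>
#include <stdint.h>

#define MAF_LOAD_ENTRIES 6   /* load-miss entries, per the slide */
#define MAF_BLOCK_BYTES 32u  /* bytes covered by one entry */
#define MAF_SLOTS 4          /* register slots per entry */

enum { MAF_SLOT_FREE = -1 };

struct maf_entry {
    uint64_t block_addr;     /* physical address of the 32-byte block */
    int dest_reg[MAF_SLOTS]; /* destination register per slot, or MAF_SLOT_FREE */
    int format;              /* load format code (hypothetical encoding) */
};

/* Address of the 32-byte block containing paddr. */
uint64_t maf_block_addr(uint64_t paddr)
{
    return paddr & ~(uint64_t)(MAF_BLOCK_BYTES - 1u);
}

/* Slot for a byte offset: offsets 0,4 -> 0; 8,12 -> 1; 16,20 -> 2; 24,28 -> 3. */
unsigned maf_slot(uint64_t paddr)
{
    return (unsigned)((paddr & (MAF_BLOCK_BYTES - 1u)) >> 3);
}

/* Record a destination register for a load miss at paddr.
   Returns 1 on success, 0 if the slot is already taken
   (assumed policy: at most one register per slot). */
int maf_record(struct maf_entry *e, uint64_t paddr, int reg)
{
    assert(maf_block_addr(paddr) == e->block_addr);
    unsigned s = maf_slot(paddr);
    if (e->dest_reg[s] != MAF_SLOT_FREE)
        return 0;
    e->dest_reg[s] = reg;
    return 1;
}
```

Under this simplified policy, two loads to offsets 8 and 12 of the same block compete for the same slot, so the second one cannot be recorded in the same entry.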
Slide #30 February 11, 1997
Data Hazard
Load after Store
– (1 cycle later) Replay Trap (7-cycle penalty)
– (2 cycles later) Issue Stalled
– (Compiler scheduled 3 cycles later) OK
Store after Load
– Bits are set in each conflicting MAF entry to prevent its fill from being placed in the Dcache when it arrives, and to prevent subsequent loads from merging.
– Conflict bits are set with the store instruction in the write buffer to prevent the store instruction from being issued until all conflicting load instructions have been issued to the Cbox.
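The three load-after-store cases can be summarized as a function of how many cycles separate the load from the conflicting store. The C sketch below encodes only what the slide states. The names are hypothetical, and treating every distance of 3 cycles or more as safe is an assumption.

```c
#include <assert.h>

/* Outcome of a load that follows a conflicting store. */
enum las_outcome {
    LAS_REPLAY_TRAP, /* load 1 cycle after the store */
    LAS_ISSUE_STALL, /* load 2 cycles after the store */
    LAS_OK           /* load 3 or more cycles after the store */
};

#define LAS_REPLAY_PENALTY_CYCLES 7 /* replay trap penalty, per the slide */

/* Classify a load by its distance, in cycles, from the conflicting store. */
enum las_outcome load_after_store(int cycles_after_store)
{
    assert(cycles_after_store >= 1);
    if (cycles_after_store == 1)
        return LAS_REPLAY_TRAP;
    if (cycles_after_store == 2)
        return LAS_ISSUE_STALL;
    return LAS_OK;
}
```

A scheduler that can place two independent instructions between the store and the dependent load reaches the `LAS_OK` case and avoids both the stall and the 7-cycle replay.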
Slide #31 February 11, 1997
M-box Good and Bad
Good
– non-blocking
– 2-level cache and large cache
– merging for both loads and stores, which reduces traffic
– in-order issue to the C-box and out-of-order completion
Bad
– replay every time buffers are full, with a high penalty
What else?
Slide #32 February 11, 1997
Performance Characterization
[Bar chart: Percentage of time in PALcode. Horizontal axis: percent of all cycles, 0 to 3. Benchmarks: espresso, li, eqntott, compress, sc, gcc, spice, doduc, mdljdp2, wave5, tomcat, ora, alvinn, ear, mdljsp2, swm256, su2corr, hydro2d, nasa7, fppp]
Slide #33 February 11, 1997
Performance Characterization
[Stacked bar chart: Distribution of issue cycles for the Alpha 21164. Horizontal axis: 0 to 100. Legend: single, dual, triple, quad. Benchmarks labeled: espresso, eqntott, sc, spice, mdljdp2, tomcat, alvinn, mdljsp2, su2corr, nasa7]
Slide #34 February 11, 1997
Performance Characterization
[Bar chart: Branch mispredictions. Horizontal axis: % branches mispredicted, 0 to 14. Benchmarks: espresso, li, eqntott, compress, sc, gcc, spice, doduc, mdljdp2, wave5, tomcat, ora, alvinn, ear, mdljsp2, swm256, su2corr, hydro2d, nasa7, fppp]
Slide #35 February 11, 1997
Performance Characterization
[Bar chart: Cache misses per thousand instructions on the Alpha 21164. Horizontal axis: 0 to 160. Legend: Scache, Dcache, Icache. Benchmarks: espresso, li, eqntott, compress, sc, gcc, spice, doduc, mdljdp2, wave5, tomcat, ora, alvinn, ear, mdljsp2, swm256, su2corr, hydro2d, nasa7, fppp]
Slide #36 February 11, 1997
Reference
Hardware Reference Manual
– Digital Semiconductor 21164 Alpha Microprocessor (order number: EC-QP99A-TE)
Alpha AXP Architecture Handbook
– Digital Semiconductor (order number: EC-QD2KA-TE)
Alpha Implementations and Architecture
– D. Bhandarkar, Digital Press, QA 76.8 .A176 B471
Related materials for these slides
– http://umaxp1.physics.lsa.umich.edu/~zhihuang
Recommended