INSTRUCTION SET DESIGN
M. C. Felipe Santiago Espinosa
March 2018
Master's Program in Electronics
Computer Architecture
Unit 3
3.1. Classification of processor architectures
• Historically, processor architectures have been classified by how they handle operands:
  – Accumulator architecture: built around one special CPU register, the accumulator.
  – Stack architecture: operands are not stored inside the CPU but in a stack data structure located in memory.
  – Register-register architecture: the CPU has several internal registers for storing operands. Two variants: 2-operand and 3-operand instructions.
  – Register-memory architecture: similar to the previous one, but one operand can be accessed directly in memory.
Repertorio de Instrucciones 2
Accumulator architecture
• The accumulator is one operand of the ALU and the destination for its results.
[Figure: accumulator architecture datapath (blocks: PC, MAR, MDR, IR, Acc, ALU, F, control unit, memory and I/O interface)]
• Instruction style in an accumulator architecture:
    INSTRUCTION   OPERATION PERFORMED
    LOAD X        Acc ← M(X)           ; X is a variable in memory (M)
    LOAD (m)      Acc ← M(m)           ; m is an address in M
    LOAD n        Acc ← n              ; n is an integer
    STORE X       M(X) ← Acc           ; X is a variable in M
    STORE (m)     M(m) ← Acc           ; m is an address in M
    ADD X         Acc ← (Acc) + M(X)   ; X is a variable in M
    ADD (m)       Acc ← (Acc) + M(m)   ; m is an address in M
    ADD n         Acc ← (Acc) + n      ; n is an integer
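The accumulator instruction style can be tried out with a small C model. This is only a sketch: the opcode names, the `acc_instr` structure and the `acc_run` function are hypothetical, and addresses are simply indices into a C array.

```c
#include <stdint.h>

/* Hypothetical opcodes for a toy accumulator machine (illustrative names). */
enum acc_op { ACC_LOAD, ACC_LOAD_IMM, ACC_ADD, ACC_ADD_IMM, ACC_STORE };

struct acc_instr {
    enum acc_op op;
    int32_t arg;        /* memory index or integer constant, depending on op */
};

/* Runs n instructions against mem and returns the final accumulator value. */
int32_t acc_run(const struct acc_instr *prog, int n, int32_t *mem)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++) {
        switch (prog[i].op) {
        case ACC_LOAD:     acc = mem[prog[i].arg];  break;  /* Acc <- M(m)         */
        case ACC_LOAD_IMM: acc = prog[i].arg;       break;  /* Acc <- n            */
        case ACC_ADD:      acc += mem[prog[i].arg]; break;  /* Acc <- (Acc) + M(m) */
        case ACC_ADD_IMM:  acc += prog[i].arg;      break;  /* Acc <- (Acc) + n    */
        case ACC_STORE:    mem[prog[i].arg] = acc;  break;  /* M(m) <- Acc         */
        }
    }
    return acc;
}
```

For example, the sequence LOAD (0), ADD (1), STORE (2) leaves M(0) + M(1) both in the accumulator and in M(2). Every operation goes through the single accumulator.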
Stack architecture
• Operand memory is organized as a stack.
• A register is needed to manage the stack: SP (Stack Pointer).
• The instruction set must include the stack operations PUSH and POP.
• Operations use the two operands at the top of the stack (TOS: Top of the Stack, and NOS: Next on the Stack), and the result is stored on top of the stack.
• The SP is used to obtain the operands:
  – TOS: Top of the Stack.
  – NOS: Next on the Stack.
[Figure: stack architecture datapath (blocks: SP, PC, MAR, IR, temp, ALU, F, control unit; memory split into the stack area and the rest of memory)]
• Instruction style in a stack architecture:
    Instruction   Operation
    PUSH X        TOS ← M(X)
    PUSH (m)      TOS ← M(m)
    PUSH n        TOS ← n
    POP Z         M(Z) ← TOS
    POP (m)       M(m) ← TOS
    ADD           (TOS') = (TOS) + (NOS)
    SUB           (TOS') = (TOS) - (NOS)
    MUL           (TOS') = (TOS) * (NOS)
    DIV           (TOS') = (TOS) / (NOS)
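A minimal C sketch of the stack style follows. It assumes that an arithmetic instruction consumes both TOS and NOS and pushes the result, which the slides do not state explicitly. The names `stack_machine`, `sm_push`, `sm_pop`, `sm_add` and `sm_sub` are hypothetical, and there is no overflow or underflow checking.

```c
#include <stdint.h>

#define STACK_WORDS 16

struct stack_machine {
    int32_t stack[STACK_WORDS];
    int sp;                 /* number of occupied slots; TOS is stack[sp - 1] */
};

void sm_push(struct stack_machine *m, int32_t v) { m->stack[m->sp++] = v; }
int32_t sm_pop(struct stack_machine *m)          { return m->stack[--m->sp]; }

/* ADD: (TOS') = (TOS) + (NOS) */
void sm_add(struct stack_machine *m)
{
    int32_t tos = sm_pop(m);
    int32_t nos = sm_pop(m);
    sm_push(m, tos + nos);
}

/* SUB: (TOS') = (TOS) - (NOS); operand order matters here. */
void sm_sub(struct stack_machine *m)
{
    int32_t tos = sm_pop(m);
    int32_t nos = sm_pop(m);
    sm_push(m, tos - nos);
}
```

With this operand order, pushing 3 and then 10 followed by SUB leaves 7 on top of the stack, because the value pushed last is the TOS.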
Register-register architecture
• The registers provide storage for the operands. The ALU operates only on two registers, or on a register and a constant.
• Also known as load-store architectures.
• Two variants exist: 2-operand instructions and 3-operand instructions.
• With only 2 operands, one of them is both source and destination (destructive read).
• With 3 operands, the instruction names every element: two source operands and one destination operand.
• Instruction style in a register-register architecture with 2 operands:
    MOV Rd, Rf      ; Rd ← Rf
    LOAD Rd, n      ; Rd ← n         ; n is a number
    LOAD Rd, X      ; Rd ← M(X)      ; X is a variable in M
    LOAD Rd, (m)    ; Rd ← M(m)      ; m is an address in M
    STORE X, Rf     ; M(X) ← Rf      ; X is a variable in M
    STORE (m), Rf   ; M(m) ← Rf      ; m is an address in M
    ADD Rf, Rd      ; Rd ← Rf + Rd
    SUB Rf, Rd      ; Rd ← Rf - Rd
    MUL Rf, Rd      ; Rd ← Rf * Rd
    DIV Rf, Rd      ; Rd ← Rf / Rd
Register-memory architecture
• The difference from the previous architecture is that one of the operands can be obtained directly from memory.
[Figure: register-memory architecture datapath (blocks: PC, MAR, MDR, IR, registers R0, R1, R2, …, Rm-1, ALU, F, control unit, memory and I/O interface)]
• The instruction set is larger. It is a CISC-type architecture.
    MOV Rd, Rf      ; Rd ← Rf
    LOAD Rd, n      ; Rd ← n             ; n is a number
    LOAD Rd, X      ; Rd ← M(X)          ; X is a variable in M
    LOAD Rd, (m)    ; Rd ← M(m)          ; m is an address in M
    STORE X, Rf     ; M(X) ← Rf          ; X is a variable in M
    STORE (m), Rf   ; M(m) ← Rf          ; m is an address in M
    ADD Rd, Rf      ; Rd ← Rd + Rf
    SUB Rd, Rf      ; Rd ← Rd - Rf
    MUL Rd, Rf      ; Rd ← Rd * Rf
    DIV Rd, Rf      ; Rd ← Rd / Rf
    ADD Rd, X       ; Rd ← Rd + M(X)     ; X is a variable in M
    SUB Rd, X       ; Rd ← Rd - M(X)     ; X is a variable in M
    ADD X, Rf       ; Rf ← M(X) + Rf     ; X is a variable in M
    SUB X, Rf       ; Rf ← M(X) - Rf     ; X is a variable in M
The MIPS Instruction Set
• Used as the example throughout the course
• Stanford University designed the MIPS processor; it was later commercialized by MIPS Technologies (www.mips.com)
• MIPS: Microprocessor without Interlocked Pipelined Stages
• Large share of embedded core market
  – Applications in consumer electronics, network/storage equipment, cameras, printers, …
• Typical of many modern ISAs
• Three-operand register-register architecture
Arithmetic Operations
• Add and subtract, three operands
  – Two sources and one destination
    add a, b, c    # a gets b + c
• All arithmetic operations have this form
• Design Principle 1: Simplicity favors regularity
  – Regularity makes implementation simpler
  – Simplicity enables higher performance at lower cost
Arithmetic Example
• C code:
    f = (g + h) - (i + j);
• MIPS code?
Register Operands
• Arithmetic instructions use register operands
• MIPS has a 32 × 32-bit register file
  – Use for frequently accessed data
  – Numbered 0 to 31
  – 32-bit data called a “word”
• Assembler names
  – $t0, $t1, …, $t9 for temporary values
  – $s0, $s1, …, $s7 for saved variables
• Design Principle 2: Smaller is faster
  – c.f. main memory: millions of locations
Register Operand Example
• C code:
    f = (g + h) - (i + j);
  – f, …, j in $s0, …, $s4
• MIPS code?
Memory Operands
• Main memory used for composite data
  – Arrays, structures, dynamic data
• To apply arithmetic operations
  – Load values from memory into registers
  – Store result from register to memory
• Memory is byte addressed
  – Each address identifies an 8-bit byte
• Words are aligned in memory
  – Address must be a multiple of 4
• MIPS is Big Endian
  – Most-significant byte at least address of a word
  – c.f. Little Endian: least-significant byte at least address
Memory Operand Example 1
• C code:
    g = h + A[8];
  – g in $s1, h in $s2, base address of A in $s3
• Compiled MIPS code:
  – Index 8 requires offset of 32 (4 bytes per word)
    lw  $t0, 32($s3)    # load word: 32 is the offset, $s3 is the base register
    add $s1, $s2, $t0
Memory Operand Example 2
• C code:
    A[12] = h + A[8];
  – h in $s2, base address of A in $s3
  – Index 8 requires offset of 32
  – The complementary instruction (store word):
    sw $t0, 4($t1)
• MIPS code?
Registers vs. Memory
• Registers are faster to access than memory
• Operating on memory data requires loads and stores
  – More instructions to be executed
• Compiler must use registers for variables as much as possible
  – Only spill to memory for less frequently used variables
  – Register optimization is important!
Immediate Operands
• Constant data specified in an instruction
    addi $s3, $s3, 4
• No subtract immediate instruction
  – Just use a negative constant
    addi $s2, $s1, -1
• Design Principle 3: Make the common case fast
  – Small constants are common
  – Immediate operand avoids a load instruction
The Constant Zero
• MIPS register 0 ($zero) is the constant 0
  – Cannot be overwritten
• Useful for common operations
  – E.g., move between registers
    add $t2, $s1, $zero
Unsigned Binary Integers
• Given an n-bit number
\[ x = x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0 \]
• Range: 0 to +2^n – 1
• Example
  – 0000 0000 0000 0000 0000 0000 0000 1011₂
    = 0 + … + 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0
    = 0 + … + 8 + 0 + 2 + 1 = 11₁₀
• Using 32 bits
  – 0 to +4,294,967,295
2s-Complement Signed Integers
• Given an n-bit number
\[ x = -x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2^1 + x_0 2^0 \]
• Range: –2^(n–1) to +2^(n–1) – 1
• Example
  – 1111 1111 1111 1111 1111 1111 1111 1100₂
    = –1×2^31 + 1×2^30 + … + 1×2^2 + 0×2^1 + 0×2^0
    = –2,147,483,648 + 2,147,483,644 = –4₁₀
• Using 32 bits
  – –2,147,483,648 to +2,147,483,647
2s-Complement Signed Integers
• Bit 31 is sign bit
  – 1 for negative numbers
  – 0 for non-negative numbers
• –(–2^(n–1)) can’t be represented
• Non-negative numbers have the same unsigned and 2s-complement representation
• Some specific numbers
  – 0: 0000 0000 … 0000
  – –1: 1111 1111 … 1111
  – Most-negative: 1000 0000 … 0000
  – Most-positive: 0111 1111 … 1111
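The weighted-sum definition of a 2s-complement value can be checked with a short C sketch. The function name `twos_complement_value` is hypothetical; it applies weight –2^31 to bit 31 and +2^i to every other bit i, using a 64-bit accumulator so the intermediate sums cannot overflow.

```c
#include <stdint.h>

/* Interprets a 32-bit pattern as a 2s-complement number, term by term. */
int64_t twos_complement_value(uint32_t bits)
{
    int64_t value = 0;
    for (int i = 0; i < 31; i++) {
        if (bits & (UINT32_C(1) << i))
            value += INT64_C(1) << i;      /* bit i contributes +2^i   */
    }
    if (bits & (UINT32_C(1) << 31))
        value -= INT64_C(1) << 31;         /* sign bit contributes -2^31 */
    return value;
}
```

Applied to the pattern 1111 … 1100 (0xFFFFFFFC) this yields –4, and the patterns 1000 … 0000 and 0111 … 1111 give the most-negative and most-positive values listed above.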
Signed Negation
• Complement and add 1
  – Complement means 1 → 0, 0 → 1
\[ x + \bar{x} = 1111\ldots111_2 = -1 \]
\[ \bar{x} + 1 = -x \]
• Example: negate +2
  – +2 = 0000 0000 … 0010₂
  – –2 = 1111 1111 … 1101₂ + 1 = 1111 1111 … 1110₂
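The complement-and-add-1 rule maps directly onto C's bitwise operators. The sketch below works on the raw 32-bit pattern with unsigned arithmetic, which wraps modulo 2^32; `negate_bits` is a hypothetical name.

```c
#include <stdint.h>

/* Negates a 32-bit 2s-complement pattern: invert every bit, then add 1. */
uint32_t negate_bits(uint32_t x)
{
    return (uint32_t)(~x + 1u);
}
```

Negating the pattern for +2 gives 0xFFFFFFFE, the pattern for –2. Negating the most-negative pattern 0x80000000 returns the same pattern, which is the case noted earlier where –(–2^(n–1)) cannot be represented.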
Sign Extension
• Representing a number using more bits
  – Preserve the numeric value
• In MIPS instruction set
  – addi: extend immediate value
  – lb, lh: extend loaded byte/halfword
  – beq, bne: extend the displacement
• Replicate the sign bit to the left
  – c.f. unsigned values: extend with 0s
• Examples: 8-bit to 16-bit
  – +2: 0000 0010 => 0000 0000 0000 0010
  – –2: 1111 1110 => 1111 1111 1111 1110
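The 8-bit to 16-bit examples can be reproduced with the following C sketch. The function names are hypothetical; the sign-extending version copies bit 7 into the upper byte, while the zero-extending version leaves the upper byte at 0.

```c
#include <stdint.h>

/* Sign extension: replicate bit 7 of the byte into bits 15..8. */
uint16_t sign_extend_8_to_16(uint8_t b)
{
    uint16_t wide = b;
    if (b & 0x80u)
        wide |= 0xFF00u;
    return wide;
}

/* Zero extension (unsigned values): the upper bits are filled with 0s. */
uint16_t zero_extend_8_to_16(uint8_t b)
{
    return b;
}
```

The byte 1111 1110 becomes 1111 1111 1111 1110 when sign-extended (still –2) but 0000 0000 1111 1110 when zero-extended (254), which is the difference between instructions such as lb and lbu.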
Representing Instructions
• Instructions are encoded in binary
  – Called machine code
• MIPS instructions
  – Encoded as 32-bit instruction words
  – Small number of formats encoding operation code (opcode), register numbers, …
  – Regularity!
• Register numbers
  – $t0 – $t7 are reg’s 8 – 15
  – $t8 – $t9 are reg’s 24 – 25
  – $s0 – $s7 are reg’s 16 – 23
MIPS R-format Instructions
    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
• Instruction fields
  – op: operation code (opcode)
  – rs: first source register number
  – rt: second source register number
  – rd: destination register number
  – shamt: shift amount (00000 for now)
  – funct: function code (extends opcode)
R-format Example
    op        rs       rt       rd       shamt    funct
    6 bits    5 bits   5 bits   5 bits   5 bits   6 bits

    add $t0, $s1, $s2

    special   $s1      $s2      $t0      0        add
    0         17       18       8        0        32
    000000    10001    10010    01000    00000    100000

    00000010001100100100000000100000₂ = 02324020₁₆
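The field packing in this example can be reproduced with shifts and masks. The sketch below assumes the R-format layout shown above (6, 5, 5, 5, 5 and 6 bits from left to right); the function name `encode_r` is hypothetical.

```c
#include <stdint.h>

/* Packs the six R-format fields into one 32-bit instruction word. */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op    & 0x3Fu) << 26) |   /* bits 31..26 */
           ((rs    & 0x1Fu) << 21) |   /* bits 25..21 */
           ((rt    & 0x1Fu) << 16) |   /* bits 20..16 */
           ((rd    & 0x1Fu) << 11) |   /* bits 15..11 */
           ((shamt & 0x1Fu) <<  6) |   /* bits 10..6  */
            (funct & 0x3Fu);           /* bits 5..0   */
}
```

Calling `encode_r(0, 17, 18, 8, 0, 32)` for add $t0, $s1, $s2 produces 0x02324020, matching the hexadecimal word in the example.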
Hexadecimal
• Base 16
  – Compact representation of bit strings
  – 4 bits per hex digit
    0  0000    4  0100    8  1000    c  1100
    1  0001    5  0101    9  1001    d  1101
    2  0010    6  0110    a  1010    e  1110
    3  0011    7  0111    b  1011    f  1111
• Example: eca8 6420
  – 1110 1100 1010 1000 0110 0100 0010 0000
MIPS I-format Instructions
    op       rs       rt       constant or address
    6 bits   5 bits   5 bits   16 bits
• Immediate arithmetic and load/store instructions
  – rt: destination or source register number
  – Constant: –2^15 to +2^15 – 1
  – Address: offset added to base address in rs
• Design Principle 4: Good design demands good compromises
  – Different formats complicate decoding, but allow 32-bit instructions uniformly
  – Keep formats as similar as possible
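An I-format word can be assembled the same way as an R-format word, with the 16-bit constant occupying the low half. This is a sketch under the field layout above; `encode_i` is a hypothetical name, and the opcodes used in the usage note (35 for lw, 8 for addi) are the decimal values from the course's instruction summary.

```c
#include <stdint.h>

/* Packs op, rs, rt and a signed 16-bit immediate into an I-format word. */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm)
{
    return ((op & 0x3Fu) << 26) |   /* bits 31..26 */
           ((rs & 0x1Fu) << 21) |   /* bits 25..21 */
           ((rt & 0x1Fu) << 16) |   /* bits 20..16 */
            (uint16_t)imm;          /* bits 15..0, stored in 2s complement */
}
```

For lw $t0, 32($s3) the call `encode_i(35, 19, 8, 32)` gives 0x8E680020, and for addi $s2, $s1, -1 the call `encode_i(8, 17, 18, -1)` gives 0x2232FFFF, where the low 16 bits hold –1.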
Instruction summary
The operation code (opcode) is 0 for arithmetic and logic instructions between registers; they are differentiated by the funct field. The op and funct values below are in decimal.
    Instruction       Format   op   rs    rt    rd     shamt   funct   address
    add               R        0    reg   reg   reg    0       32      n.a.
    sub (subtract)    R        0    reg   reg   reg    0       34      n.a.
    addi              I        8    reg   reg   n.a.   n.a.    n.a.    constant
    lw (load word)    I        35   reg   reg   n.a.   n.a.    n.a.    address
    sw (store word)   I        43   reg   reg   n.a.   n.a.    n.a.    address
Stored Program Computers (The BIG Picture)
• Instructions represented in binary, just like data
• Instructions and data stored in memory
• Programs can operate on programs
  – e.g., compilers, linkers, …
• Binary compatibility allows compiled programs to work on different computers
  – Standardized ISAs
Logical Operations
• Instructions for bitwise manipulation
    Operation      C     Java   MIPS
    Shift left     <<    <<     sll
    Shift right    >>    >>>    srl
    Bitwise AND    &     &      and, andi
    Bitwise OR     |     |      or, ori
    Bitwise NOT    ~     ~      nor
• Useful for extracting and inserting groups of bits in a word
Shift Operations
    op       rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
• shamt: how many positions to shift
• Shift left logical
  – Shift left and fill with 0 bits
  – sll by i bits multiplies by 2^i
• Shift right logical
  – Shift right and fill with 0 bits
  – srl by i bits divides by 2^i (unsigned only)
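The multiply and divide effect of logical shifts can be seen with C's shift operators on unsigned values. The helper names `shift_left_logical` and `shift_right_logical` are hypothetical; the shift amount is masked to 5 bits, like the shamt field.

```c
#include <stdint.h>

/* sll: shift left, filling with 0 bits; multiplies by 2^i (modulo 2^32). */
uint32_t shift_left_logical(uint32_t x, unsigned i)
{
    return x << (i & 31u);
}

/* srl: shift right, filling with 0 bits; divides an unsigned value by 2^i. */
uint32_t shift_right_logical(uint32_t x, unsigned i)
{
    return x >> (i & 31u);
}
```

Shifting 5 left by 2 gives 20, and shifting 20 right by 2 gives 5 again. Shifting the pattern for –4 (0xFFFFFFFC) right by 1 gives 0x7FFFFFFE rather than the pattern for –2, which is why srl divides correctly only for unsigned values.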
AND Operations
• Useful to mask bits in a word
  – Select some bits, clear others to 0
    and $t0, $t1, $t2
    $t2   0000 0000 0000 0000 0000 1101 1100 0000
    $t1   0000 0000 0000 0000 0011 1100 0000 0000
    $t0   0000 0000 0000 0000 0000 1100 0000 0000
OR Operations
• Useful to include bits in a word
  – Set some bits to 1, leave others unchanged
    or $t0, $t1, $t2
    $t2   0000 0000 0000 0000 0000 1101 1100 0000
    $t1   0000 0000 0000 0000 0011 1100 0000 0000
    $t0   0000 0000 0000 0000 0011 1101 1100 0000
NOT Operations
• Useful to invert bits in a word
  – Change 0 to 1, and 1 to 0
• MIPS has NOR 3-operand instruction
  – a NOR b == NOT ( a OR b )
    nor $t0, $t1, $zero    # register 0: always read as zero
    $t1   0000 0000 0000 0000 0011 1100 0000 0000
    $t0   1111 1111 1111 1111 1100 0011 1111 1111
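The bit patterns used in the AND, OR and NOT examples can be verified with C's bitwise operators. `nor32` is a hypothetical helper that mirrors the three-operand nor instruction; passing 0 as the second operand plays the role of $zero.

```c
#include <stdint.h>

/* a NOR b == NOT (a OR b) */
uint32_t nor32(uint32_t a, uint32_t b)
{
    return ~(a | b);
}

/* NOT built from NOR with a zero operand, as in nor $t0, $t1, $zero. */
uint32_t not32(uint32_t a)
{
    return nor32(a, 0u);
}
```

With $t1 = 0x00003C00 and $t2 = 0x00000DC0, the expression `t1 & t2` is 0x00000C00, `t1 | t2` is 0x00003DC0, and `not32(t1)` is 0xFFFFC3FF, matching the three slides.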
Conditional Operations
• Branch to a labeled instruction if a condition is true
  – Otherwise, continue sequentially
• beq rs, rt, L1
  – if (rs == rt) branch to instruction labeled L1;
• bne rs, rt, L1
  – if (rs != rt) branch to instruction labeled L1;
• j L1
  – unconditional jump to instruction labeled L1
Compiling If Statements
• C code:
    if (i == j)
        f = g + h;
    else
        f = g - h;
  – f, g, … in $s0, $s1, …
• MIPS code?
Compiling Loop Statements
• C code:
    while (save[i] == k)
        i += 1;
  – i in $s3, k in $s5, address of save in $s6
• MIPS code?
Basic Blocks
• A basic block is a sequence of instructions with
  – No embedded branches (except at end)
  – No branch targets (except at beginning)
• A compiler identifies basic blocks for optimization
• An advanced processor can accelerate execution of basic blocks
More Conditional Operations
• Set result to 1 if a condition is true
  – Otherwise, set to 0
• slt rd, rs, rt
  – if (rs < rt) rd = 1; else rd = 0;
• slti rt, rs, constant
  – if (rs < constant) rt = 1; else rt = 0;
• Use in combination with beq, bne
    slt $t0, $s1, $s2      # if ($s1 < $s2)
    bne $t0, $zero, L      #   branch to L
Branch Instruction Design
• Why not blt, bge, etc?
• Hardware for <, ≥, … slower than =, ≠
  – Combining with branch involves more work per instruction, requiring a slower clock
  – All instructions penalized!
• beq and bne are the common case
• This is a good design compromise
Signed vs. Unsigned
• Signed comparison: slt, slti
• Unsigned comparison: sltu, sltui
• Example
  – $s0 = 1111 1111 1111 1111 1111 1111 1111 1111
  – $s1 = 0000 0000 0000 0000 0000 0000 0000 0001
  – slt $t0, $s0, $s1     # signed
      • –1 < +1 ⇒ $t0 = 1
  – sltu $t0, $s0, $s1    # unsigned
      • +4,294,967,295 > +1 ⇒ $t0 = 0
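The same comparison can be tried in C, where the operand type selects signed or unsigned comparison much as the choice between slt and sltu does. The function names below are hypothetical stand-ins for the two instructions.

```c
#include <stdint.h>

/* slt: compares the operands as signed 2s-complement numbers. */
uint32_t set_less_than(int32_t rs, int32_t rt)
{
    return (rs < rt) ? 1u : 0u;
}

/* sltu: compares the same bit patterns as unsigned numbers. */
uint32_t set_less_than_unsigned(uint32_t rs, uint32_t rt)
{
    return (rs < rt) ? 1u : 0u;
}
```

The all-ones pattern is –1 when read as signed, so `set_less_than(-1, 1)` returns 1, while as an unsigned value it is 4,294,967,295, so `set_less_than_unsigned(0xFFFFFFFF, 1)` returns 0.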
MIPS Operands
• 32 registers
  – Example: $s0, $s1, …, $s7, $t0, $t1, …, $t7, $zero
  – Comments: Fast location for data. In MIPS, data must be in registers to perform arithmetic. Registers $s0-$s7 map to 16-23 and $t0-$t7 map to 8-15. MIPS register $zero always equals 0.
• 2^30 memory words
  – Example: Memory[0], Memory[4], …, Memory[4294967292]
  – Comments: Accessed only by data transfer instructions in MIPS. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers.
Instructions
    Category               Instruction           Example               Meaning                                 Comments
    Arithmetic             add                   add $s1, $s2, $s3     $s1 = $s2 + $s3                         Three operands: data in registers
                           subtract              sub $s1, $s2, $s3     $s1 = $s2 - $s3                         Three operands: data in registers
    Data transfer          load word             lw $s1, 100($s2)      $s1 = Memory[$s2 + 100]                 Data from memory to register
                           store word            sw $s1, 100($s2)      Memory[$s2 + 100] = $s1                 Data from register to memory
    Conditional branch     branch on equal       beq $s1, $s2, L       if ($s1 == $s2) go to L                 Equal test and branch
                           branch on not equal   bne $s1, $s2, L       if ($s1 != $s2) go to L                 Not equal test and branch
                           set on less than      slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0    Compare less than, used with beq, bne
    Unconditional branch   jump                  j 2500                go to 10000                             Jump to target address
                           jump register         jr $t1                go to $t1                               For switch statements
    Name         Format   Example (field values)                           Comments
    add          R        0      18     19     17     0      32            add $s1, $s2, $s3
    sub          R        0      18     19     17     0      34            sub $s1, $s2, $s3
    lw           I        35     18     17     100                         lw $s1, 100($s2)
    sw           I        43     18     17     100                         sw $s1, 100($s2)
    beq          I        4      17     18     25                          beq $s1, $s2, 100
    bne          I        5      17     18     25                          bne $s1, $s2, 100
    slt          R        0      18     19     17     0      42            slt $s1, $s2, $s3
    j            J        2      2500                                      j 10000
    jr           R        0      9      0      0      0      8             jr $t1
    Field size            6 bits 5 bits 5 bits 5 bits 5 bits 6 bits        All MIPS instructions 32 bits
    R-format     R        op     rs     rt     rd     shamt  funct         Arithmetic instruction format
    I-format     I        op     rs     rt     address                     Data transfer, branch format
Exercise
• The following C code selects among four alternatives depending on whether the value of k is 0, 1, 2, or 3:
    switch ( k ) {
        case 0: f = i + h; break;
        case 1: f = g + h; break;
        case 2: f = g - h; break;
        case 3: f = i - j; break;
    }
  Assume that the six variables f through k correspond to registers $s0 through $s5. What is the corresponding MIPS code?
• Consider using the instruction la $t0, Label (load address), which obtains the address of a label used in the program.
Homework
1. Write the MIPS code for the assignment: x[10] = x[11] + c;
   • The start of array x is in $s0 and c is in $t0.
2. Write the machine code generated for the previous exercise.
3. Using MIPS assembly, give the instruction sequence that examines registers $s0, $s1, and $s2 and leaves the smallest value in $s3. (Registers $s0, $s1, and $s2 must keep their values.)
4. Write the machine code generated for exercise 3.
5. The following C code accumulates the values of array A in variable x:
    for ( x = 0, i = 0; i < n; i++ )
        x = x + A[i];
   What is the MIPS code for this code?
   • Assume that the start of array A is in register $s1, that register $s2 holds the value of n, that variable x is associated with $s3, and use $t0 for variable i.
6. Convert the following assignment to MIPS code:
    c = ( a > b ) ? a : b;
   Associate a, b, and c with $s0, $s1, and $s2, respectively.
Procedure Calling
• Steps required
  1. Place parameters in registers
  2. Transfer control to procedure
  3. Acquire storage for procedure
  4. Perform procedure’s operations
  5. Place result in register for caller
  6. Return to place of call
Register Usage
Register 1, called $at, is reserved for the assembler, and registers 26–27, called $k0–$k1, are reserved for the operating system.
Procedure Call Instructions
• Procedure call: jump and link
    jal ProcedureLabel
  – Address of following instruction put in $ra
  – Jumps to target address
• Procedure return: jump register
    jr $ra
  – Copies $ra to program counter
  – Can also be used for computed jumps
      • e.g., for case/switch statements
Leaf Procedure Example
• C code:
    int leaf_example (int g, int h, int i, int j)
    {
        int f;
        f = (g + h) - (i + j);
        return f;
    }
  – Arguments g, …, j in $a0, …, $a3
  – f in $s0 (hence, need to save $s0 on stack)
  – Result in $v0
  – The stack grows down (subtract to gain space)
Non-Leaf Procedures
• Procedures that call other procedures
• For nested call, caller needs to save on the stack:
  – Its return address
  – Any arguments and temporaries needed after the call
• Restore from the stack after the call
Non-Leaf Procedure Example
• C code:
    int fact (int n)
    {
        if (n < 1)
            return 1;
        else
            return n * fact(n - 1);
    }
  – Argument n in $a0
  – Result in $v0
• MIPS code?
Non-Leaf Procedure Example
• MIPS code:
    fact:
        addi $sp, $sp, -8      # adjust stack for 2 items
        sw   $ra, 4($sp)       # save return address
        sw   $a0, 0($sp)       # save argument
        slti $t0, $a0, 1       # test for n < 1
        beq  $t0, $zero, L1
        addi $v0, $zero, 1     # if so, result is 1
        addi $sp, $sp, 8       #   pop 2 items from stack
        jr   $ra               #   and return
    L1: addi $a0, $a0, -1      # else decrement n
        jal  fact              # recursive call
        lw   $a0, 0($sp)       # restore original n
        lw   $ra, 4($sp)       #   and return address
        addi $sp, $sp, 8       # pop 2 items from stack
        mul  $v0, $a0, $v0     # multiply to get result
        jr   $ra               # and return
Local Data on the Stack
• Local data allocated by callee
  – e.g., C automatic variables
• Procedure frame (activation record)
  – Used by some compilers to manage stack storage
Memory Layout
• Text: program code
• Static data: global variables
  – e.g., static variables in C, constant arrays and strings
  – $gp initialized to address allowing ±offsets into this segment
• Dynamic data: heap
  – E.g., malloc in C, new in Java
• Stack: automatic storage
Character Data
• Byte-encoded character sets
  – ASCII: 128 characters
      • 95 graphic, 33 control
  – Latin-1: 256 characters
      • ASCII, +96 more graphic characters
• Unicode: 32-bit character set
  – Used in Java, C++ wide characters, …
  – Most of the world’s alphabets, plus symbols
  – UTF-8, UTF-16: variable-length encodings
Byte/Halfword Operations
• Could use bitwise operations
• MIPS byte/halfword load/store
  – String processing is a common case
    lb  rt, offset(rs)     lh  rt, offset(rs)
  – Sign extend to 32 bits in rt
    lbu rt, offset(rs)     lhu rt, offset(rs)
  – Zero extend to 32 bits in rt
    sb  rt, offset(rs)     sh  rt, offset(rs)
  – Store just rightmost byte/halfword
String Copy Example
• C code (naïve):
  – Null-terminated string
    void strcpy (char x[], char y[])
    {
        int i;
        i = 0;
        while ((x[i] = y[i]) != '\0')
            i += 1;
    }
  – Addresses of x, y in $a0, $a1
  – i in $s0
• MIPS code?
String Copy Example
• MIPS code:
    strcpy:
        add  $t0, $zero, $zero   # i = 0
    L1: add  $t1, $t0, $a1       # addr of y[i] in $t1
        lbu  $t2, 0($t1)         # $t2 = y[i]
        add  $t3, $t0, $a0       # addr of x[i] in $t3
        sb   $t2, 0($t3)         # x[i] = y[i]
        beq  $t2, $zero, L2      # exit loop if y[i] == 0
        addi $t0, $t0, 1         # i = i + 1
        j    L1                  # next iteration of loop
    L2: jr   $ra                 # and return
32-bit Constants
• Most constants are small
  – 16-bit immediate is sufficient
• For the occasional 32-bit constant
    lui rt, constant
  – Copies 16-bit constant to left 16 bits of rt
  – Clears right 16 bits of rt to 0
    lui $s0, 61            # $s0 = 0000 0000 0011 1101 0000 0000 0000 0000
    ori $s0, $s0, 2304     # $s0 = 0000 0000 0011 1101 0000 1001 0000 0000
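The lui/ori pair can be modeled in C to see which 32-bit constant it builds. The helper names `load_upper_immediate` and `or_immediate` are hypothetical; they mirror the register effects described above.

```c
#include <stdint.h>

/* lui: place the 16-bit constant in the upper half, clear the lower half. */
uint32_t load_upper_immediate(uint16_t imm)
{
    return (uint32_t)imm << 16;
}

/* ori: OR a zero-extended 16-bit constant into the register value. */
uint32_t or_immediate(uint32_t rs, uint16_t imm)
{
    return rs | imm;
}
```

Loading 61 into the upper half gives 0x003D0000, and OR-ing in 2304 (0x0900) gives 0x003D0900, which is 61 × 65,536 + 2,304 = 4,000,000.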
Branch Addressing
• Branch instructions specify
  – Opcode, two registers, target address
• Most branch targets are near branch
  – Forward or backward
    op       rs       rt       constant or address
    6 bits   5 bits   5 bits   16 bits
• PC-relative addressing
  – Target address = PC + offset × 4
  – PC already incremented by 4 by this time
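The PC-relative rule can be written as a one-line C function. This sketch takes the address of the branch instruction itself and adds 4 first, since the PC has already been incremented when the offset is applied; `branch_target` is a hypothetical name.

```c
#include <stdint.h>

/* Target = (address of branch + 4) + sign-extended offset * 4. */
uint32_t branch_target(uint32_t branch_address, int16_t offset)
{
    uint32_t pc = branch_address + 4u;              /* PC already incremented */
    return pc + (uint32_t)((int32_t)offset * 4);    /* word offset to bytes   */
}
```

A branch at address 80012 with offset 2 targets 80024, and a negative offset reaches backward: a branch at 80020 with offset –6 targets 80000.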
Jump Addressing
• Jump (j and jal) targets could be anywhere in text segment
  – Encode full address in instruction
    op       address
    6 bits   26 bits
• (Pseudo)Direct jump addressing
  – Target address = PC[31…28] : (address × 4)
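Pseudo-direct addressing concatenates the top 4 bits of the PC with the 26-bit field shifted left by 2. The C sketch below assumes the PC used is the already-incremented one (jump address + 4), consistent with the branch case; `jump_target` is a hypothetical name.

```c
#include <stdint.h>

/* Target = PC[31..28] : (26-bit address field * 4). */
uint32_t jump_target(uint32_t jump_address, uint32_t address_field)
{
    uint32_t pc = jump_address + 4u;
    return (pc & 0xF0000000u) | ((address_field & 0x03FFFFFFu) << 2);
}
```

A jump located at 80020 with address field 20000 lands at 80000. The upper 4 bits come from the PC, so a jump can only reach targets within the same 256 MB region.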
Target Addressing Example
• Loop code from earlier example
  – Assume Loop at location 80000
    Loop: sll  $t1, $s3, 2       80000    0    0    19   9    4    0
          add  $t1, $t1, $s6     80004    0    9    22   9    0    32
          lw   $t0, 0($t1)       80008    35   9    8    0
          bne  $t0, $s5, Exit    80012    5    8    21   2
          addi $s3, $s3, 1       80016    8    19   19   1
          j    Loop              80020    2    20000
    Exit: …                      80024
Branching Far Away
• If branch target is too far to encode with 16-bit offset, assembler rewrites the code
• Example
        beq $s0, $s1, L1
            ↓
        bne $s0, $s1, L2
        j   L1
    L2: …
Addressing Mode Summary
C Sort Example
• Illustrates use of assembly instructions for a C bubble sort function
• Swap procedure (leaf)
    void swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }
  – v in $a0, k in $a1, temp in $t0
The Procedure Swap
    swap: sll $t1, $a1, 2     # $t1 = k * 4
          add $t1, $a0, $t1   # $t1 = v + (k * 4)
                              #   (address of v[k])
          lw  $t0, 0($t1)     # $t0 (temp) = v[k]
          lw  $t2, 4($t1)     # $t2 = v[k+1]
          sw  $t2, 0($t1)     # v[k] = $t2 (v[k+1])
          sw  $t0, 4($t1)     # v[k+1] = $t0 (temp)
          jr  $ra             # return to calling routine
The Sort Procedure in C
• Non-leaf (calls swap)
    void sort (int v[], int n)
    {
        int i, j;
        for (i = 0; i < n; i += 1) {
            for (j = i - 1; j >= 0 && v[j] > v[j + 1]; j -= 1) {
                swap(v, j);
            }
        }
    }
  – v in $a0, n in $a1, i in $s0, j in $s1
    # Move params
             move $s2, $a0            # save $a0 into $s2
             move $s3, $a1            # save $a1 into $s3
    # Outer loop
             move $s0, $zero          # i = 0
    for1tst: slt  $t0, $s0, $s3       # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
             beq  $t0, $zero, exit1   # go to exit1 if $s0 ≥ $s3 (i ≥ n)
    # Inner loop
             addi $s1, $s0, -1        # j = i - 1
    for2tst: slti $t0, $s1, 0         # $t0 = 1 if $s1 < 0 (j < 0)
             bne  $t0, $zero, exit2   # go to exit2 if $s1 < 0 (j < 0)
             sll  $t1, $s1, 2         # $t1 = j * 4
             add  $t2, $s2, $t1       # $t2 = v + (j * 4)
             lw   $t3, 0($t2)         # $t3 = v[j]
             lw   $t4, 4($t2)         # $t4 = v[j + 1]
             slt  $t0, $t4, $t3       # $t0 = 0 if $t4 ≥ $t3
             beq  $t0, $zero, exit2   # go to exit2 if $t4 ≥ $t3
    # Pass params & call
             move $a0, $s2            # 1st param of swap is v (old $a0)
             move $a1, $s1            # 2nd param of swap is j
             jal  swap                # call swap procedure
    # Inner loop
             addi $s1, $s1, -1        # j -= 1
             j    for2tst             # jump to test of inner loop
    # Outer loop
    exit2:   addi $s0, $s0, 1         # i += 1
             j    for1tst             # jump to test of outer loop
The Full Procedure
    sort:   addi $sp, $sp, -20    # make room on stack for 5 registers
            sw   $ra, 16($sp)     # save $ra on stack
            sw   $s3, 12($sp)     # save $s3 on stack
            sw   $s2, 8($sp)      # save $s2 on stack
            sw   $s1, 4($sp)      # save $s1 on stack
            sw   $s0, 0($sp)      # save $s0 on stack
            …                     # procedure body
    exit1:  lw   $s0, 0($sp)      # restore $s0 from stack
            lw   $s1, 4($sp)      # restore $s1 from stack
            lw   $s2, 8($sp)      # restore $s2 from stack
            lw   $s3, 12($sp)     # restore $s3 from stack
            lw   $ra, 16($sp)     # restore $ra from stack
            addi $sp, $sp, 20     # restore stack pointer
            jr   $ra              # return to calling routine
Effect of Compiler Optimization
Compiled with gcc for Pentium 4 under Linux
Effect of Language and Algorithm
Lessons Learnt
• Instruction count and CPI are not good performance indicators in isolation
• Compiler optimizations are sensitive to the algorithm
• Java/JIT compiled code is significantly faster than JVM interpreted
  – Comparable to optimized C in some cases
• Nothing can fix a dumb algorithm!
Arrays vs. Pointers
• Array indexing involves
  – Multiplying index by element size
  – Adding to array base address
• Pointers correspond directly to memory addresses
  – Can avoid indexing complexity
Example: Clearing an Array
    void clear1(int array[], int size)
    {
        int i;
        for (i = 0; i < size; i += 1)
            array[i] = 0;
    }

    void clear2(int *array, int size)
    {
        int *p;
        for (p = &array[0]; p < &array[size]; p = p + 1)
            *p = 0;
    }

    Clear1:
           move $t0, $zero        # i = 0
    loop1: slt  $t3, $t0, $a1     # $t3 = (i < size)
           beq  $t3, $zero, exit
           sll  $t1, $t0, 2       # $t1 = i * 4
           add  $t2, $a0, $t1     # $t2 = &array[i]
           sw   $zero, 0($t2)     # array[i] = 0
           addi $t0, $t0, 1       # i = i + 1
           j    loop1
    exit:  jr   $ra

    Clear2:
           move $t0, $a0          # p = &array[0]
           sll  $t1, $a1, 2       # $t1 = size * 4
           add  $t2, $a0, $t1     # $t2 = &array[size]
    loop2: slt  $t3, $t0, $t2     # $t3 = (p < &array[size])
           beq  $t3, $zero, exit
           sw   $zero, 0($t0)     # Memory[p] = 0
           addi $t0, $t0, 4       # p = p + 4
           j    loop2
    exit:  jr   $ra
Comparison of Array vs. Ptr
• Multiply “strength reduced” to shift
• Array version requires shift to be inside loop
  – Part of index calculation for incremented i
  – c.f. incrementing pointer
• Compiler can achieve same effect as manual use of pointers
  – Induction variable elimination
  – Better to make program clearer and safer
Concluding Remarks
• Design principles
  1. Simplicity favors regularity
  2. Smaller is faster
  3. Make the common case fast
  4. Good design demands good compromises
• Layers of software/hardware
  – Compiler, assembler, hardware
• MIPS: typical of RISC ISAs
MIPS Operands
• 32 registers
  – Example: $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $gp, $fp, $sp, $ra, $at
  – Comments: Fast location for data. In MIPS, data must be in registers to perform arithmetic. MIPS register $zero always equals 0. Register $at is reserved for the assembler to handle large constants.
• 2^30 memory words
  – Example: Memory[0], Memory[4], …, Memory[4294967292]
  – Comments: Accessed only by data transfer instructions in MIPS. MIPS uses byte addresses, so sequential words differ by 4. Memory holds data structures, such as arrays, and spilled registers, such as those saved on procedure calls.
MIPS assembly language
    Category               Instruction                  Example               Meaning                                 Comments
    Arithmetic             add                          add $s1, $s2, $s3     $s1 = $s2 + $s3                         Three operands: data in registers
                           subtract                     sub $s1, $s2, $s3     $s1 = $s2 - $s3                         Three operands: data in registers
    Data transfer          load word                    lw $s1, 100($s2)      $s1 = Memory[$s2 + 100]                 Word from memory to register
                           store word                   sw $s1, 100($s2)      Memory[$s2 + 100] = $s1                 Word from register to memory
                           load byte                    lb $s1, 100($s2)      $s1 = Memory[$s2 + 100]                 Byte from memory to register
                           store byte                   sb $s1, 100($s2)      Memory[$s2 + 100] = $s1                 Byte from register to memory
                           load upper immediate         lui $s1, 100          $s1 = 100 << 16                         Loads constant in upper 16 bits
    Conditional branch     branch on equal              beq $s1, $s2, L       if ($s1 == $s2) go to L                 Equal test and branch
                           branch on not equal          bne $s1, $s2, L       if ($s1 != $s2) go to L                 Not equal test and branch
                           set on less than             slt $s1, $s2, $s3     if ($s2 < $s3) $s1 = 1; else $s1 = 0    Compare less than, used with beq, bne
                           set on less than immediate   slti $s1, $s2, 100    if ($s2 < 100) $s1 = 1; else $s1 = 0    Compare less than constant
    Unconditional branch   jump                         j 2500                go to 10000                             Jump to target address
                           jump register                jr $ra                go to $ra                               For switch, procedure return
                           jump and link                jal 2500              $ra = PC + 4; go to 10000               For procedure call
Homework
1. Write a C procedure that returns the largest value in an array of n elements.
2. Translate the result of the previous exercise into MIPS code, following the established conventions for associating registers with variables.
3. Write a procedure bfind, in MIPS assembly language, that receives as its argument a pointer to a NULL-terminated string (in $a0) and locates the first letter b in the string; the procedure must return the address of this first occurrence (in $v0). If there are no b's in the string, bfind must return a pointer to the null character at the end of the string. For example, if bfind receives a pointer to the string «embebido», it must return a pointer to the third character of the string.
4. Write a procedure bcount, in MIPS assembly language, that receives as its argument a pointer to a NULL-terminated string (in $a0) and returns the number of b's that appear in the string (in register $v0). The implementation of bcount must use the bfind function developed in the previous exercise.
5. Write a procedure in MIPS code that computes the n-th term of the Fibonacci series (F(n)), where:
    F(0) = 0
    F(1) = 1
    F(n) = F(n – 1) + F(n – 2) if n > 1
   Base it on the recursive procedure:
    int fib( int n ) {
        if ( n == 0 || n == 1 )
            return n;
        return fib( n - 1 ) + fib( n - 2 );
    }
SPIM
• A simulator for the MIPS instruction set.
  – Review the help document.
  – Work through the exercises shown in the help document.