Upload
vuphuc
View
214
Download
0
Embed Size (px)
Citation preview
Disclaimer
• Some of these slides are made by Hennesy and Pa@erson from Computer Architecture a Quan(a(ve approach and Computer Organiza(on and Design, The Hardware/ SoHware Interface.
WARNING!!
• You should review the mathema(cal binary, hexadecimal, decimal and octagonal. Also the transforma(on between these.
• This session is ONLY to understand machine arithme(c from architecture point of view.
Arithme(c
• Numbers and number representa(on • Addi(on and Subtrac(on • Mul(plica(on
• Division • Floa(ng point arithme(c
Numbers and their representa(on
• Representa(on – ASCII -‐ text characters
• Easy read and write of numbers • Complex arithme(c (character wise)
– Binary number • Natural form for computers • Requires formaYng rou(nes for I/O
• in MIPS (32-‐bit number): – Least significant bit is right (bit 0) – Most significant bit is leH (bit 31)
Number types
• Integer numbers, unsigned – Address calcula(ons – Numbers that can only be posi(ve
• Signed numbers – Posi(ve – Nega(ve
• Floa(ng point numbers – numeric calcula(ons – Different grades of precision
• Singe precision (IEEE) • Double precision (IEEE)
Number representa(ons
• Bits are just bits (no inherent meaning) – Conven(ons / contexts define rela(onship between bits and
numbers
• Binary numbers (base 2). – E.g. 1011
• Things may become complicated. – Numbers are finite (overflow) – Frac(on and real numbers – Nega(ve numbers
• How do we represent nega(ve numbers?
Binary and Bits
• Computer Words are composed of bits, thus, words can be represented as binary numbers.
0 0 0 0 numerical value 110 1 0 0 0 numerical value 210 0 1 0 0 numerical value 310 1 1 0 0 numerical value 410
8, 16, 32 bits Representa(on
The 710 number: • 8-bits: 0000 0111two
• 16-bits: 0000 0000 0000 0111two
• 32-bits: 0000 0000 0000 0000 0000 0000 0000 0111two
Possible Representa(ons
• Sign and Magnitude – 000 = 0, 001 = 1, 010 = 2, 011 = 3 – 100 = -‐0, 101 = -‐1, 110 = -‐2, 111 = -‐3
• One’s Complement – 000 = 0, 001 = 1, 010 = 2, 011 = 3 – 100 = -‐3, 101 = -‐2, 110 = -‐1, 111 = -‐0
• Two’s complement – 000 = 0, 001 = 1, 010 = 2, 011 = 3 – 100 = -‐4, 101 = -‐3, 110 = -‐2, 111 = -‐1 – How to evaluate?
2's complement
Only one representa(on for 0
One more nega(ve number than posi(ve number
0000
0111
0011
1011
1111 1110
1101
1100
1010
1001 1000
0110
0101
0100
0010
0001
+0 +1
+2
+3
+4
+5 +6
+7 -8 -7
-6
-5
-4
-3 -2
-1
0 100 = + 4
1 100 = - 4
+
-
Addi(on and Substrac(on
• Addi(on: Digits are added bit by bit from right to leH, with carries passed to the next digit to the leH, just as you would do by hand.
• Substrac(on: Uses Addi(on. The appropiate operand is simple negated before being added.
• Just like in grade school (carry/borrow 1s) 0111 0111 0110 + 0110 - 0110 - 0101
• Two's complement opera(ons easy
– subtrac(on using addi(on of negated numbers 0111 + 1010
• Overflow (result too large for finite computer word):
– e.g., adding two n-‐bit numbers does not yield an n-‐bit number 0111 + 0001 note that overflow term is somewhat misleading, 1000 it does not mean a carry “overflowed”
Addi(on and Subtrac(on
Overflow
• Overflow occurs when the result from an opera(on cannot be represented with the available hardware.
Overflow
• The sum of two unsigned numbers can exceed any representa(on
• The opera(on on 2’s complement numbers can exceed any representa(on
•
1000 0001
1111 1110
0111 1111
-127
-2
+127
+
1111 1111
1111 1010
11111 1001
255
250
249
+
Overflow
• General overflow condi(ons of 2’s complement numbers
• No overflow when adding a posi(ve and a nega(ve number • No overflow when signs are the same for subtrac(on • Consider the opera(ons A + B, and A – B
– Can overflow occur if B is 0 ? – Can overflow occur if A is 0 ?
A + B A + B A - B A - B
A > 0 A < 0 A > 0 A < 0
B > 0 B < 0 B < 0 B > 0
< 0 > 0 < 0 > 0
Operation R esult Conditions
Overflow
• Hardware detec(on in the ALU • Genera(on of an excep(on (interrupt) • Interrupted address is saved for possible resump(on
• Jump to predefined address based on type of excep(on – Correct & return to program – Return to program with error code – Abort program
Overflow
• Signed integers cause overflow – add – add immediate (addi) – subtract (sub)
• Overflow in unsigned integers: not considered as serious in MIPS – add unsigned (addu) – add immediate unsigned (addiu) – Subtract unsigned (subu) – Handle with care!
Overflow
• Excep(on is called interrupt: an unscheduled event that disrupt program execu(on, used to detect overflow.
• Interrupt: An excep(on that comes from outside of the process.
Integer Addi(on • Example: 7 + 6
Overflow if result out of range Adding +ve and –ve operands, no overflow Adding two +ve operands
Overflow if result sign is 1
Adding two –ve operands Overflow if result sign is 0
Integer Subtrac(on
• Add nega(on of second operand • Example: 7 – 6 = 7 + (–6)
+7: 0000 0000 … 0000 0111 –6: 1111 1111 … 1111 1010 +1: 0000 0000 … 0000 0001
• Overflow if result out of range – Subtrac(ng two +ve or two –ve operands, no overflow – Subtrac(ng +ve from –ve operand
• Overflow if result sign is 0 – Subtrac(ng –ve from +ve operand
• Overflow if result sign is 1
Dealing with Overflow
• Some languages (e.g., C) ignore overflow – Use MIPS addu, addui, subu instruc(ons
• Other languages (e.g., Ada, Fortran) require raising an excep(on – Use MIPS add, addi, sub instruc(ons – On overflow, invoke excep(on handler
• Save PC in excep(on program counter (EPC) register • Jump to predefined handler address • mfc0 (move from coprocessor reg) instruc(on can retrieve EPC value, to return aHer correc(ve ac(on
Arithme(c for Mul(media
• Graphics and media processing operates on vectors of 8-‐bit and 16-‐bit data – Use 64-‐bit adder, with par((oned carry chain
• Operate on 8×8-‐bit, 4×16-‐bit, or 2×32-‐bit vectors – SIMD (single-‐instruc(on, mul(ple-‐data)
• Satura(ng opera(ons – On overflow, result is largest representable value
• c.f. 2s-‐complement modulo arithme(c
– E.g., clipping in audio, satura(on in video
• More complicated than addi(on – accomplished via shiHing and addi(on
• More (me and more space
• Let's look at 3 versions based on grade school algorithm
1000 (multiplicand) __x_1001 (multiplier)
• See three revisions of the implementa(on
Mul(plica(on
Mul(plica(on
• Basic func(ons: – Mul(plica(on of the mul(plicand with the last digit of the mul(plier, with the next digit, ...
– Adding up the par(al products • Further simplifica(on in binary system:
– ShiH – Either add mul(plicand or add zero.
Mul(plica(on
• Binary mul(plica(on Multiplicand * Multiplier 1000 * 1001 1000 0000 0000 1000 . Product 1001000
• Look at current bit posi(on of the mul(plier – If mul(plier is 1
• then add mul(plicand • Else add 0
– shiH mul(plicand to the leH by 1 bit
Mul(plica(on • Start with long-‐mul(plica(on approach
1000 × 1001 1000 0000 0000 1000 1001000
Length of product is the sum of operand lengths
mul(plicand
mul(plier
product
Mul(plica(on: Implementa(on
• Check mul(plier to determine whether the mul(plicand is added to the product register
• LeH shiH the mul(plicand register 1 bit • Right shiH the mul(plier register 1 bit • Very big too slow!
• Example 2x3
• Product Mul(plier Mul(plicand 0000 0000 0011 0000 0010
• 0000 0010 0001 0000 0100 • 0000 0110 0000 0000 1000
• 0000 0110
Second Version
• If mul(plier0 = 1, add mul(plicand to the leH 32 bit of the product register
• ShiH the product register right 1 bit
• ShiH the mul(plier register right 1 bit
• Example
Op(mized Mul(plier • Perform steps in parallel: add/shiH
One cycle per par(al-‐product addi(on That’s ok, if frequency of mul(plica(ons is low
Final Version
• Further op*miza*on • At the ini*al state the product
register contains only '0'
• The lower 32 bits are simply shi>ed out
• Idea: – use these 32 bits for the
mul*plier and check the lowest bit of the product register
– if add & shi> or shi> only
Final Version
• If product0 = 1, add mul(plicand to the leH 32 bit of the product register
• ShiH the product register right 1 bit
Faster Mul(plier • Uses mul(ple adders
– Cost/performance tradeoff
Can be pipelined Several mul(plica(on performed in parallel
Signed Mul(plica(on
• Basic approach: – Store the signs of the operands – Convert signed numbers to unsigned numbers (most significant bit (MSB) = 0)
– Perform mul(plica(on
– If sign bits of operands as equal • sign bit = 0, else • Negate the product
MIPS Mul(plica(on
• Two 32-‐bit registers for product – HI: most-‐significant 32 bits – LO: least-‐significant 32-‐bits
• Instruc(ons – mult rs, rt / multu rs, rt
• 64-‐bit product in HI/LO – mfhi rd / mflo rd
• Move from HI/LO to rd • Can test HI value to see if product overflows 32 bits
– mul rd, rs, rt • Least-‐significant 32 bits of product –> rd
Division • Check for 0 divisor • Long division approach
– If divisor ≤ dividend bits • 1 bit in quo(ent, subtract
– Otherwise • 0 bit in quo(ent, bring down next dividend bit
• Restoring division – Do the subtract, and if remainder goes <
0, add divisor back
• Signed division – Divide using absolute values – Adjust sign of quo(ent and remainder as
required
1001 1000 1001010 -1000 10 101 1010 -1000 10
n-‐bit operands yield n-‐bit quo(ent and remainder
quo(ent
dividend
remainder
divisor
Division Version 1
• 64-‐bit Divisor reg, 64-‐bit ALU, 64-‐bit Remainder reg, 32-‐bit Quo(ent reg
Remainder
Quotient
Divisor
64-bit ALU
Shift Right
Shift Left
Write Control
32 bits
64 bits
64 bits
Division Version 1
• Each step: – Subtract divisor
• Depending on Result • Leave or Restore
– Depending on Result • Write '1' or
• Write '0‘ to quo(ent
CS 331 Xiaoyu Zhang, CSUSM
Example
• 7/2 or 0000 0111 / 0000 0010 iteration step Quotient Divisor Remainder
0 Initial values 0000 0010 0000 0000 0111
1 1. rem = rem - div 0000 0010 0000 1110 0111 2b: restore, sll Q, Q0= 0 0000 0010 0000 0000 0111 3: shift div right 0000 0001 0000 0000 0111
2 1. rem = rem - div 0000 0001 0000 1111 0111 2b: restore, sll Q, Q0= 0 0000 0001 0000 0000 0111 3: shift div right 0000 0000 1000 0000 0111
3 1. rem = rem - div 0000 0000 1000 1111 1111 2b: restore, sll Q, Q0= 0 0000 0000 1000 0000 0111 3: shift div right 0000 0000 0100 0000 0111
4 1. rem = rem - div 0000 0000 0100 0000 0011 2a. leave, sll Q, Q0=1 0001 0000 0100 0000 0011 3: shift div right 0001 0000 0010 0000 0011
5 1. rem = rem - div 0001 0000 0010 0000 0001 2a. leave, sll Q, Q0=1 0011 0000 0010 0000 0001 3: shift div right 0011 0000 0001 0000 0001
Op(mized Divider
• One cycle per par(al-‐remainder subtrac(on
• Looks a lot like a mul(plier! – Same hardware can be used for both
Division V 2
• Similar to mul(plica(on, the division hardware can be improved. – Use 32-‐bit ALU instead of 64-‐bit, shiH remainder to leH – switch order to shiH first and then subtract, can save 1 itera(on Reduc(on of Divisor
and ALU width by half
Improved Division
• Similar to mul(plica(on, the division hardware can be improved. – Remainder register keeps quo(ent. No quo(ent register required
Improved Divide Algorithm
– Start by shiHing the Remainder leH. – Only one shiH per loop – The consequence of combining the
two registers together and the new order of the opera(ons in the loop is that the remainder will shiHed leH one (me too many.
– Thus the final correc(on step must shiH back only the remainder in the leH half of the register
Example
• Well known numbers: 0000 0111 / 0010 iteration step Divisor Remainder
0 Initial values 0010 0000 0111 Shift rem left 0010 0000 1110
1 2: rem = rem - div 0010 1110 1110 3b: restore, sll R, R0 = 0 0010 0001 1100
2 2. rem = rem - div 0010 1111 1100 3b: restore, sll R, R0= 0 0010 0011 1000
3 2. rem = rem - div 0010 0001 1000 3a: leave, sll R, R0= 1 0010 0011 0001
4 2. rem = rem - div 0010 0001 0001 3a. leave, sll R, R0=1 0010 0010 0011 Shift left half of rem right 1 0010 0001 0011
Faster Division
• Can’t use parallel hardware as in mul(plier – Subtrac(on is condi(onal on sign of remainder
• Faster dividers (e.g. SRT devision) generate mul(ple quo(ent bits per step – S(ll require mul(ple steps
Signed division
• Keep the signs in mind for Dividend and Remainder – + 7 / + 2 = + 3 Remainder = +1
• 7 = 3 x 2 + (+1) = 6 + 1 – -‐ 7 / + 2 = -‐ 3 Remainder = -‐1
• -‐7 = -‐3 x 2 + (-‐1) = -‐ 6 -‐ 1 – + 7 / -‐ 2 = -‐ 3 Remainder = +1
– -‐ 7 / -‐ 2 = + 3 Remainder = -‐1
• One 64 bit register : Hi & Lo – Hi: Remainder, Lo: Quo(ent
• Divide by 0 / overflow : Check by soHware
MIPS Division
• Use HI/LO registers for result – HI: 32-‐bit remainder – LO: 32-‐bit quo(ent
• Instruc(ons – div rs, rt / divu rs, rt – No overflow or divide-‐by-‐0 checking
• SoHware must perform checks if required
– Use mfhi, mflo to access result – Pseudo code: div $t2, $t1, $t0
Floa(ng Point (a brief look)
• We need a way to represent – numbers with frac(ons, e.g., 3.1416
– very small numbers, e.g., .000000001
– very large numbers, e.g., 3.15576 × 109
• Representa(on:
– sign, exponent, significand: (–1)sign × significand × 2exponent
– more bits for significand gives more accuracy
– more bits for exponent increases range
• IEEE 754 floa(ng point standard:
– single precision: 8 bit exponent, 23 bit significand – double precision: 11 bit exponent, 52 bit significand
Floa(ng Point
• Representa(on for non-‐integral numbers – Including very small and very large numbers
• Like scien(fic nota(on – –2.34 × 1056 – +0.002 × 10–4 – +987.02 × 109
• In binary – ±1.xxxxxxx2 × 2yyyy
• Types float and double in C
normalized
not normalized
Floa(ng Point Standard
• Defined by IEEE Std 754-‐1985 • Developed in response to divergence of representa(ons – Portability issues for scien(fic code
• Now almost universally adopted
• Two representa(ons – Single precision (32-‐bit) – Double precision (64-‐bit)
Floa(ng point numbers
• Form – Arbitrary 363.4 • 1034 – Normalized 3.634 • 1036
• Binary nota(on – Normalized 1.xxx • 2yy
• Standard format IEEE 754 – Single precision 8 bit exp, 23 bit significand
• 2 • 10 -‐38 ... 2 • 1038 – Double precision 11 bit exp, 52 bit significand
• 2 • 10 -‐308 ... 2 • 10308
• Both formats are supported by MIPS
IEEE Floa(ng-‐Point Format
• S: sign bit (0 ⇒ non-‐nega(ve, 1 ⇒ nega(ve) • Normalize significand: 1.0 ≤ |significand| < 2.0
– Always has a leading pre-‐binary-‐point 1 bit, so no need to represent it explicitly (hidden bit)
– Significand is Frac(on with the “1.” restored • Exponent: excess representa(on: actual exponent + Bias
– Ensures exponent is unsigned – Single: Bias = 127; Double: Bias = 1203
S Exponent Frac(on
single: 8 bits double: 11 bits
single: 23 bits double: 52 bits
€
x = (−1)S × (1+Fraction) × 2(Exponent−Bias)
IEEE 754 floa(ng-‐point standard
• Leading “1” bit of significand is implicit – Saves one bit – Special case: 00…000 represents 0, no leading 1 is added.
• Exponent is “biased” to make sor(ng easier – all 0s is smallest exponent all 1s is largest – bias of 127 for single precision and 1023 for double precision
• summary: – (–1)sign ⋅ (1+frac(on) × 2exponent – bias
Example
• Show the binary representa(on of -‐0.75 in IEEE single precision format – Decimal representa(on: -‐0.75 = -‐ 3/4 – Binary representa(on: -‐ 0.11 = -‐ 1.1 • 2-‐1
• Floa(ng point – (-‐1)sign • (1 + frac(on) • 2exponent -‐ bias
• Sign bit = 1 – Significand = 1 + .1000 .... – Exponent = (-‐1 + 127) = 126 – 1 01111110 100 0000 0000 0000 0000 0000
• How to represent 5?
Floa(ng point Example
• What is the value of following IEEE binary floa(ng point number?
• 0 1000 0000 011 0000 0000 0000 0000 0000
• Which of the following floa(ng point numbers is larger?
• A = 1 00111111 11110000000000000000000 • B = 1 10001110 00010000000000000000101
Limita(ons
• Overflow: – The number is too big to be represented
• Underflow: – The number is too small to be represented
• Gradual underflow: – If the number gets smaller, the number of significant digits gets less – 1.234 • 10Emin
– 1.234 • 10Emin / 10 = 0.123 • 10Emin
– 0.123 • 10Emin / 10 = 0.012 • 10Emin
– 0.012 • 10Emin / 10 = 0.001 • 10Emin
– 0.001 • 10Emin / 10 = 0.000 • 10Emin
Floa(ng Point Complexi(es
• Opera(ons are somewhat more complicated • In addi(on to overflow we can have “underflow”
• Accuracy can be a big problem
– IEEE 754 keeps two extra bits, guard and round – four rounding modes
– posi(ve divided by zero yields “infinity” – zero divide by zero yields “not a number”
– other complexi(es
• Implemen(ng the standard can be tricky • Not using the standard can be even worse
– see text for descrip(on of 80x86 and Pen(um bug!
Single-‐Precision Range
• Exponents 00000000 and 11111111 reserved • Smallest value
– Exponent: 00000001 ⇒ actual exponent = 1 – 127 = –126
– Frac(on: 000…00 ⇒ significand = 1.0 – ±1.0 × 2–126 ≈ ±1.2 × 10–38
• Largest value – exponent: 11111110 ⇒ actual exponent = 254 – 127 = +127
– Frac(on: 111…11 ⇒ significand ≈ 2.0 – ±2.0 × 2+127 ≈ ±3.4 × 10+38
Double-‐Precision Range
• Exponents 0000…00 and 1111…11 reserved • Smallest value
– Exponent: 00000000001 ⇒ actual exponent = 1 – 1023 = –1022
– Frac(on: 000…00 ⇒ significand = 1.0 – ±1.0 × 2–1022 ≈ ±2.2 × 10–308
• Largest value – Exponent: 11111111110 ⇒ actual exponent = 2046 – 1023 = +1023
– Frac(on: 111…11 ⇒ significand ≈ 2.0 – ±2.0 × 2+1023 ≈ ±1.8 × 10+308
Floa(ng-‐Point Precision
• Rela(ve precision – all frac(on bits are significant – Single: approx 2–23
• Equivalent to 23 × log102 ≈ 23 × 0.3 ≈ 6 decimal digits of precision
– Double: approx 2–52 • Equivalent to 52 × log102 ≈ 52 × 0.3 ≈ 16 decimal digits of precision
Floa(ng-‐Point Example
• Represent –0.75 – –0.75 = (–1)1 × 1.12 × 2–1 – S = 1 – Frac(on = 1000…002 – Exponent = –1 + Bias
• Single: –1 + 127 = 126 = 011111102 • Double: –1 + 1023 = 1022 = 011111111102
• Single: 1011111101000…00 • Double: 1011111111101000…00
Floa(ng-‐Point Example
• What number is represented by the single-‐precision float
11000000101000…00 – S = 1 – Frac(on = 01000…002 – Fxponent = 100000012 = 129
• x = (–1)1 × (1 + 012) × 2(129 – 127) = (–1) × 1.25 × 22 = –5.0
Denormal Numbers
• Exponent = 000...0 ⇒ hidden bit is 0
Smaller than normal numbers allow for gradual underflow, with diminishing precision
Denormal with frac(on = 000...0
Two representa(ons of 0.0!
Infini(es and NaNs
• Exponent = 111...1, Frac(on = 000...0 – ±Infinity – Can be used in subsequent calcula(ons, avoiding need for overflow check
• Exponent = 111...1, Frac(on ≠ 000...0 – Not-‐a-‐Number (NaN) – Indicates illegal or undefined result
• e.g., 0.0 / 0.0 – Can be used in subsequent calcula(ons
Floa(ng point addi(on
• Alignment • Addi(on of significands • Normaliza(on of the result
• Rounding • Example in decimal system precision 4 digits
– What is 9.999 • 101 + 1.610 • 10-‐1 ?
Floa(ng-‐Point Addi(on
• Now consider a 4-‐digit binary example – 1.0002 × 2–1 + –1.1102 × 2–2 (0.5 + –0.4375)
• 1. Align binary points – ShiH number with smaller exponent – 1.0002 × 2–1 + –0.1112 × 2–1
• 2. Add significands – 1.0002 × 2–1 + –0.1112 × 2–1 = 0.0012 × 2–1
• 3. Normalize result & check for over/underflow – 1.0002 × 2–4, with no over/underflow
• 4. Round and renormalize if necessary – 1.0002 × 2–4 (no change) = 0.0625
Example
• Aligning the two numbers 9.999 • 101 and 1.610 • 10-‐1 – 9.999 • 101
– 0.01610 • 101 Trunca(on – 0.016 • 101
• Addi(on – 9.999 • 101 – + 0.016 • 101 – 10.015 • 101
• Normaliza(on – 1.0015 • 102
• Rounding – 1.002 • 102
Example
• Add 0.5 and -‐0.4375 in binary • 0.5 = ½ = 1.0two x 2-‐1 • -‐0.4375 = -‐7/16 = -‐1.11two x 2-‐2 • Alignment: -‐1.11two x 2-‐2 = -‐0.111 x 2-‐1
• Add 1.0two x 2-‐1 + (-‐0.111 x 2-‐1) = 0.001x 2-‐1 • Normalize 1.000 x 2-‐4
• Round 1.000 x 2-‐4
FP Adder Hardware
• Much more complex than integer adder • Doing it in one clock cycle would take too long
– Much longer than integer opera(ons
– Slower clock would penalize all instruc(ons • FP adder usually takes several cycles
– Can be pipelined
Floa(ng-‐Point Mul(plica(on
• Consider a 4-‐digit decimal example – 1.110 × 1010 × 9.200 × 10–5
• 1. Add exponents – For biased exponents, subtract bias from sum – New exponent = 10 + –5 = 5
• 2. Mul(ply significands – 1.110 × 9.200 = 10.212 ⇒ 10.212 × 105
• 3. Normalize result & check for over/underflow – 1.0212 × 106
• 4. Round and renormalize if necessary – 1.021 × 106
• 5. Determine sign of result from signs of operands – +1.021 × 106
Floa(ng-‐Point Mul(plica(on
• Now consider a 4-‐digit binary example – 1.0002 × 2–1 × –1.1102 × 2–2 (0.5 × –0.4375)
• 1. Add exponents – Unbiased: –1 + –2 = –3 – Biased: (–1 + 127) + (–2 + 127) = –3 + 254 – 127 = –3 + 127
• 2. Mul(ply significands – 1.0002 × 1.1102 = 1.1102 ⇒ 1.1102 × 2–3
• 3. Normalize result & check for over/underflow – 1.1102 × 2–3 (no change) with no over/underflow
• 4. Round and renormalize if necessary – 1.1102 × 2–3 (no change)
• 5. Determine sign: +ve × –ve ⇒ –ve – –1.1102 × 2–3 = –0.21875
Floa(ng point mul(plica(on
• Composi(on of number from different parts -> separate handling
(s1 • 2e1) • (s2 • 2e2) = (s1 • s2) • 2 e1+ e2
• Add exponents • Mul(ply the significands • Normalize • Over-‐ underflow • Rounding
• Sign bit
Example
• 3 x 5 • 3 = 11 = 1.1 x 21, 5 = 101 = 1.01 x 22 • Exponent 1+2 = 3 • Significant 1.1 x 1.01 = 1.111 • Result 1.111 x 23 = 1111 = 15
Floa(ng Point Mul(plica(on
• Example – 1 10000010 00000.... = -‐1 • 23 – 0 10000011 00000.... = 1 • 24
• Both significands are 1 -> the significant of product = 1 • Add the exponents, bias = 127
– 10000010 – 10000011 – -‐01111111 correc(on – 10000110 = 134 => 134 -‐ 127 = 7
• Result: – 1 10000110 00000…
FP Arithme(c Hardware
• FP mul(plier is of similar complexity to FP adder – But uses a mul(plier for significands instead of an adder
• FP arithme(c hardware usually does – Addi(on, subtrac(on, mul(plica(on, division, reciprocal, square-‐root
– FP ↔ integer conversion • Opera(ons usually takes several cycles
– Can be pipelined
Floa(ng point mul(ply/divide
Sign comparison and XOR
Exponent add or subtract
Man(ssa mul(ply Or divide
Reassembly
Postnormalizing shiHs
Rounding is Done only here
Exponent adjust
Full precision Is kept here
Floa(ng-‐Point Instruc(ons in MIPS
• Flo(ng-‐point instruc(ons: – addi(on: add.s, add.d – subtrac(on: sub.s, sub.d – mul(plica(on: mul.s, mul.d – division: div.s, div.d
• Floa(ng-‐point instruc(ons use separate floa(ng-‐point registers. MIPS has separate load and store for floa(ng-‐point registers.
FP Instruc(ons in MIPS
• Single-‐precision arithme(c – add.s, sub.s, mul.s, div.s
• e.g., add.s $f0, $f1, $f6 • Double-‐precision arithme(c
– add.d, sub.d, mul.d, div.d • e.g., mul.d $f4, $f4, $f6
• Single-‐ and double-‐precision comparison – c.xx.s, c.xx.d (xx is eq, lt, le, …) – Sets or clears FP condi(on-‐code bit
• e.g. c.lt.s $f3, $f4 • Branch on FP condi(on code true or false
– bc1t, bc1f • e.g., bc1t TargetLabel
FP Example: °F to °C
• C code: float f2c (float fahr) { return ((5.0/9.0)*(fahr - 32.0)); } – fahr in $f12, result in $f0, literals in global memory space
• Compiled MIPS code: f2c: lwc1 $f16, const5($gp) lwc2 $f18, const9($gp) div.s $f16, $f16, $f18 lwc1 $f18, const32($gp) sub.s $f18, $f12, $f18 mul.s $f0, $f16, $f18 jr $ra
FP Example: Array Mul(plica(on
• X = X + Y × Z – All 32 × 32 matrices, 64-‐bit double-‐precision elements
• C code: void mm (double x[][],
double y[][], double z[][]) { int i, j, k; for (i = 0; i! = 32; i = i + 1) for (j = 0; j! = 32; j = j + 1) for (k = 0; k! = 32; k = k + 1) x[i][j] = x[i][j] + y[i][k] * z[k][j]; } – Addresses of x, y, z in $a0, $a1, $a2, and i, j, k in $s0, $s1, $s2
FP Example: Array Mul(plica(on
MIPS code: li $t1, 32 # $t1 = 32 (row size/loop end)
li $s0, 0 # i = 0; initialize 1st for loop
L1: li $s1, 0 # j = 0; restart 2nd for loop
L2: li $s2, 0 # k = 0; restart 3rd for loop
sll $t2, $s0, 5 # $t2 = i * 32 (size of row of x)
addu $t2, $t2, $s1 # $t2 = i * size(row) + j
sll $t2, $t2, 3 # $t2 = byte offset of [i][j]
addu $t2, $a0, $t2 # $t2 = byte address of x[i][j]
l.d $f4, 0($t2) # $f4 = 8 bytes of x[i][j]
L3: sll $t0, $s2, 5 # $t0 = k * 32 (size of row of z)
addu $t0, $t0, $s1 # $t0 = k * size(row) + j
sll $t0, $t0, 3 # $t0 = byte offset of [k][j]
addu $t0, $a2, $t0 # $t0 = byte address of z[k][j]
l.d $f16, 0($t0) # $f16 = 8 bytes of z[k][j]
…
FP Example: Array Mul(plica(on …
sll $t0, $s0, 5 # $t0 = i*32 (size of row of y)
addu $t0, $t0, $s2 # $t0 = i*size(row) + k
sll $t0, $t0, 3 # $t0 = byte offset of [i][k]
addu $t0, $a1, $t0 # $t0 = byte address of y[i][k]
l.d $f18, 0($t0) # $f18 = 8 bytes of y[i][k]
mul.d $f16, $f18, $f16 # $f16 = y[i][k] * z[k][j]
add.d $f4, $f4, $f16 # f4=x[i][j] + y[i][k]*z[k][j]
addiu $s2, $s2, 1 # $k k + 1
bne $s2, $t1, L3 # if (k != 32) go to L3
s.d $f4, 0($t2) # x[i][j] = $f4
addiu $s1, $s1, 1 # $j = j + 1
bne $s1, $t1, L2 # if (j != 32) go to L2
addiu $s0, $s0, 1 # $i = i + 1
bne $s0, $t1, L1 # if (i != 32) go to L1
Accurate Arithme(c
• IEEE Std 754 specifies addi(onal rounding control – Extra bits of precision (guard, round, s(cky) – Choice of rounding modes – Allows programmer to fine-‐tune numerical behavior of a computa(on
• Not all FP units implement all op(ons – Most programming languages and FP libraries just use defaults
• Trade-‐off between hardware complexity, performance, and market requirements
Interpreta(on of Data
• Bits have no inherent meaning – Interpreta(on depends on the instruc(ons applied
• Computer representa(ons of numbers – Finite range and precision – Need to account for this in programs
Associa(vity • Parallel programs may interleave opera(ons in unexpected orders – Assump(ons of associa(vity may fail
Need to validate parallel programs under varying degrees of parallelism
x86 FP Architecture
• Originally based on 8087 FP coprocessor – 8 × 80-‐bit extended-‐precision registers – Used as a push-‐down stack – Registers indexed from TOS: ST(0), ST(1), …
• FP values are 32-‐bit or 64 in memory – Converted on load/store of memory operand – Integer operands can also be converted on load/store
• Very difficult to generate and op(mize code – Result: poor FP performance
x86 FP Instruc(ons
• Op(onal varia(ons – I: integer operand – P: pop operand from stack – R: reverse operand order – But not all combina(ons allowed
Data transfer Arithmetic Compare Transcendental FILD mem/ST(i)
FISTP mem/ST(i)
FLDPI
FLD1
FLDZ
FIADDP mem/ST(i)
FISUBRP mem/ST(i) FIMULP mem/ST(i) FIDIVRP mem/ST(i)
FSQRT
FABS
FRNDINT
FICOMP
FIUCOMP
FSTSW AX/mem
FPATAN
F2XMI
FCOS
FPTAN
FPREM
FPSIN
FYL2X
Streaming SIMD Extension 2 (SSE2)
• Adds 4 × 128-‐bit registers – Extended to 8 registers in AMD64/EM64T
• Can be used for mul(ple FP operands – 2 × 64-‐bit double precision – 4 × 32-‐bit double precision – Instruc(ons operate on them simultaneously
• Single-‐Instruc(on Mul(ple-‐Data
Right ShiH and Division
• LeH shiH by i places mul(plies an integer by 2i • Right shiH divides by 2i?
– Only for unsigned integers • For signed integers
– Arithme(c right shiH: replicate the sign bit – e.g., –5 / 4
• 111110112 >> 2 = 111111102 = –2 • Rounds toward –∞
– c.f. 111110112 >>> 2 = 001111102 = +62
Who Cares About FP Accuracy?
• Important for scien(fic code – But for everyday consumer use?
• “My bank balance is out by 0.0002¢!”
• The Intel Pen(um FDIV bug – The market expects accuracy
– See Colwell, The Pen*um Chronicles
Summary
• Computer arithme(c is constrained by limited precision • Bit pa@erns have no inherent meaning but standards do exist
– two’s complement – IEEE 754 floa(ng point
• Computer instruc(ons determine “meaning” of the bit pa@erns
• Performance and accuracy are important so there are many complexi(es in real machines (i.e., algorithms and implementa(on).
Concluding Remarks
• ISAs support arithme(c – Signed and unsigned integers – Floa(ng-‐point approxima(on to reals
• Bounded range and precision – Opera(ons can overflow and underflow
• MIPS ISA – Core instruc(ons: 54 most frequently used
• 100% of SPECINT, 97% of SPECFP – Other instruc(ons: less frequent