
Arithmetic For Computers

Computer Architect and Engineer View

Disclaimer

• Some of these slides were made by Hennessy and Patterson, from Computer Architecture: A Quantitative Approach and Computer Organization and Design: The Hardware/Software Interface.

WARNING!!

• You should review the binary, hexadecimal, decimal, and octal number systems, as well as the conversions between them.

• This session is ONLY meant to explain machine arithmetic from an architecture point of view.

Arithmetic

• Numbers and number representation
• Addition and subtraction
• Multiplication
• Division
• Floating-point arithmetic

Numbers and their representation

• Representation
  – ASCII: text characters
    • Easy reading and writing of numbers
    • Complex arithmetic (character-wise)
  – Binary number
    • Natural form for computers
    • Requires formatting routines for I/O
• In MIPS (32-bit number):
  – Least significant bit is on the right (bit 0)
  – Most significant bit is on the left (bit 31)

Number types

• Integer numbers, unsigned
  – Address calculations
  – Numbers that can only be positive
• Signed numbers
  – Positive
  – Negative
• Floating-point numbers
  – Numeric calculations
  – Different grades of precision
    • Single precision (IEEE)
    • Double precision (IEEE)

Number representations

• Bits are just bits (no inherent meaning)
  – Conventions / contexts define the relationship between bits and numbers
• Binary numbers (base 2)
  – E.g., 1011
• Things may become complicated
  – Numbers are finite (overflow)
  – Fractions and real numbers
  – Negative numbers
• How do we represent negative numbers?

Binary and Bits

• Computer words are composed of bits; thus, words can be represented as binary numbers.

    0 0 0 0   numerical value $1_{10}$
    1 0 0 0   numerical value $2_{10}$
    0 1 0 0   numerical value $3_{10}$
    1 1 0 0   numerical value $4_{10}$

8-, 16-, and 32-bit Representation

The number $7_{10}$:
• 8 bits: 0000 0111 (base 2)
• 16 bits: 0000 0000 0000 0111 (base 2)
• 32 bits: 0000 0000 0000 0000 0000 0000 0000 0111 (base 2)

Possible Representations

• Sign and magnitude
  – 000 = 0, 001 = 1, 010 = 2, 011 = 3
  – 100 = -0, 101 = -1, 110 = -2, 111 = -3
• One's complement
  – 000 = 0, 001 = 1, 010 = 2, 011 = 3
  – 100 = -3, 101 = -2, 110 = -1, 111 = -0
• Two's complement
  – 000 = 0, 001 = 1, 010 = 2, 011 = 3
  – 100 = -4, 101 = -3, 110 = -2, 111 = -1
  – How to evaluate?
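One way to approach the "how to evaluate?" question is to decode the same bit pattern under each of the three conventions listed above. The following is a minimal sketch; the function names are illustrative and not from the slides, and it assumes 1 <= bits <= 30 so every intermediate value fits in a 32-bit integer.

```c
#include <stdint.h>

/* Sign and magnitude: the top bit is the sign, the rest is the magnitude. */
int32_t sign_magnitude(uint32_t pattern, int bits) {
    uint32_t sign = (pattern >> (bits - 1)) & 1u;
    int32_t magnitude = (int32_t)(pattern & ((1u << (bits - 1)) - 1u));
    return sign ? -magnitude : magnitude;
}

/* One's complement: a negative value is the bitwise inverse of its magnitude. */
int32_t ones_complement(uint32_t pattern, int bits) {
    uint32_t mask = (1u << bits) - 1u;
    uint32_t sign = (pattern >> (bits - 1)) & 1u;
    return sign ? -(int32_t)(~pattern & mask) : (int32_t)(pattern & mask);
}

/* Two's complement: the top bit carries weight -2^(bits-1). */
int32_t twos_complement(uint32_t pattern, int bits) {
    uint32_t mask = (1u << bits) - 1u;
    pattern &= mask;
    uint32_t sign = (pattern >> (bits - 1)) & 1u;
    return (int32_t)pattern - (sign ? (int32_t)(1u << bits) : 0);
}
```

For the 3-bit pattern 100 (0x4), the three functions return 0 (that is, -0), -3, and -4 respectively, matching the lists above.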

2's complement

• Only one representation for 0
• One more negative number than positive numbers

[Figure: 4-bit two's complement number wheel. 0000 (+0) through 0111 (+7) on the positive side, 1000 (-8) through 1111 (-1) on the negative side. Examples: 0 100 = +4, 1 100 = -4.]

Addition and Subtraction

• Addition: digits are added bit by bit from right to left, with carries passed to the next digit to the left, just as you would do by hand.
• Subtraction: uses addition. The appropriate operand is simply negated before being added.

Addition and Subtraction

• Just like in grade school (carry/borrow 1s)

      0111      0111      0110
    + 0110    - 0110    - 0101

• Two's complement operations are easy
  – Subtraction using addition of negated numbers

      0111
    + 1010

• Overflow (result too large for finite computer word):
  – E.g., adding two n-bit numbers does not yield an n-bit number

      0111
    + 0001
      ----
      1000

  – Note that the term overflow is somewhat misleading: it does not mean a carry "overflowed"
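The idea of subtracting by adding the negated operand can be sketched on 4-bit values. This is an illustrative sketch only; `sub4` is a hypothetical helper name, and the 4-bit width is chosen to match the slide examples.

```c
#include <stdint.h>

/* 4-bit subtraction done as a + (~b + 1), keeping only the low 4 bits. */
uint8_t sub4(uint8_t a, uint8_t b) {
    uint8_t neg_b = (uint8_t)((~b + 1u) & 0xFu);  /* two's complement of b */
    return (uint8_t)((a + neg_b) & 0xFu);         /* carry out of bit 3 is dropped */
}
```

With this sketch, `sub4(0x7, 0x6)` computes 0111 + 1010 and keeps the low four bits, 0001. A result such as `sub4(0x5, 0x6)` comes back as 1111, which reads as -1 in two's complement.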

Overflow

• Overflow occurs when the result of an operation cannot be represented with the available hardware.

Overflow

• The sum of two unsigned numbers can exceed the representable range
• An operation on 2's complement numbers can also exceed the representable range
• 8-bit examples:

    2's complement                Unsigned
      1000 0001   (-127)            1111 1111   (255)
    + 1111 1110   (-2)            + 1111 1010   (250)
      ---------                     -----------
      0111 1111   (+127)          1 1111 1001   (249, with a carry out)

Overflow

• General overflow conditions for 2's complement numbers
• No overflow when adding a positive and a negative number
• No overflow when the signs are the same for subtraction
• Consider the operations A + B and A - B
  – Can overflow occur if B is 0?
  – Can overflow occur if A is 0?

    Operation   Operand A   Operand B   Result indicating overflow
    A + B       ≥ 0         ≥ 0         < 0
    A + B       < 0         < 0         ≥ 0
    A - B       ≥ 0         < 0         < 0
    A - B       < 0         ≥ 0         ≥ 0
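The sign conditions in the table above can be checked in software. The sketch below is one possible formulation, not the hardware's method; the function names are illustrative. It does the arithmetic on unsigned values so the wraparound is well defined in C, then compares sign bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* A + B overflows when both operands share a sign and the sum's sign differs. */
bool add_overflows(int32_t a, int32_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b;   /* wraps modulo 2^32 */
    bool a_neg = a < 0, b_neg = b < 0, s_neg = (sum >> 31) != 0;
    return (a_neg == b_neg) && (s_neg != a_neg);
}

/* A - B overflows when the operand signs differ and the result's sign differs from A's. */
bool sub_overflows(int32_t a, int32_t b) {
    uint32_t diff = (uint32_t)a - (uint32_t)b;  /* wraps modulo 2^32 */
    bool a_neg = a < 0, b_neg = b < 0, d_neg = (diff >> 31) != 0;
    return (a_neg != b_neg) && (d_neg != a_neg);
}
```

Under these definitions, adding or subtracting 0 as B never overflows, while `sub_overflows(0, INT32_MIN)` is true: with A equal to 0, A - B can still overflow when B is the most negative value.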

Overflow

• Hardware detection in the ALU
• Generation of an exception (interrupt)
• Interrupted address is saved for possible resumption
• Jump to a predefined address based on the type of exception
  – Correct and return to program
  – Return to program with error code
  – Abort program

Overflow

• Signed integer instructions raise an exception on overflow
  – add
  – add immediate (addi)
  – subtract (sub)
• Overflow in unsigned integers: not considered as serious in MIPS
  – add unsigned (addu)
  – add immediate unsigned (addiu)
  – subtract unsigned (subu)
  – Handle with care!

Overflow

• Exception (also called interrupt): an unscheduled event that disrupts program execution; used to detect overflow.
• Interrupt: an exception that comes from outside of the processor.

Integer Addition

• Example: 7 + 6
• Overflow if result out of range
  – Adding +ve and -ve operands: no overflow
  – Adding two +ve operands: overflow if result sign is 1
  – Adding two -ve operands: overflow if result sign is 0

Integer Subtraction

• Add negation of second operand
• Example: 7 - 6 = 7 + (-6)

    +7:  0000 0000 … 0000 0111
    -6:  1111 1111 … 1111 1010
    +1:  0000 0000 … 0000 0001

• Overflow if result out of range
  – Subtracting two +ve or two -ve operands: no overflow
  – Subtracting +ve from -ve operand: overflow if result sign is 0
  – Subtracting -ve from +ve operand: overflow if result sign is 1

Dealing with Overflow

• Some languages (e.g., C) ignore overflow
  – Use MIPS addu, addiu, subu instructions
• Other languages (e.g., Ada, Fortran) require raising an exception
  – Use MIPS add, addi, sub instructions
  – On overflow, invoke exception handler
    • Save PC in exception program counter (EPC) register
    • Jump to predefined handler address
    • mfc0 (move from coprocessor register) instruction can retrieve the EPC value, to return after corrective action

Arithmetic for Multimedia

• Graphics and media processing operates on vectors of 8-bit and 16-bit data
  – Use a 64-bit adder with a partitioned carry chain
    • Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
  – SIMD (single-instruction, multiple-data)
• Saturating operations
  – On overflow, the result is the largest representable value
    • c.f. 2's complement modulo arithmetic
  – E.g., clipping in audio, saturation in video
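The difference between saturating and modulo (wraparound) arithmetic can be shown on a single unsigned 8-bit lane. This is a scalar sketch with illustrative function names; real SIMD hardware applies the same rule to every lane of a vector at once.

```c
#include <stdint.h>

/* Saturating add: clamp to the largest representable value (255) on overflow. */
uint8_t add_sat_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Modulo add: keep only the low 8 bits, as ordinary unsigned arithmetic does. */
uint8_t add_wrap_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}
```

Adding 200 and 100 yields 255 with the saturating version and 44 with the wraparound version, which is why saturation is preferred for pixel or audio sample values.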

(First View of Flynn’s Taxonomy*)

* Proposed by M. Flynn in 1966

Multiplication

• More complicated than addition
  – Accomplished via shifting and addition
• More time and more space
• Let's look at 3 versions based on the grade-school algorithm

      1000   (multiplicand)
    x 1001   (multiplier)

• See three revisions of the implementation

Multiplication

• Basic functions:
  – Multiplication of the multiplicand with the last digit of the multiplier, with the next digit, ...
  – Adding up the partial products
• Further simplification in the binary system:
  – Shift
  – Either add the multiplicand or add zero

Multiplication

• Binary multiplication

    Multiplicand        1000
    Multiplier        * 1001
                        ----
                        1000
                       0000
                      0000
                     1000
                     -------
    Product          1001000

• Look at the current bit position of the multiplier
  – If the multiplier bit is 1
    • then add the multiplicand
    • else add 0
  – Shift the multiplicand to the left by 1 bit

Multiplication

• Start with the long-multiplication approach

         1000    multiplicand
       × 1001    multiplier
         ----
         1000
        0000
       0000
      1000
      -------
      1001000    product

• Length of product is the sum of operand lengths

Multiplication Version 1

• 32 bits: multiplier
• 64 bits: multiplicand, product, ALU

Multiplication: Implementation

• Check the multiplier to determine whether the multiplicand is added to the product register
• Left shift the multiplicand register 1 bit
• Right shift the multiplier register 1 bit
• Very big, too slow!
• Example: 2 x 3

    Product      Multiplier   Multiplicand
    0000 0000    0011         0000 0010
    0000 0010    0001         0000 0100
    0000 0110    0000         0000 1000
    0000 0110
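The three implementation steps above (test the multiplier bit, shift the multiplicand left, shift the multiplier right) can be sketched as a loop. This is a software model under the assumption of 32-bit operands and a 64-bit product, as in version 1; `multiply_v1` is an illustrative name, not from the slides.

```c
#include <stdint.h>

/* Version-1 shift-and-add multiplier: 64-bit multiplicand and product registers. */
uint64_t multiply_v1(uint32_t multiplicand, uint32_t multiplier) {
    uint64_t mcand = multiplicand;  /* operand starts in the right half */
    uint64_t product = 0;
    for (int step = 0; step < 32; step++) {
        if (multiplier & 1u)        /* test the lowest multiplier bit */
            product += mcand;       /* add the (shifted) multiplicand */
        mcand <<= 1;                /* shift multiplicand left 1 bit */
        multiplier >>= 1;           /* shift multiplier right 1 bit */
    }
    return product;
}
```

For the slide examples, `multiply_v1(2, 3)` gives 6 and `multiply_v1(8, 9)` gives 72, which is 1001000 in binary.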

Multiplication Hardware

[Figure: multiplication hardware. The product register is initially 0.]

Second Version

• If Multiplier0 = 1, add the multiplicand to the left 32 bits of the product register
• Shift the product register right 1 bit
• Shift the multiplier register right 1 bit
• Example

Optimized Multiplier

• Perform steps in parallel: add/shift
• One cycle per partial-product addition
  – That's OK if the frequency of multiplications is low

Revised 4-bit example

• 2 x 3 or 0010 x 0011

Final Version

• Further optimization
• In the initial state the product register contains only 0s
• The lower 32 bits are simply shifted out
• Idea:
  – Use these 32 bits for the multiplier and check the lowest bit of the product register
  – Depending on that bit: add and shift, or shift only

Final Version

• If Product0 = 1, add the multiplicand to the left 32 bits of the product register
• Shift the product register right 1 bit

Example

• 2 x 3 or 0010 x 0011
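The final version, where the multiplier initially occupies the right half of the product register, can be modeled as follows. This is a sketch under the assumption of 32-bit operands; `multiply_final` is an illustrative name, and the explicit `carry` variable stands in for the adder's carry-out bit that the hardware shifts into the top of the register.

```c
#include <stdint.h>

/* Final-version multiplier: one 64-bit register holds product (left) and multiplier (right). */
uint64_t multiply_final(uint32_t multiplicand, uint32_t multiplier) {
    uint64_t product = multiplier;               /* left half is initially 0 */
    for (int step = 0; step < 32; step++) {
        uint64_t carry = 0;
        if (product & 1u) {                      /* Product0 = 1: add to left half */
            uint64_t hi = (product >> 32) + multiplicand;  /* may need 33 bits */
            carry = hi >> 32;
            product = (hi << 32) | (product & 0xFFFFFFFFu);
        }
        product = (product >> 1) | (carry << 63); /* shift right, carry enters at the top */
    }
    return product;
}
```

After 32 steps the multiplier bits have all been shifted out and the register holds the full 64-bit product, so `multiply_final(2, 3)` returns 6.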

Faster Multiplier

• Uses multiple adders
  – Cost/performance tradeoff
• Can be pipelined
  – Several multiplications performed in parallel

Signed Multiplication

• Basic approach:
  – Store the signs of the operands
  – Convert signed numbers to unsigned numbers (most significant bit (MSB) = 0)
  – Perform the multiplication
  – If the sign bits of the operands are equal
    • sign bit = 0, else
    • negate the product

MIPS Multiplication

• Two 32-bit registers for product
  – HI: most-significant 32 bits
  – LO: least-significant 32 bits
• Instructions
  – mult rs, rt / multu rs, rt
    • 64-bit product in HI/LO
  – mfhi rd / mflo rd
    • Move from HI/LO to rd
    • Can test HI value to see if product overflows 32 bits
  – mul rd, rs, rt
    • Least-significant 32 bits of product –> rd

Division

• Check for 0 divisor
• Long division approach
  – If divisor ≤ dividend bits
    • 1 bit in quotient, subtract
  – Otherwise
    • 0 bit in quotient, bring down next dividend bit
• Restoring division
  – Do the subtract, and if remainder goes < 0, add divisor back
• Signed division
  – Divide using absolute values
  – Adjust sign of quotient and remainder as required
• n-bit operands yield n-bit quotient and remainder

                     1001     quotient
    divisor  1000 ) 1001010   dividend
                   -1000
                       10
                       101
                       1010
                      -1000
                         10   remainder

Division Version 1

• 64-bit Divisor register, 64-bit ALU, 64-bit Remainder register, 32-bit Quotient register

[Figure: division hardware, version 1. Divisor register (64 bits, shift right), 64-bit ALU, Remainder register (64 bits, write control), Quotient register (32 bits, shift left).]

Division Version 1

• Each step:
  – Subtract divisor
  – Depending on result
    • Leave or restore
  – Depending on result
    • Write '1' or
    • Write '0' to quotient

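Division Hardware

[Figure: division hardware. The Remainder register initially holds the dividend; the Divisor register initially holds the divisor in its left half.]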

CS 331 Xiaoyu Zhang, CSUSM

Example

• 7/2 or 0000 0111 / 0000 0010

    Iteration  Step                          Quotient  Divisor     Remainder
    0          Initial values                0000      0010 0000   0000 0111
    1          1: rem = rem - div            0000      0010 0000   1110 0111
               2b: restore, sll Q, Q0 = 0    0000      0010 0000   0000 0111
               3: shift div right            0000      0001 0000   0000 0111
    …
    4          1: rem = rem - div            0000      0000 0100   0000 0011
               2a: leave, sll Q, Q0 = 1      0001      0000 0100   0000 0011
               3: shift div right            0001      0000 0010   0000 0011
    5          1: rem = rem - div            0001      0000 0010   0000 0001
               2a: leave, sll Q, Q0 = 1      0011      0000 0010   0000 0001
               3: shift div right            0011      0000 0001   0000 0001
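The version-1 loop traced above can be modeled in software. This sketch assumes 32-bit operands (so 33 iterations, one more than the operand width, just as the 4-bit example needs 5) and a non-zero divisor that the caller has already checked. It compares before subtracting instead of subtracting and then restoring, which yields the same register contents; `divide_v1` and `divmod_v1_t` are illustrative names.

```c
#include <stdint.h>

typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} divmod_v1_t;

/* Version-1 restoring division: divisor starts in the left half of a 64-bit register. */
divmod_v1_t divide_v1(uint32_t dividend, uint32_t divisor) {
    uint64_t rem = dividend;                /* remainder register holds the dividend */
    uint64_t div = (uint64_t)divisor << 32; /* divisor in the left half */
    uint32_t quo = 0;
    for (int step = 0; step < 33; step++) {
        if (rem >= div) {                   /* subtraction would not go negative */
            rem -= div;
            quo = (quo << 1) | 1u;          /* shift in a 1 */
        } else {
            quo <<= 1;                      /* "restore" case: shift in a 0 */
        }
        div >>= 1;                          /* shift divisor right 1 bit */
    }
    divmod_v1_t result = { quo, (uint32_t)rem };
    return result;
}
```

For 7 / 2 this returns quotient 3 and remainder 1, the same final values as the table.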

Optimized Divider

• One cycle per partial-remainder subtraction
• Looks a lot like a multiplier!
  – Same hardware can be used for both

Division V 2

• Similar to multiplication, the division hardware can be improved.
  – Use a 32-bit ALU instead of a 64-bit one, and shift the remainder to the left
  – Switch the order to shift first and then subtract, which can save 1 iteration
• Result: reduction of Divisor and ALU width by half

Improved Division

• Similar to multiplication, the division hardware can be improved.
  – The Remainder register keeps the quotient. No Quotient register required

Improved Divide Algorithm

– Start by shifting the Remainder left.
– Only one shift per loop
– The consequence of combining the two registers and of the new order of operations in the loop is that the remainder will be shifted left one time too many.
– Thus the final correction step must shift back only the remainder in the left half of the register

Example

• Well-known numbers: 0000 0111 / 0010

    Iteration  Step                             Divisor  Remainder
    0          Initial values                   0010     0000 0111
               Shift rem left                   0010     0000 1110
    1          2: rem = rem - div               0010     1110 1110
               3b: restore, sll R, R0 = 0       0010     0001 1100
    2          2: rem = rem - div               0010     1111 1100
               3b: restore, sll R, R0 = 0       0010     0011 1000
    3          2: rem = rem - div               0010     0001 1000
               3a: leave, sll R, R0 = 1         0010     0011 0001
    4          2: rem = rem - div               0010     0001 0001
               3a: leave, sll R, R0 = 1         0010     0010 0011
               Shift left half of rem right 1   0010     0001 0011
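The improved algorithm traced above, with the quotient shifted into the right half of the remainder register, can be sketched as follows. Assumptions: 32-bit operands, a non-zero divisor checked by the caller, and the 64-bit register modeled as two variables (`hi` is kept wider than 32 bits so the extra left shift cannot lose a bit). The names are illustrative, and the code compares before subtracting rather than subtracting and restoring.

```c
#include <stdint.h>

typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} divmod_v2_t;

/* Improved division: remainder (left half) and quotient (right half) share one register. */
divmod_v2_t divide_improved(uint32_t dividend, uint32_t divisor) {
    uint64_t hi = 0;         /* left half of the Remainder register */
    uint32_t lo = dividend;  /* right half: dividend bits leave, quotient bits enter */

    /* Start by shifting the combined register left 1 bit. */
    hi = (hi << 1) | (lo >> 31);
    lo <<= 1;

    for (int step = 0; step < 32; step++) {
        uint32_t bit = 0;
        if (hi >= divisor) {             /* subtraction stays non-negative: leave it */
            hi -= divisor;
            bit = 1;
        }
        hi = (hi << 1) | (lo >> 31);     /* single shift left of the combined register */
        lo = (lo << 1) | bit;            /* new quotient bit enters at R0 */
    }

    hi >>= 1;                            /* undo the one extra shift of the left half */
    divmod_v2_t result = { lo, (uint32_t)hi };
    return result;
}
```

For 7 / 2 the left half ends as 1 (remainder) and the right half as 3 (quotient), matching the last row of the table.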

Faster Division

• Can't use parallel hardware as in the multiplier
  – Subtraction is conditional on the sign of the remainder
• Faster dividers (e.g., SRT division) generate multiple quotient bits per step
  – Still require multiple steps

Signed division

• Keep the signs in mind for dividend and remainder
  – +7 / +2 = +3, remainder = +1
    • 7 = 3 x 2 + (+1) = 6 + 1
  – -7 / +2 = -3, remainder = -1
    • -7 = -3 x 2 + (-1) = -6 - 1
  – +7 / -2 = -3, remainder = +1
  – -7 / -2 = +3, remainder = -1
• One 64-bit register: Hi & Lo
  – Hi: remainder, Lo: quotient
• Divide by 0 / overflow: checked by software
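The sign rule above (divide the absolute values, negate the quotient when the operand signs differ, give the remainder the sign of the dividend) can be sketched in C. Assumptions, to keep the sketch short: the divisor is non-zero and neither operand is INT32_MIN, so every negation is representable. `signed_divide` and `sdivmod_t` are illustrative names.

```c
#include <stdint.h>

typedef struct {
    int32_t quotient;
    int32_t remainder;
} sdivmod_t;

/* Signed division via unsigned division of absolute values plus sign fix-up. */
sdivmod_t signed_divide(int32_t dividend, int32_t divisor) {
    uint32_t ua = (uint32_t)(dividend < 0 ? -dividend : dividend);
    uint32_t ub = (uint32_t)(divisor < 0 ? -divisor : divisor);
    uint32_t uq = ua / ub;
    uint32_t ur = ua % ub;

    sdivmod_t result;
    /* Quotient is negative when the operand signs differ. */
    result.quotient = ((dividend < 0) != (divisor < 0)) ? -(int32_t)uq : (int32_t)uq;
    /* Remainder takes the sign of the dividend. */
    result.remainder = (dividend < 0) ? -(int32_t)ur : (int32_t)ur;
    return result;
}
```

These are the same results that C's own `/` and `%` operators give for signed operands (C99 and later truncate toward zero), so `-7 / 2` is -3 and `-7 % 2` is -1.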

MIPS Division

• Use HI/LO registers for result
  – HI: 32-bit remainder
  – LO: 32-bit quotient
• Instructions
  – div rs, rt / divu rs, rt
  – No overflow or divide-by-0 checking
    • Software must perform checks if required
  – Use mfhi, mflo to access result
  – Pseudoinstruction: div $t2, $t1, $t0

Floating Point (a brief look)

• We need a way to represent
  – numbers with fractions, e.g., 3.1416
  – very small numbers, e.g., .000000001
  – very large numbers, e.g., $3.15576 \times 10^{9}$
• Representation:
  – sign, exponent, significand: $(-1)^{sign} \times significand \times 2^{exponent}$
  – more bits for significand gives more accuracy
  – more bits for exponent increases range
• IEEE 754 floating-point standard:
  – single precision: 8-bit exponent, 23-bit significand
  – double precision: 11-bit exponent, 52-bit significand

Floating Point

• Representation for non-integral numbers
  – Including very small and very large numbers
• Like scientific notation
  – $-2.34 \times 10^{56}$ (normalized)
  – $+0.002 \times 10^{-4}$ (not normalized)
  – $+987.02 \times 10^{9}$ (not normalized)
• In binary
  – $\pm 1.xxxxxxx_2 \times 2^{yyyy}$
• Types float and double in C

Floating Point Standard

• Defined by IEEE Std 754-1985
• Developed in response to divergence of representations
  – Portability issues for scientific code
• Now almost universally adopted
• Two representations
  – Single precision (32-bit)
  – Double precision (64-bit)

Floating point numbers

• Form
  – Arbitrary: $363.4 \times 10^{34}$
  – Normalized: $3.634 \times 10^{36}$
• Binary notation
  – Normalized: $1.xxx \times 2^{yy}$
• Standard format IEEE 754
  – Single precision: 8-bit exponent, 23-bit significand
    • $2 \times 10^{-38}$ ... $2 \times 10^{38}$
  – Double precision: 11-bit exponent, 52-bit significand
    • $2 \times 10^{-308}$ ... $2 \times 10^{308}$
• Both formats are supported by MIPS

IEEE Floating-Point Format

• S: sign bit (0 ⇒ non-negative, 1 ⇒ negative)
• Normalize significand: 1.0 ≤ |significand| < 2.0
  – Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
  – Significand is Fraction with the "1." restored
• Exponent: excess representation: actual exponent + Bias
  – Ensures exponent is unsigned
  – Single: Bias = 127; Double: Bias = 1023

    S | Exponent                         | Fraction
      | single: 8 bits, double: 11 bits  | single: 23 bits, double: 52 bits

$$x = (-1)^{S} \times (1 + Fraction) \times 2^{(Exponent - Bias)}$$

IEEE 754 floating-point standard

• Leading "1" bit of significand is implicit
  – Saves one bit
  – Special case: 00…000 represents 0, no leading 1 is added.
• Exponent is "biased" to make sorting easier
  – All 0s is the smallest exponent, all 1s is the largest
  – Bias of 127 for single precision and 1023 for double precision
• Summary:
  – $(-1)^{sign} \times (1 + fraction) \times 2^{exponent - bias}$

Example

• Show the binary representation of -0.75 in IEEE single precision format
  – Decimal representation: -0.75 = -3/4
  – Binary representation: $-0.11_2 = -1.1_2 \times 2^{-1}$
• Floating point
  – $(-1)^{sign} \times (1 + fraction) \times 2^{exponent - bias}$
• Sign bit = 1
  – Significand = 1 + .1000 ....
  – Exponent = (-1 + 127) = 126
  – 1 01111110 100 0000 0000 0000 0000 0000
• How to represent 5?
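The field layout can be inspected directly on a machine whose `float` type is IEEE 754 single precision, which is an assumption of this sketch (it holds on essentially all current platforms but is not guaranteed by the C standard). The names `float_fields` and `float_fields_t` are illustrative.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t fraction;  /* 23 bits, hidden leading 1 not stored */
} float_fields_t;

/* Split a single-precision value into its sign, biased exponent, and fraction. */
float_fields_t float_fields(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret the 32 bits without aliasing issues */
    float_fields_t f = { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
    return f;
}
```

For -0.75f this gives sign 1, exponent 126, and fraction 0x400000 (binary 100…0). For 5.0f, which is $1.01_2 \times 2^{2}$, it gives sign 0, exponent 129, and fraction 0x200000 (binary 010…0).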

Floating point Example

• What is the value of the following IEEE binary floating-point number?
  – 0 1000 0000 011 0000 0000 0000 0000 0000
• Which of the following floating-point numbers is larger?
  – A = 1 00111111 11110000000000000000000
  – B = 1 10001110 00010000000000000000101

Limitations

• Overflow:
  – The number is too big to be represented
• Underflow:
  – The number is too small to be represented
• Gradual underflow:
  – As the number gets smaller, the number of significant digits decreases
  – $1.234 \times 10^{E_{min}}$
  – $1.234 \times 10^{E_{min}} / 10 = 0.123 \times 10^{E_{min}}$
  – $0.123 \times 10^{E_{min}} / 10 = 0.012 \times 10^{E_{min}}$
  – $0.012 \times 10^{E_{min}} / 10 = 0.001 \times 10^{E_{min}}$
  – $0.001 \times 10^{E_{min}} / 10 = 0.000 \times 10^{E_{min}}$

Floating Point Complexities

• Operations are somewhat more complicated
• In addition to overflow we can have "underflow"
• Accuracy can be a big problem
  – IEEE 754 keeps two extra bits, guard and round
  – Four rounding modes
  – Positive divided by zero yields "infinity"
  – Zero divided by zero yields "not a number"
  – Other complexities
• Implementing the standard can be tricky
• Not using the standard can be even worse
  – See text for description of 80x86 and Pentium bug!

Singles, Doubles, Floating, Infinities, Denormals and NaNs

Single-Precision Range

• Exponents 00000000 and 11111111 reserved
• Smallest value
  – Exponent: 00000001 ⇒ actual exponent = 1 - 127 = -126
  – Fraction: 000…00 ⇒ significand = 1.0
  – $\pm 1.0 \times 2^{-126} \approx \pm 1.2 \times 10^{-38}$
• Largest value
  – Exponent: 11111110 ⇒ actual exponent = 254 - 127 = +127
  – Fraction: 111…11 ⇒ significand ≈ 2.0
  – $\pm 2.0 \times 2^{+127} \approx \pm 3.4 \times 10^{+38}$

Double-Precision Range

• Exponents 0000…00 and 1111…11 reserved
• Smallest value
  – Exponent: 00000000001 ⇒ actual exponent = 1 - 1023 = -1022
  – Fraction: 000…00 ⇒ significand = 1.0
  – $\pm 1.0 \times 2^{-1022} \approx \pm 2.2 \times 10^{-308}$
• Largest value
  – Exponent: 11111111110 ⇒ actual exponent = 2046 - 1023 = +1023
  – Fraction: 111…11 ⇒ significand ≈ 2.0
  – $\pm 2.0 \times 2^{+1023} \approx \pm 1.8 \times 10^{+308}$

Floating-Point Precision

• Relative precision
  – All fraction bits are significant
  – Single: approx $2^{-23}$
    • Equivalent to $23 \times \log_{10} 2 \approx 23 \times 0.3 \approx 6$ decimal digits of precision
  – Double: approx $2^{-52}$
    • Equivalent to $52 \times \log_{10} 2 \approx 52 \times 0.3 \approx 16$ decimal digits of precision

Floating-Point Example

• Represent -0.75
  – $-0.75 = (-1)^{1} \times 1.1_2 \times 2^{-1}$
  – S = 1
  – Fraction = $1000…00_2$
  – Exponent = -1 + Bias
    • Single: -1 + 127 = 126 = $01111110_2$
    • Double: -1 + 1023 = 1022 = $01111111110_2$
• Single: 1011111101000…00
• Double: 1011111111101000…00

Floating-Point Example

• What number is represented by the single-precision float 11000000101000…00?
  – S = 1
  – Fraction = $01000…00_2$
  – Exponent = $10000001_2$ = 129
• $x = (-1)^{1} \times (1 + .01_2) \times 2^{(129 - 127)} = (-1) \times 1.25 \times 2^{2} = -5.0$

Denormal Numbers

• Exponent = 000...0 ⇒ hidden bit is 0
• Smaller than normal numbers
  – Allow for gradual underflow, with diminishing precision
• Denormal with fraction = 000...0
  – Two representations of 0.0!

Infinities and NaNs

• Exponent = 111...1, Fraction = 000...0
  – ±Infinity
  – Can be used in subsequent calculations, avoiding need for overflow check
• Exponent = 111...1, Fraction ≠ 000...0
  – Not-a-Number (NaN)
  – Indicates illegal or undefined result
    • e.g., 0.0 / 0.0
  – Can be used in subsequent calculations

Floating point addition

• Alignment
• Addition of significands
• Normalization of the result
• Rounding
• Example in the decimal system, precision 4 digits
  – What is $9.999 \times 10^{1} + 1.610 \times 10^{-1}$?

Floating-Point Addition

• Now consider a 4-digit binary example
  – $1.000_2 \times 2^{-1} + (-1.110_2 \times 2^{-2})$ (0.5 + -0.4375)
• 1. Align binary points
  – Shift number with smaller exponent
  – $1.000_2 \times 2^{-1} + (-0.111_2 \times 2^{-1})$
• 2. Add significands
  – $1.000_2 \times 2^{-1} + (-0.111_2 \times 2^{-1}) = 0.001_2 \times 2^{-1}$
• 3. Normalize result & check for over/underflow
  – $1.000_2 \times 2^{-4}$, with no over/underflow
• 4. Round and renormalize if necessary
  – $1.000_2 \times 2^{-4}$ (no change) = 0.0625

Example

• Aligning the two numbers $9.999 \times 10^{1}$ and $1.610 \times 10^{-1}$
  – $9.999 \times 10^{1}$
  – $0.01610 \times 10^{1}$
  – Truncation: $0.016 \times 10^{1}$
• Addition

      9.999 × 10^1
    + 0.016 × 10^1
     10.015 × 10^1

• Normalization
  – $1.0015 \times 10^{2}$
• Rounding
  – $1.002 \times 10^{2}$

Example

• Add 0.5 and -0.4375 in binary
• 0.5 = 1/2 = $1.0_2 \times 2^{-1}$
• -0.4375 = -7/16 = $-1.11_2 \times 2^{-2}$
• Alignment: $-1.11_2 \times 2^{-2} = -0.111_2 \times 2^{-1}$
• Add: $1.0_2 \times 2^{-1} + (-0.111_2 \times 2^{-1}) = 0.001_2 \times 2^{-1}$
• Normalize: $1.000_2 \times 2^{-4}$
• Round: $1.000_2 \times 2^{-4}$

FP Adder Hardware

• Much more complex than integer adder
• Doing it in one clock cycle would take too long
  – Much longer than integer operations
  – Slower clock would penalize all instructions
• FP adder usually takes several cycles
  – Can be pipelined

FP Adder Hardware

[Figure: floating-point addition hardware, annotated with Step 1 through Step 4.]

Floating-Point Multiplication

• Consider a 4-digit decimal example
  – $1.110 \times 10^{10} \times 9.200 \times 10^{-5}$
• 1. Add exponents
  – For biased exponents, subtract bias from sum
  – New exponent = 10 + -5 = 5
• 2. Multiply significands
  – $1.110 \times 9.200 = 10.212 \Rightarrow 10.212 \times 10^{5}$
• 3. Normalize result & check for over/underflow
  – $1.0212 \times 10^{6}$
• 4. Round and renormalize if necessary
  – $1.021 \times 10^{6}$
• 5. Determine sign of result from signs of operands
  – $+1.021 \times 10^{6}$

Floating-Point Multiplication

• Now consider a 4-digit binary example
  – $1.000_2 \times 2^{-1} \times (-1.110_2 \times 2^{-2})$ (0.5 × -0.4375)
• 1. Add exponents
  – Unbiased: -1 + -2 = -3
  – Biased: (-1 + 127) + (-2 + 127) - 127 = -3 + 254 - 127 = -3 + 127
• 2. Multiply significands
  – $1.000_2 \times 1.110_2 = 1.110_2 \Rightarrow 1.110_2 \times 2^{-3}$
• 3. Normalize result & check for over/underflow
  – $1.110_2 \times 2^{-3}$ (no change) with no over/underflow
• 4. Round and renormalize if necessary
  – $1.110_2 \times 2^{-3}$ (no change)
• 5. Determine sign: +ve × -ve ⇒ -ve
  – $-1.110_2 \times 2^{-3} = -0.21875$

Floating point multiplication

• Composition of the number from different parts -> separate handling

$$(s_1 \cdot 2^{e_1}) \cdot (s_2 \cdot 2^{e_2}) = (s_1 \cdot s_2) \cdot 2^{e_1 + e_2}$$

• Add exponents
• Multiply the significands
• Normalize
• Over-/underflow
• Rounding
• Sign bit

Example

• 3 x 5
• $3 = 11_2 = 1.1_2 \times 2^{1}$, $5 = 101_2 = 1.01_2 \times 2^{2}$
• Exponent: 1 + 2 = 3
• Significand: $1.1_2 \times 1.01_2 = 1.111_2$
• Result: $1.111_2 \times 2^{3} = 1111_2 = 15$

Floating Point Multiplication

• Example
  – 1 10000010 00000.... = $-1 \times 2^{3}$
  – 0 10000011 00000.... = $1 \times 2^{4}$
• Both significands are 1 -> the significand of the product = 1
• Add the exponents, bias = 127

      10000010
    + 10000011
    - 01111111   (correction)
      --------
      10000110 = 134  =>  134 - 127 = 7

• Result:
  – 1 10000110 00000…
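The separate handling of sign, exponent, and significand can be sketched on raw single-precision bit patterns. This is a simplified model, not a full IEEE 754 multiplier: it assumes `float` is IEEE 754 single precision, that both inputs and the result are normalized (no zeros, denormals, infinities, NaNs, overflow, or underflow), and it truncates instead of rounding. `fp_multiply_sketch` is an illustrative name.

```c
#include <stdint.h>
#include <string.h>

/* Multiply two normalized floats field by field (truncating, no special cases). */
float fp_multiply_sketch(float a, float b) {
    uint32_t x, y;
    memcpy(&x, &a, sizeof x);
    memcpy(&y, &b, sizeof y);

    /* Sign bit: XOR of the operand signs. */
    uint32_t sign = (x ^ y) & 0x80000000u;

    /* Add biased exponents, then subtract the bias once (the "correction"). */
    int32_t exponent = (int32_t)((x >> 23) & 0xFFu) + (int32_t)((y >> 23) & 0xFFu) - 127;

    /* Restore the hidden 1 and multiply the 24-bit significands. */
    uint64_t sx = (x & 0x7FFFFFu) | 0x800000u;
    uint64_t sy = (y & 0x7FFFFFu) | 0x800000u;
    uint64_t product = sx * sy;          /* value in [1, 4) with 46 fraction bits */

    /* Normalize: if the product is >= 2, shift right and bump the exponent. */
    if (product & (1ull << 47)) {
        product >>= 1;
        exponent += 1;
    }

    /* Keep the top 23 fraction bits (truncation) and reassemble. */
    uint32_t fraction = (uint32_t)(product >> 23) & 0x7FFFFFu;
    uint32_t bits = sign | ((uint32_t)exponent << 23) | fraction;

    float result;
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

For the examples above, multiplying -8.0f by 16.0f produces the exponent field 134 and the value -128.0f, and 3.0f times 5.0f produces 15.0f. Products that need more than 24 significand bits would differ from hardware results because this sketch does not round.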

FP Arithmetic Hardware

• FP multiplier is of similar complexity to FP adder
  – But uses a multiplier for significands instead of an adder
• FP arithmetic hardware usually does
  – Addition, subtraction, multiplication, division, reciprocal, square root
  – FP ↔ integer conversion
• Operations usually take several cycles
  – Can be pipelined

Floating point multiply/divide

[Figure: floating-point multiply/divide datapath. Blocks: sign comparison and XOR; exponent add or subtract; mantissa multiply or divide; postnormalizing shifts; exponent adjust; reassembly. Annotations: "Full precision is kept here" and "Rounding is done only here".]

Floating-Point Instructions in MIPS

• Floating-point instructions:
  – addition: add.s, add.d
  – subtraction: sub.s, sub.d
  – multiplication: mul.s, mul.d
  – division: div.s, div.d
• Floating-point instructions use separate floating-point registers. MIPS has separate load and store instructions for floating-point registers.

FP Instructions in MIPS

• Single-precision arithmetic
  – add.s, sub.s, mul.s, div.s
    • e.g., add.s $f0, $f1, $f6
• Double-precision arithmetic
  – add.d, sub.d, mul.d, div.d
    • e.g., mul.d $f4, $f4, $f6
• Single- and double-precision comparison
  – c.xx.s, c.xx.d (xx is eq, lt, le, …)
  – Sets or clears FP condition-code bit
    • e.g., c.lt.s $f3, $f4
• Branch on FP condition code true or false
  – bc1t, bc1f
    • e.g., bc1t TargetLabel

FP Example: °F to °C

• C code:

    float f2c (float fahr) {
      return ((5.0/9.0)*(fahr - 32.0));
    }

  – fahr in $f12, result in $f0, literals in global memory space
• Compiled MIPS code:

    f2c: lwc1  $f16, const5($gp)
         lwc1  $f18, const9($gp)
         div.s $f16, $f16, $f18
         lwc1  $f18, const32($gp)
         sub.s $f18, $f12, $f18
         mul.s $f0, $f16, $f18
         jr    $ra

FP Example: Array Multiplication

• X = X + Y × Z
  – All 32 × 32 matrices, 64-bit double-precision elements
• C code:

    void mm (double x[][32], double y[][32], double z[][32]) {
      int i, j, k;
      for (i = 0; i != 32; i = i + 1)
        for (j = 0; j != 32; j = j + 1)
          for (k = 0; k != 32; k = k + 1)
            x[i][j] = x[i][j] + y[i][k] * z[k][j];
    }

  – Addresses of x, y, z in $a0, $a1, $a2, and i, j, k in $s0, $s1, $s2

FP Example: Array Multiplication

MIPS code:

        li    $t1, 32          # $t1 = 32 (row size/loop end)
        li    $s0, 0           # i = 0; initialize 1st for loop
    L1: li    $s1, 0           # j = 0; restart 2nd for loop
    L2: li    $s2, 0           # k = 0; restart 3rd for loop
        sll   $t2, $s0, 5      # $t2 = i * 32 (size of row of x)
        addu  $t2, $t2, $s1    # $t2 = i * size(row) + j
        sll   $t2, $t2, 3      # $t2 = byte offset of [i][j]
        addu  $t2, $a0, $t2    # $t2 = byte address of x[i][j]
        l.d   $f4, 0($t2)      # $f4 = 8 bytes of x[i][j]
    L3: sll   $t0, $s2, 5      # $t0 = k * 32 (size of row of z)
        addu  $t0, $t0, $s1    # $t0 = k * size(row) + j
        sll   $t0, $t0, 3      # $t0 = byte offset of [k][j]
        addu  $t0, $a2, $t0    # $t0 = byte address of z[k][j]
        l.d   $f16, 0($t0)     # $f16 = 8 bytes of z[k][j]
        sll   $t0, $s0, 5      # $t0 = i * 32 (size of row of y)
        addu  $t0, $t0, $s2    # $t0 = i * size(row) + k
        sll   $t0, $t0, 3      # $t0 = byte offset of [i][k]
        addu  $t0, $a1, $t0    # $t0 = byte address of y[i][k]
        l.d   $f18, 0($t0)     # $f18 = 8 bytes of y[i][k]
        mul.d $f16, $f18, $f16 # $f16 = y[i][k] * z[k][j]
        add.d $f4, $f4, $f16   # $f4 = x[i][j] + y[i][k] * z[k][j]
        addiu $s2, $s2, 1      # k = k + 1
        bne   $s2, $t1, L3     # if (k != 32) go to L3
        s.d   $f4, 0($t2)      # x[i][j] = $f4
        addiu $s1, $s1, 1      # j = j + 1
        bne   $s1, $t1, L2     # if (j != 32) go to L2
        addiu $s0, $s0, 1      # i = i + 1
        bne   $s0, $t1, L1     # if (i != 32) go to L1

Accurate Arithmetic

• IEEE Std 754 specifies additional rounding control
  – Extra bits of precision (guard, round, sticky)
  – Choice of rounding modes
  – Allows programmer to fine-tune numerical behavior of a computation
• Not all FP units implement all options
  – Most programming languages and FP libraries just use defaults
• Trade-off between hardware complexity, performance, and market requirements

Interpretation of Data

• Bits have no inherent meaning
  – Interpretation depends on the instructions applied
• Computer representations of numbers
  – Finite range and precision
  – Need to account for this in programs

Associativity

• Parallel programs may interleave operations in unexpected orders
  – Assumptions of associativity may fail
• Need to validate parallel programs under varying degrees of parallelism
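Why associativity can fail is easy to see with three single-precision values of very different magnitude. This is a sketch with illustrative function names; it assumes `float` is IEEE 754 single precision, and the `volatile` temporaries are only there to force each partial sum to be rounded to `float`.

```c
/* (x + y) + z: the two large values cancel first, so z survives. */
float sum_left(float x, float y, float z) {
    volatile float partial = x + y;
    return partial + z;
}

/* x + (y + z): z is lost when added to the large y, before the cancellation. */
float sum_right(float x, float y, float z) {
    volatile float partial = y + z;
    return x + partial;
}
```

With x = -1.5e38f, y = 1.5e38f, and z = 1.0f, `sum_left` returns 1.0f while `sum_right` returns 0.0f, so a parallel reduction that changes the grouping can change the answer.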

x86 FP Architecture

• Originally based on 8087 FP coprocessor
  – 8 × 80-bit extended-precision registers
  – Used as a push-down stack
  – Registers indexed from TOS: ST(0), ST(1), …
• FP values are 32-bit or 64-bit in memory
  – Converted on load/store of memory operand
  – Integer operands can also be converted on load/store
• Very difficult to generate and optimize code
  – Result: poor FP performance

x86 FP Instructions

• Optional variations
  – I: integer operand
  – P: pop operand from stack
  – R: reverse operand order
  – But not all combinations allowed

    Data transfer      Arithmetic           Compare         Transcendental
    FILD mem/ST(i)     FIADDP mem/ST(i)     FICOMP          FPATAN
    FISTP mem/ST(i)    FISUBRP mem/ST(i)    FIUCOMP         F2XM1
    FLDPI              FIMULP mem/ST(i)     FSTSW AX/mem    FCOS
    FLD1               FIDIVRP mem/ST(i)                    FPTAN
    FLDZ               FSQRT                                FPREM
                       FABS                                 FSIN
                       FRNDINT                              FYL2X

Streaming SIMD Extension 2 (SSE2)

• Adds 8 × 128-bit registers
  – Extended to 16 registers in AMD64/EM64T
• Can be used for multiple FP operands
  – 2 × 64-bit double precision
  – 4 × 32-bit single precision
  – Instructions operate on them simultaneously
    • Single-Instruction Multiple-Data

Right Shift and Division

• Left shift by i places multiplies an integer by $2^{i}$
• Right shift divides by $2^{i}$?
  – Only for unsigned integers
• For signed integers
  – Arithmetic right shift: replicate the sign bit
  – e.g., -5 / 4
    • $11111011_2 >> 2 = 11111110_2 = -2$
    • Rounds toward -∞
  – c.f. $11111011_2 >>> 2 = 00111110_2 = +62$
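The difference between arithmetic shift, logical shift, and true division can be reproduced on 8-bit values. Because C leaves right shifts of negative signed values implementation-defined, this sketch builds the arithmetic shift by hand from unsigned operations; the function names are illustrative, and `places` is assumed to be between 0 and 7.

```c
#include <stdint.h>

/* Arithmetic right shift on 8 bits: vacated high bits are filled with the sign bit. */
int8_t arithmetic_shift_right8(int8_t value, unsigned places) {
    uint8_t bits = (uint8_t)value;
    uint8_t shifted = (uint8_t)(bits >> places);
    if (bits & 0x80u)                                /* negative: replicate the sign bit */
        shifted |= (uint8_t)(0xFFu << (8 - places));
    /* Map the bit pattern back to a signed value without relying on wraparound casts. */
    return (int8_t)(shifted >= 0x80u ? (int)shifted - 256 : (int)shifted);
}

/* Logical right shift on 8 bits: vacated high bits are filled with zeros. */
uint8_t logical_shift_right8(int8_t value, unsigned places) {
    return (uint8_t)((uint8_t)value >> places);
}
```

For -5 shifted by 2 places, the arithmetic shift gives -2 and the logical shift gives 62, whereas the C expression `-5 / 4` gives -1 because integer division truncates toward zero. A compiler therefore cannot replace signed division by a power of two with a bare arithmetic shift.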

Who Cares About FP Accuracy?

• Important for scientific code
  – But for everyday consumer use?
    • "My bank balance is out by 0.0002¢!"
• The Intel Pentium FDIV bug
  – The market expects accuracy
  – See Colwell, The Pentium Chronicles

Summary

• Computer arithmetic is constrained by limited precision
• Bit patterns have no inherent meaning, but standards do exist
  – two's complement
  – IEEE 754 floating point
• Computer instructions determine the "meaning" of the bit patterns
• Performance and accuracy are important, so there are many complexities in real machines (i.e., algorithms and implementation).

Concluding Remarks

• ISAs support arithmetic
  – Signed and unsigned integers
  – Floating-point approximation to reals
• Bounded range and precision
  – Operations can overflow and underflow
• MIPS ISA
  – Core instructions: 54 most frequently used
    • 100% of SPECINT, 97% of SPECFP
  – Other instructions: less frequent
