EC02-informatica-infocod


Informática

Ing. Aeronáutica

Information coding

Wednesday, 12 February 2014

- Basic concepts
- Numbering systems
- Encoding signed numbers
- Encoding real numbers
- Encoding texts
- Redundancy and compression


J. Vila & E. Hernández

Introduction

Basic concepts

All information processed by a digital computer needs to be encoded: transformed into a form of representation suitable for the computer.

- Numerical values: magnitudes used by computer applications related to geometry (lengths, angles, ...), physics (pressure, temperature, volumes, forces, ...), mathematics, statistics, finance, etc.

- Text information in different formats, such as books, reports, manuals, etc.

- Media information: graphics, images, videos, sounds, etc.

- Computer programs

Computers encode information using binary numbers rather than decimal numbers.

- Binary encoding is introduced shortly.


Information range: the interval between the lower and upper limits of a magnitude, i.e., the whole set of different values or codes that the information may take.

- Example: assume that a magnitude such as a length is encoded in decimal with 3 integer digits and 2 decimal digits. The information range is [000.00 ... 999.99].

Information accuracy: the resolution of the information, i.e., the smallest difference between two representable values.

- Example: in the above representation, the accuracy is 0.01 units.

Information volume: the amount of information, computed as the number of measurements (information instances) of a magnitude times the number of digits of each measurement.

- Example: in the above representation, 1000 length measurements have a volume of 5000 digits.

- Useful to compute the required capacity of a storage device.


Information compression: reduction of the information volume by

- Removing redundancy
  - Represent repeated items in a compact form. Example: 1000 consecutive white pixels in a graphic.

- Reducing accuracy
  - Use a representation with fewer digits.
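Removing redundancy can be illustrated with run-length encoding, one simple technique among many (the slides do not name a specific method). In this minimal sketch, `rle_encode` and `rle_decode` are hypothetical helper names:

```python
from itertools import groupby


def rle_encode(items):
    """Collapse each run of equal items into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(items)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


row = ["white"] * 1000 + ["black"] * 3
encoded = rle_encode(row)
print(encoded)  # [('white', 1000), ('black', 3)]
assert rle_decode(encoded) == row  # nothing is lost
```

The 1000 white pixels shrink to a single pair, and decoding restores the row exactly.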


Numbering systems

Positional numbering systems

Numbers are represented as a sequence of digits where each digit has a weight according to its position.

Numbering in base $b$ uses the set of digits $D = \{0, 1, 2, \ldots, b-1\}$.

Assume $X$ is represented as $X_n X_{n-1} \ldots X_2 X_1 X_0 . X_{-1} X_{-2} \ldots X_{-m}$ in base $b$. Then:

$$X = \sum_{i=-m}^{n} X_i \cdot b^i \qquad (1)$$

Example:

- In decimal, $b = 10$ and $D = \{0, 1, 2, \ldots, 9\}$

$$1{,}234.56_{10} = 1 \cdot 10^3 + 2 \cdot 10^2 + 3 \cdot 10^1 + 4 \cdot 10^0 + 5 \cdot 10^{-1} + 6 \cdot 10^{-2}$$
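Eq. (1) can be evaluated mechanically. The sketch below is only illustrative: `positional_value` is a hypothetical name, and exact fractions are used so that negative powers do not introduce rounding.

```python
from fractions import Fraction


def positional_value(int_digits, frac_digits, base):
    """Evaluate eq. (1): the sum of digit * base**position.

    int_digits  -- digits X_n ... X_0, most significant first
    frac_digits -- digits X_-1 ... X_-m
    """
    value = Fraction(0)
    n = len(int_digits) - 1
    for k, digit in enumerate(int_digits):
        value += digit * Fraction(base) ** (n - k)
    for k, digit in enumerate(frac_digits, start=1):
        value += digit * Fraction(base) ** (-k)
    return value


print(float(positional_value([1, 2, 3, 4], [5, 6], 10)))  # 1234.56
print(positional_value([1, 1, 0, 1, 0, 0], [], 2))         # 52
```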


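The binary system

In binary, $b = 2$ and $D = \{0, 1\}$.

Converting from binary to decimal: use eq. (1)

$$110100_2 = 1 \cdot 2^5 + 1 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 0 \cdot 2^0 = 52_{10}$$

Converting from decimal to binary: divide successively by 2 until the quotient is 0. The remainders, read from last to first, are the binary digits.

    26 / 2 = 13, remainder 0
    13 / 2 =  6, remainder 1
     6 / 2 =  3, remainder 0
     3 / 2 =  1, remainder 1
     1 / 2 =  0, remainder 1   (stop)

$26_{10} = 11010_2$

How to read it aloud: "2 into 26 goes 13 times and 0 is left over"; "2 squared, 2 cubed, 2 to the fourth, ..."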
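The successive-division procedure can be sketched as follows. `to_base` is a hypothetical helper name; it also accepts other bases up to 16.

```python
def to_base(number, base=2):
    """Convert a non-negative integer to a digit string by successive division."""
    if number == 0:
        return "0"
    digits = "0123456789ABCDEF"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, base)
        remainders.append(digits[remainder])
    # The first remainder is the least significant digit, so reverse the list.
    return "".join(reversed(remainders))


print(to_base(26))  # 11010
print(to_base(52))  # 110100
```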


A bit is a binary digit.

The range of a representation in base $b$ with $n$ digits is $[0 \ldots b^n - 1]$.

- In binary, the number of codes corresponds to Pr(2, n), the n-permutations of 2 elements with repetition: $2^n$.

- Example: $b = 2$, $n = 3$
  - $[0 \ldots 2^3 - 1]$ = [000, 001, 010, 011, 100, 101, 110, 111]

A byte is an 8-bit binary code.

- The range of this representation is $[0_{10} \ldots 255_{10}]$.

The information volume in the binary system is usually measured as the number of bytes. Multiples of the byte:

- KB (kilobyte) = $2^{10}$ bytes = 1,024 bytes
- MB (megabyte) = $2^{20}$ bytes = 1,024 KB
- GB (gigabyte) = $2^{30}$ bytes = 1,024 MB
- TB (terabyte) = $2^{40}$ bytes = 1,024 GB


Binary arithmetic

Algorithms for operations with two bits or more:

[Figure: worked examples of binary addition, subtraction and multiplication.]


The hexadecimal system

In hexadecimal, $b = 16$ and $D = \{0, 1, 2, \ldots, 9, A, B, C, D, E, F\}$.

Converting from hex to decimal:

$$\mathrm{0x7F9A} = 7 \cdot 16^3 + F \cdot 16^2 + 9 \cdot 16^1 + A \cdot 16^0 = 7 \cdot 16^3 + 15 \cdot 16^2 + 9 \cdot 16^1 + 10 \cdot 16^0 = 32{,}666_{10}$$

Converting from decimal to hex: successive divisions by 16.

Converting between binary and hex

- Group binary digits in fours, starting from the right. Four binary digits correspond to one hex digit:

    1000 1010 1100 0010 (binary) = 8 A C 2 (hex)
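The hex conversions above can be checked with Python built-ins. The grouping function `binary_to_hex` is a hypothetical helper that mirrors the "group in fours" rule.

```python
value = int("7F9A", 16)          # hex string -> integer
print(value)                     # 32666
print(format(value, "X"))        # 7F9A
print(format(0x8AC2, "016b"))    # 1000101011000010


def binary_to_hex(bits):
    """Group bits in fours from the right and map each group to a hex digit."""
    width = -(-len(bits) // 4) * 4          # round the length up to a multiple of 4
    bits = bits.zfill(width)                # pad with zeros on the left
    groups = [bits[i:i + 4] for i in range(0, width, 4)]
    return "".join(format(int(group, 2), "X") for group in groups)


print(binary_to_hex("1000101011000010"))  # 8AC2
```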


Encoding signed numbers

Sign-and-magnitude representation

Allocate the most significant bit to represent the sign.

- 0 for positive numbers, 1 for negative numbers.

The remaining bits indicate the magnitude (or absolute value).

Example:

- $00101010_2 = 42_{10}$, $10101010_2 = -42_{10}$

Representation range with $n$ bits: $[-(2^{n-1} - 1), \ldots, 2^{n-1} - 1]$.

- With 8 bits: $[-127, \ldots, +127]$

Disadvantages:

- Two different zeros: 00000000 (+0) and 10000000 (-0).
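A minimal sketch of sign-and-magnitude encoding and decoding, working on bit strings for readability. The names `sm_encode` and `sm_decode` are illustrative, not taken from the slides.

```python
def sm_encode(x, n=8):
    """Encode integer x as an n-bit sign-and-magnitude bit string."""
    limit = 2 ** (n - 1) - 1
    if abs(x) > limit:
        raise ValueError(f"{x} is outside [-{limit}, {limit}]")
    sign = "1" if x < 0 else "0"
    return sign + format(abs(x), f"0{n - 1}b")


def sm_decode(code):
    """Decode a sign-and-magnitude bit string into an integer."""
    magnitude = int(code[1:], 2)
    return -magnitude if code[0] == "1" else magnitude


print(sm_encode(42))    # 00101010
print(sm_encode(-42))   # 10101010
# Both zero codes decode to the same value, which is the drawback noted above.
print(sm_decode("00000000") == sm_decode("10000000"))  # True
```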


Two's-complement representation

The representation of a negative number $-x$ in $n$ bits is defined as the two's complement of $x$.

The two's complement can be calculated in decimal as $2^n - x$ modulo $2^n$:

- $-x \rightarrow C2(x, n) = (2^n - x) \bmod 2^n$ (with $|x| < 2^n$)

- Examples:
  - $C2(3, 4) = (2^4 - 3) \bmod 2^4 = 13_{10} = 1101_2 \rightarrow -3$
  - $C2(3, 8) = (2^8 - 3) \bmod 2^8 = 253_{10} = 1111\,1101_2$ (sign extension) $\rightarrow -3$
  - $C2(-3, 4) = (2^4 + 3) \bmod 2^4 = 3_{10} = 0011_2 \rightarrow 3$
  - $C2(0, 4) = (2^4 - 0) \bmod 2^4 = 0_{10} = 0000_2 \rightarrow 0$
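The formula translates directly into code. This sketch reuses the slide's name C2 as a Python function `c2` and reproduces the four examples.

```python
def c2(x, n):
    """Two's complement of x in n bits: (2**n - x) mod 2**n."""
    return (2 ** n - x) % 2 ** n


for x, n in [(3, 4), (3, 8), (-3, 4), (0, 4)]:
    code = c2(x, n)
    print(f"C2({x},{n}) = {code} = {code:0{n}b}")
# C2(3,4) = 13 = 1101
# C2(3,8) = 253 = 11111101
# C2(-3,4) = 3 = 0011
# C2(0,4) = 0 = 0000
```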


The two's complement of a binary n-bit representation is a new representation with range $[-2^{n-1}, \ldots, 2^{n-1} - 1]$ in which:

- Codes $[0, \ldots, 2^{n-1} - 1]$ → positive integers $[0, \ldots, 2^{n-1} - 1]$
- Codes $[2^{n-1}, \ldots, 2^n - 1]$ → negative integers $[-2^{n-1}, \ldots, -1]$

Example

- n=4 → range = [-8, 7]
- codes [0,7] → positive integers [0,7], codes [8,15] → negative integers [-8,-1]

    Binary code   Unsigned   Two's complement
    0000           0          0
    0001           1          1
    0010           2          2
    0011           3          3
    0100           4          4
    0101           5          5
    0110           6          6
    0111           7          7
    1000           8         -8
    1001           9         -7
    1010          10         -6
    1011          11         -5
    1100          12         -4
    1101          13         -3
    1110          14         -2
    1111          15         -1


Converting from decimal to two's complement (negating a value): write the value in binary using n bits, then

1. Invert the bits
2. Add one

- Example:
  - C2(3,4): 0011 → (invert) → 1100 → (add 1) → 1101
  - C2(-3,4): 1101 → (invert) → 0010 → (add 1) → 0011

Converting from two's complement to decimal:

- If it is a positive number (MSB=0), then apply the weighted-digits eq. (1):
  - Example: $0111_2 = 0 \cdot 2^3 + 1 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 7_{10}$

- If it is a negative number (MSB=1), then compute the two's complement and apply the weighted-digits eq. (1) to get the absolute value. Next, change the sign of the absolute value.
  - Example: 1111 → (invert) → 0000 → (add 1) → 0001
  - $0001_2 = 0 \cdot 2^3 + 0 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 1_{10} \rightarrow -1_{10}$
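The invert-and-add-one rule and the decoding procedure can be sketched on bit strings. `negate_bits` and `c2_to_int` are hypothetical names chosen for this illustration.

```python
def negate_bits(code):
    """Two's complement of a bit string: invert every bit, then add one."""
    n = len(code)
    inverted = "".join("1" if bit == "0" else "0" for bit in code)
    # The modulo discards a carry out of the most significant bit.
    return format((int(inverted, 2) + 1) % 2 ** n, f"0{n}b")


def c2_to_int(code):
    """Decode a two's-complement bit string into a signed integer."""
    if code[0] == "0":                  # MSB = 0: positive, apply eq. (1) directly
        return int(code, 2)
    return -int(negate_bits(code), 2)   # MSB = 1: negate, then change the sign


print(negate_bits("0011"))  # 1101
print(negate_bits("1101"))  # 0011
print(c2_to_int("0111"))    # 7
print(c2_to_int("1111"))    # -1
```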


Property 1: x - y = x + (-y) = x + C2(y,n)

- Example
  - $2 - 3 = 0010_2 - 0011_2 = 1111_2 = -1_{10}$
  - $2 - 3 = 2 + (-3) = 0010_2 + 1101_2 = 1111_2 = -1_{10}$

Property 2: only one zero → 0...0000

Property 3: sign extension.

- Positive numbers have MSB=0
- Negative numbers have MSB=1
- $C2(3,4) = 1101_2$ → $C2(3,8) = 1111\,1101_2$ (sign extension)
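Properties 1 and 3 can be checked numerically. This is a sketch with hypothetical helper names (`subtract`, `sign_extend`); `c2` follows the decimal formula for the two's complement.

```python
def c2(x, n):
    """Two's complement of x in n bits."""
    return (2 ** n - x) % 2 ** n


def subtract(x, y, n=4):
    """Compute x - y as x + C2(y, n), keeping only the lowest n bits."""
    return (x + c2(y, n)) % 2 ** n


def sign_extend(code, width):
    """Widen a two's-complement bit string by repeating its sign bit."""
    return code[0] * (width - len(code)) + code


print(format(subtract(2, 3), "04b"))   # 1111, i.e. -1
print(sign_extend("1101", 8))          # 11111101
print(sign_extend("0011", 8))          # 00000011
```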


Excess-K representation

Excess-K (also called biased representation) of an n-bit representation is a representation with range $[-K, \ldots, 2^n - 1 - K]$ that uses a pre-specified number K as a biasing value to displace the origin of the representation, so that the most negative number of the representation (-K) maps to the code 0000.

Example

- Excess-K, K=8, n=4 → range = [-8, 7]
- codes [0,7] → negative integers [-8,-1], codes [8,15] → non-negative integers [0,7]

    Binary code   Unsigned   Excess-8
    0000           0         -8
    0001           1         -7
    0010           2         -6
    0011           3         -5
    0100           4         -4
    0101           5         -3
    0110           6         -2
    0111           7         -1
    1000           8          0
    1001           9          1
    1010          10          2
    1011          11          3
    1100          12          4
    1101          13          5
    1110          14          6
    1111          15          7


Converting from decimal to excess-K:

- Add K to x in decimal and then convert it to binary
- Examples: assume n=4, K=8.
  - x = -3 → (add 8) → -3+8 = 5 → (binary) → 0101
  - x = 3 → (add 8) → 3+8 = 11 → (binary) → 1011

Converting from excess-K to decimal:

- Convert it to decimal and then subtract K
- Examples: assume n=4, K=8.
  - x = 0011 → (decimal) → 3 → (subtract 8) → 3-8 = -5
  - x = 1011 → (decimal) → 11 → (subtract 8) → 11-8 = 3
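Both conversions are one-liners in code. In this sketch, `excess_encode` and `excess_decode` are hypothetical names, with n=4 and K=8 as defaults to match the examples.

```python
def excess_encode(x, n=4, k=8):
    """Encode x in excess-K: add the bias K, then write the result in n bits."""
    if not -k <= x <= 2 ** n - 1 - k:
        raise ValueError(f"{x} is outside [{-k}, {2 ** n - 1 - k}]")
    return format(x + k, f"0{n}b")


def excess_decode(code, k=8):
    """Decode an excess-K bit string: read it as unsigned, then subtract K."""
    return int(code, 2) - k


print(excess_encode(-3))      # 0101
print(excess_encode(3))       # 1011
print(excess_decode("0011"))  # -5
print(excess_decode("1011"))  # 3
```

Because the bias is added to every value, a larger value always gets a larger unsigned code, so codes can be compared as plain unsigned numbers.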


Property 1: it is monotonically increasing, which makes comparisons easier to perform (>,


Summary of signed representations

    Binary code   Unsigned   Sign & magnitude   Two's compl.   Excess-8
    0000           0          0                  0             -8
    0001           1          1                  1             -7
    0010           2          2                  2             -6
    0011           3          3                  3             -5
    0100           4          4                  4             -4
    0101           5          5                  5             -3
    0110           6          6                  6             -2
    0111           7          7                  7             -1
    1000           8         -0                 -8              0
    1001           9         -1                 -7              1
    1010          10         -2                 -6              2
    1011          11         -3                 -5              3
    1100          12         -4                 -4              4
    1101          13         -5                 -3              5
    1110          14         -6                 -2              6
    1111          15         -7                 -1              7


Encoding real numbers

Floating-point representation

It consists of a fixed number of significant digits, called the mantissa, which are scaled using an exponent. The base for the scaling is usually 2 or 10:

$$\text{mantissa} \times \text{base}^{\text{exponent}}$$

- Examples of the same number using different exponents (scaling factors):
  - $1125.0 \times 10^0$, $112.5 \times 10^1$, $11.25 \times 10^2$, $1.125 \times 10^3$, $0.1125 \times 10^4$
- The point can float, i.e., be placed anywhere relative to the significant digits of the number.

Normalized representation: the one in which the point follows the most significant non-zero digit: $1.125 \times 10^3$.

Advantage: it supports a much wider range of values with the same number of digits.


IEEE Standard for Floating-Point Arithmetic (IEEE 754)

It describes several formats with different accuracies. A given format comprises:

- Finite numbers, described by three integers (s, c, q). The value of the number is $(-1)^s \times c \times b^q$
- Two infinities: $+\infty$ and $-\infty$.
- Two kinds of NaN (Not a Number)

Finite numbers:

- s: the sign (zero or one).
- c: the mantissa (also called significand or coefficient).
  - Uses sign-and-magnitude format. The sign of the mantissa is the sign bit.
  - Normalized format → the point follows the most significant non-zero digit. In base 2 this digit is always a 1, so it is implied and there is no need to store it.
- q: the exponent.
  - Uses excess-K representation with $K = 2^{n_e - 1} - 1$, where $n_e$ is the number of bits of the exponent.
  - K=15 for IEEE-16, K=127 for IEEE-32, and K=1023 for IEEE-64.
- b: the base, which may be 2 or 10.


    Name               Base   Total digits   Mantissa digits   Exponent digits   Min. number (normalized)   Max. number
    Half precision     2      16             10+1              5                 6.1035 × 10^-5             6.5504 × 10^4
    Single precision   2      32             23+1              8                 1.1754 × 10^-38            3.4028 × 10^38
    Double precision   2      64             52+1              11                2.2250 × 10^-308           1.7977 × 10^308

Field widths in bits (sign, exponent, mantissa):

- Half precision: 1, 5, 10
- Single precision: 1, 8, 23
- Double precision: 1, 11, 52


Converting from floating-point to decimal

- Example (single precision): $\mathrm{7F7F\,FFFF}_{16}$
- Sign: leading bit → 0 → positive number.
- Exponent: 8 bits after the sign → $1111\,1110_2 = 254_{10}$.
  - It is in excess-127 → 254 - 127 = 127
- Mantissa: 23 bits after the exponent plus the implied bit, which is always 1.
  - It is $1.111\ldots1_2$ (23 ones after the point), represented with sign-and-magnitude.
  - Use eq. (1) to get the value in decimal:
  - $1 \cdot 2^0 + 1 \cdot 2^{-1} + 1 \cdot 2^{-2} + \ldots + 1 \cdot 2^{-23} = 1.9999998807907_{10} \approx 2$

Result: $+2 \times 2^{127} \approx 3.4028 \times 10^{38}$
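The same decoding can be sketched by slicing the 32-bit pattern into its three fields and cross-checking the result against Python's standard `struct` module. The variable names are illustrative, and only the normalized case is handled.

```python
import struct

raw = bytes.fromhex("7F7FFFFF")
bits = int.from_bytes(raw, "big")

sign = bits >> 31                  # 1 bit
exponent = (bits >> 23) & 0xFF     # 8 bits, excess-127
fraction = bits & 0x7FFFFF         # 23 stored mantissa bits

mantissa = 1 + fraction / 2 ** 23  # restore the implied leading 1
value = (-1) ** sign * mantissa * 2.0 ** (exponent - 127)

print(sign, exponent - 127)        # 0 127
print(mantissa)                    # 1.9999998807907104
# struct decodes the same four bytes as a big-endian single-precision float.
assert value == struct.unpack(">f", raw)[0]
```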


Converting from decimal to floating-point (1)

- Example: -29.6875 to double precision
- Convert the absolute value of the number to binary. Convert the integral and fractional parts separately:
  - $29_{10} = 11101_2$
  - 0.6875 × 2 = 1.375 → 1
  - 0.3750 × 2 = 0.750 → 0
  - 0.75 × 2 = 1.5 → 1
  - 0.5 × 2 = 1.0 → 1
  - $0.6875_{10} = 0.1011_2$ → $29.6875_{10} = 11101.1011_2 = 11101.1011_2 \times 2^0$
- Normalize the number: $11101.1011_2 \times 2^0$ → $1.11011011_2 \times 2^4$
- Generate the mantissa. Omit the implied one. Fill with zeros on the right up to the 52 bits of the mantissa. Using hex notation:
  - $1101\,1011\,0000\,0000 \ldots 0000_2 = \mathrm{D\,B000\,0000\,0000}_{16}$


Converting from decimal to floating-point (2): example -29.6875 to double precision

- Generate the exponent, expressed in excess-1023 (for IEEE-64 the bias is 1023). Add the bias:
  - $4_{10} + 1023_{10} = 1027_{10} = 100\,0000\,0011_2 = 403_{16}$
- Set the sign bit: 1 → negative
- Place the sign, exponent, and mantissa into the fields of the IEEE format:

Result: $-29.6875_{10} = \mathrm{C03D\,B000\,0000\,0000}_{16}$ (IEEE-64)
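The manual conversion can be cross-checked in code. In this sketch, `fraction_to_binary` is a hypothetical helper for the repeated-multiplication step, and Python's `struct` module produces the reference bit pattern.

```python
import struct


def fraction_to_binary(frac, max_bits=52):
    """Convert a fraction in [0, 1) to binary digits by repeated doubling."""
    bits = []
    while frac and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)        # the integer part is the next binary digit
        bits.append(str(bit))
        frac -= bit
    return "".join(bits)


print(fraction_to_binary(0.6875))            # 1011

encoded = struct.pack(">d", -29.6875).hex().upper()
print(encoded)                               # C03DB00000000000

pattern = int(encoded, 16)
print(pattern >> 63)                         # 1 (sign bit)
print(hex((pattern >> 52) & 0x7FF))          # 0x403 (biased exponent, 1027)
print(hex(pattern & ((1 << 52) - 1)))        # 0xdb00000000000 (stored mantissa)
```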


Encoding texts

The ASCII code

The American Standard Code for Information Interchange is a 7-bit coding scheme that supports the English alphabet and control characters.

Example:

- Printing the string "An \t ASCII \r \n text \n" sends to the console the following codes (decimal):
- 65 110 32 9 32 65 83 67 73 73 32 13 32 10 32 116 101 120 116 32 10

Drawback: it lacks symbols from other languages.

Solutions:

- Extended 8-bit ASCII coding: ISO 8859-1 standard, known as ISO Latin 1
- Unicode ...
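The code sequence above can be reproduced as follows. The string literal is an assumption reconstructed from the listed codes (tab = 9, carriage return = 13, line feed = 10).

```python
# Assumed source string, reconstructed from the decimal codes on the slide.
text = "An \t ASCII \r \n text \n"

codes = list(text.encode("ascii"))
print(" ".join(str(code) for code in codes))
# 65 110 32 9 32 65 83 67 73 73 32 13 32 10 32 116 101 120 116 32 10

# Every ASCII code fits in 7 bits.
assert all(code < 128 for code in codes)
```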


The Unicode standard (ISO/IEC 10646)

An attempt to create a universal character set with support for most of the world's writing systems.

It is not only a character chart; it defines a complete encoding methodology. It deals with aspects like:

- Character properties (upper and lower case)
- Rules for composing characters with different types of accents
- Normalization rules for obtaining equivalent forms, etc.

It specifies a name and a unique numeric identifier, called the code point, for each character or symbol.

Originally this identifier was intended to be coded as a 16-bit integer, but over time this proved to be insufficient.


Unicode defines three encoding forms under the name UTF (Unicode Transformation Format):

- UTF-8: byte-oriented coding with variable-length symbols (1 to 4 bytes per Unicode character).
  - One byte: the characters listed in US-ASCII, a total of 128 characters.
  - Two bytes: a total of 1920 characters. Includes the Latin characters with diacritics used in Romance languages, and Greek, Cyrillic, Coptic, Armenian, Hebrew, Arabic, Syriac ...
  - Three bytes: the rest of the Unicode Basic Multilingual Plane (BMP), which includes the CJK characters: Chinese, Japanese and Korean.
  - Four bytes: the Supplementary Multilingual Plane (mathematical symbols, Linear B syllabary and ideograms, Old Persian, Phoenician ...) and the Supplementary Ideographic Plane (rarely used Han characters).
- UTF-16: it uses one 16-bit code for the Basic Multilingual Plane (BMP) and two 16-bit codes (surrogate pairs) for the additional, less frequent planes.
- UTF-32: fixed-length 32-bit encoding, the simplest of the three.

ASCII was the most common character encoding on the World Wide Web until December 2007, when it was surpassed by UTF-8.
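The size differences between the three encoding forms can be observed with Python's built-in codecs. The four sample characters are arbitrary picks, one for each UTF-8 length.

```python
# One sample per UTF-8 length: A, n-tilde, euro sign, an emoji outside the BMP.
samples = ["A", "\u00f1", "\u20ac", "\U0001F600"]

sizes = {}
for ch in samples:
    sizes[ch] = (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-be")),   # the "-be" variants add no byte-order mark
        len(ch.encode("utf-32-be")),
    )
    print(f"U+{ord(ch):04X}", sizes[ch])
# U+0041 (1, 2, 4)
# U+00F1 (2, 2, 4)
# U+20AC (3, 2, 4)
# U+1F600 (4, 4, 4)
```

The last character lies outside the BMP, so UTF-16 needs a surrogate pair (4 bytes) for it.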


Formatted text

A markup language is a way to encode a document which, in addition to the text, includes labels or markings that specify the structure of the text.

- Examples: HTML, nroff, troff, LaTeX, RTF

RTF (Rich Text Format), used for text editing:

    {\rtf1\ansi\ansicpg1252\cocoartf1138 {\fonttbl\f0\froman\fcharset0 TimesNewRomanPSMT;}

    {\colortbl;\red255\green255\blue255;}

    \pard This is a {\b boldface} example.

    }

Presentational markup: used by traditional text editors. The marking is performed by the text editor and hidden from human users, producing the WYSIWYG (What You See Is What You Get) effect.

Procedural markup: used by LaTeX and some HTML editors. In these systems the user explicitly writes the formatting labels in the source file.


Redundancy and compression

Redundant encoding

Information may get corrupted when it is transmitted through communication lines or stored on disks or other storage devices.

Redundancy is used to detect errors, or to detect and correct them.

- Error detection: parity bit, checksums
- Error detection and correction: ECC (error-correcting codes). These require higher levels of redundancy.


A parity bit is a redundant bit added to a set of bits to ensure that the number of bits with value 1 in the outcome is even or odd. In the examples below the parity bit is the leftmost one, added to the 7-bit data 100 0011.

- Even parity: 1100 0011
- Odd parity: 0100 0011

Parity bits are often used when transmitting ASCII characters from/to peripherals.
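A minimal sketch of generating and checking a parity bit, assuming (as in the examples above) that the parity bit is placed in front of the 7 data bits. `add_parity` and `check_parity` are hypothetical names.

```python
def add_parity(data_bits, even=True):
    """Prefix a parity bit so the total number of 1s is even (or odd)."""
    ones = data_bits.count("1")
    parity = ones % 2 if even else (ones + 1) % 2
    return str(parity) + data_bits


def check_parity(code, even=True):
    """Return True if the received code has the expected parity."""
    return code.count("1") % 2 == (0 if even else 1)


print(add_parity("1000011", even=True))    # 11000011
print(add_parity("1000011", even=False))   # 01000011
print(check_parity("11000011"))            # True
print(check_parity("11000001"))            # False: one bit was flipped
```

A single flipped bit changes the parity and is detected. Two flipped bits cancel out and go unnoticed, which is why parity only detects errors and cannot correct them.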


Information compression

Data compression: the process of transforming encoded information so that it uses fewer bits than the original representation.

- Goal: to reduce the information volume and the consumption of expensive resources, such as hard disk space or transmission bandwidth.

It has a cost: extra processing for compressing and decompressing.

- Trade-off between the costs of encoding and decoding: time-consuming compression → time-efficient decompression, and vice versa.

Two types of compression:

- Lossless compression: the encoded data is not distorted or modified, so it can be reconstructed exactly from the compressed data.
  - Example: text compression. ZIP format
- Lossy compression: the original data is only approximately represented. It only allows an approximation of the original data to be reconstructed.
  - Example: image/audio/video compression. JPEG, MPEG, MP3 formats
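The difference between the two types can be sketched with the standard `zlib` module for the lossless case and simple rounding as a stand-in for the lossy case. Real lossy codecs are far more elaborate; the rounding only illustrates reduced accuracy.

```python
import zlib

# Lossless: the round trip restores the data exactly.
original = ("white " * 1000).encode("ascii")
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), "->", len(compressed), "bytes")
assert restored == original

# Lossy (illustrative only): keeping one decimal digit discards detail
# that cannot be recovered afterwards.
samples = [0.1234, 0.5678, 0.9012]
approximated = [round(sample, 1) for sample in samples]
print(approximated)  # [0.1, 0.6, 0.9]
```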


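Lossless compression

Lossless algorithms usually exploit statistical redundancy in such a way that more frequent data are represented with fewer bits.

Huffman coding

- Example: a text with only four characters: one with frequency 45%, plus A, B and C with frequencies 35%, 15% and 5% respectively.

[Figure: Huffman tree. The two least frequent characters are merged first (0.15 + 0.05 = 0.20), then 0.35 + 0.20 = 0.55, and finally 0.45 + 0.55 = 1.00, which gives code lengths of 1, 2, 3 and 3 bits.]

- Compression ratio, compared with a fixed-length code of 2 bits per character:

$$r = 1 - \frac{0.45 \cdot 1 + 0.35 \cdot 2 + 0.15 \cdot 3 + 0.05 \cdot 3}{2} = 12.5\%$$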
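The code lengths and the ratio can be reproduced with a short Huffman sketch. The label "_" stands in for the most frequent character, whose name is not legible in the source, and `huffman_code_lengths` is a hypothetical helper name.

```python
import heapq


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a frequency table."""
    # Heap entries are (weight, tie_breaker, {symbol: depth}); the unique
    # tie_breaker keeps the dictionaries from ever being compared.
    heap = [(weight, i, {symbol: 0}) for i, (symbol, weight) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)   # the two least frequent subtrees
        w2, _, d2 = heapq.heappop(heap)
        merged = {symbol: depth + 1 for symbol, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


freqs = {"_": 45, "A": 35, "B": 15, "C": 5}   # percentages; "_" is a placeholder label
lengths = huffman_code_lengths(freqs)
average_bits = sum(freqs[s] * lengths[s] for s in freqs) / 100
ratio = 1 - average_bits / 2                   # versus 2 bits per character

print(lengths["_"], lengths["A"], lengths["B"], lengths["C"])  # 1 2 3 3
print(average_bits, ratio)                                      # 1.75 0.125
```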


Lossy compression

It compresses data by discarding (losing) some of it.

Usually based on perceptual coding: transforming the raw data obtained from a device to a domain that more accurately reflects the information content.

- Example: a sound file can be more efficiently represented as the frequency spectrum over time than as the amplitude levels.

Lossy encoding/decoding programs are usually known as codecs.

Key point: required accuracy or Quality of Service (QoS).

- Example: image qualities for video conferencing: 640x480, 800x600, 1920x1080, ...