lecture1_sourcecode



    3F1 - Signals and Systems

    Michaelmas 2005

    Information Theory: Handout 1

    Andrea Lecchini Visintini

    30 November 2005


    Source coding

    Source characters: {x1, x2, . . . , xM}

    e.g. the English alphabet {A, B, C, . . . }

    A message is a sequence of source characters

    e.g. HELLO

    Code characters: {1, . . . , D}

    A code is a rule which assigns a sequence of code characters to each element of the source alphabet.

    Each of these sequences is called a code word.

    Here we will consider binary {0, 1} codes.

    e.g. H = 1011
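    For concreteness, encoding with such a rule is just table lookup and concatenation. A Python sketch (the codebook is a made-up example, not a standard code):

        # A binary code as a mapping from source characters to code words.
        codebook = {"H": "1011", "E": "0", "L": "10", "O": "11"}

        def encode(message):
            # Concatenate the code word of each source character.
            return "".join(codebook[ch] for ch in message)

        print(encode("HELLO"))   # "1011" + "0" + "10" + "10" + "11"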


    Design of a code

    (1) Unique decipherability

    Every finite sequence of code characters must correspond to at most one message,

    e.g. if the code is

    x1 → 0
    x2 → 010
    x3 → 01
    x4 → 10

    then the sequence 010 is ambiguous: x2, x3 x1 and x1 x4 all encode to 010.
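    A brute-force check (a sketch of mine, not from the handout) makes the ambiguity concrete by enumerating every way of parsing a sequence into code words:

        # Enumerate all parses of a bit string under the code above.
        code = {"x1": "0", "x2": "010", "x3": "01", "x4": "10"}

        def parses(s):
            if s == "":
                return [[]]            # one parse of the empty string
            results = []
            for name, word in code.items():
                if s.startswith(word):
                    for rest in parses(s[len(word):]):
                        results.append([name] + rest)
            return results

        print(parses("010"))
        # [['x1', 'x4'], ['x2'], ['x3', 'x1']] -- three messages for one string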

    Here we will consider a class of uniquely decipherable

    codes denoted instantaneous codes.

    (2) Efficiency

    The shortest code words must be assigned to the source symbols which are transmitted most frequently,

    e.g. in Morse code E = · while Q = − − · −

    The efficiency of a given code for a given source depends on the frequencies of the symbols {x1, x2, . . . , xM} produced by the source and on the lengths of the code words.


    Instantaneous decipherability

    Instantaneous code: a code which satisfies the prefix condition.

    Prefix condition: no code word is a prefix of another code word,

    e.g. if a code word is 01 then there cannot be 01011

    e.g. an instantaneous code

    x1 → 0
    x2 → 100
    x3 → 101
    x4 → 11

    Decoding Algorithm

    Given a finite sequence of code characters:

    1. Proceed from the left until a code word is found.
    2. Output the corresponding source character and repeat until the end of the message.

    e.g. 101110100101 → 101 | 11 | 0 | 100 | 101 → x3 x4 x1 x2 x3
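    The decoding loop is a one-pass scan. A Python sketch of it (my own rendering of the algorithm above, using the instantaneous code just given):

        # Decode a bit string with an instantaneous (prefix) code by
        # scanning from the left until a complete code word is seen.
        code = {"0": "x1", "100": "x2", "101": "x3", "11": "x4"}

        def decode(bits):
            out, word = [], ""
            for b in bits:
                word += b
                if word in code:       # code word complete: emit and restart
                    out.append(code[word])
                    word = ""
            if word:
                raise ValueError("trailing bits do not form a code word")
            return out

        print(decode("101110100101"))  # ['x3', 'x4', 'x1', 'x2', 'x3']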

    Every instantaneous code is uniquely decipherable but the converse is not true.

    There is a relationship between instantaneous decipherability and the lengths of the code words. This relationship is important because the efficiency of a code will be evaluated on the basis of its code word lengths.


    Theorem (Kraft inequality)

    Let {x1, . . . , xM} be the source characters. Then a binary instantaneous code with word lengths {n1, n2, . . . , nM} exists if and only if

    $$\sum_{i=1}^{M} 2^{-n_i} \le 1$$

    Some examples:

    x1 → 0      n1 = 1
    x2 → 1      n2 = 1

    $$\sum_{i=1}^{2} 2^{-n_i} = \frac{1}{2} + \frac{1}{2} = 1$$

    x1 → 0      n1 = 1
    x2 → 100    n2 = 3
    x3 → 101    n3 = 3
    x4 → 11     n4 = 2

    $$\sum_{i=1}^{4} 2^{-n_i} = \frac{1}{2} + \frac{1}{8} + \frac{1}{8} + \frac{1}{4} = 1$$

    x1 → 00     n1 = 2
    x2 → 100    n2 = 3
    x3 → 1100   n3 = 4

    $$\sum_{i=1}^{3} 2^{-n_i} = \frac{1}{4} + \frac{1}{8} + \frac{1}{16} = \frac{7}{16}$$
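    The Kraft sums above are quick to verify numerically. A Python sketch (my own check, not part of the handout), using exact fractions to avoid rounding:

        from fractions import Fraction

        def kraft_sum(lengths):
            # Sum of 2^(-n_i) over the code word lengths, kept exact.
            return sum(Fraction(1, 2**n) for n in lengths)

        # The three examples above: {1,1}, {1,3,3,2} and {2,3,4}.
        for lengths in ([1, 1], [1, 3, 3, 2], [2, 3, 4]):
            s = kraft_sum(lengths)
            print(lengths, s, "<= 1:", s <= 1)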


    Proof

    First we prove that the inequality is a necessary condition, i.e. any instantaneous code must satisfy it.

    Let n be the maximum of {n1, n2, . . . , nM}. Consider the table of all the $2^n$ possible words of length n,

    e.g. for n = 4:

    0 0 0 0
    0 0 0 1
    0 0 1 0
    0 0 1 1
    0 1 0 0
    0 1 0 1
    0 1 1 0
    0 1 1 1
    1 0 0 0
    1 0 0 1
    1 0 1 0
    1 0 1 1
    1 1 0 0
    1 1 0 1
    1 1 1 0
    1 1 1 1

    Each of the code words, with lengths {n1, n2, . . . , nM}, is a prefix of some lines of the table.

    However, since no code word can be a prefix of another code word, each line in the table can correspond to at most one code word. In fact, if two code words were both a prefix of the same line then one of them would be a prefix of the other.

    A code word of length $n_i$ occupies $2^{n - n_i}$ lines of the table,


    e.g. 01 occupies

    0 1 0 0
    0 1 0 1
    0 1 1 0
    0 1 1 1

    Since the table has $2^n$ lines, the total number of lines occupied by the code words can be at most $2^n$. Hence

    $$\sum_{i=1}^{M} 2^{n - n_i} \le 2^n \quad \Longrightarrow \quad \sum_{i=1}^{M} 2^{-n_i} \le 1$$

    Therefore any instantaneous code must satisfy the inequality.

    In order to prove sufficiency we need to show that if lengths {n1, n2, . . . , nM} are given which satisfy the inequality, then a set of code words having these lengths and satisfying the prefix condition can be constructed.

    The code words can easily be generated using the same table. Let n again be the maximum length. Construct the table of all the possible words of that length in ascending binary order. Take the code word lengths in ascending order, assign $2^{n - n_i}$ consecutive lines of the table to each word length, and record their common prefix (of length $n_i$) as the code word.
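    The construction is short enough to write out. A Python sketch (an illustration under my reading of the procedure, not code from the handout):

        def prefix_code(lengths):
            # Build code words satisfying the prefix condition from lengths
            # satisfying the Kraft inequality, via the table construction:
            # walk the 2^n table in ascending binary order, giving each
            # length (taken in ascending order) a block of 2^(n-ni) lines.
            n = max(lengths)
            line = 0                      # next unassigned line of the table
            words = []
            for ni in sorted(lengths):
                # The common prefix of the block is the first ni bits of
                # `line` written as an n-bit binary word.
                words.append(format(line, f"0{n}b")[:ni])
                line += 2 ** (n - ni)
            return words

        print(prefix_code([1, 3, 3, 2]))  # ['0', '10', '110', '111']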


    Efficiency of a code

    To measure the efficiency of a code for a given source we need to know the frequencies of the symbols {x1, . . . , xM} produced by the source. This can be done a priori by introducing a probabilistic model of the source.

    Definition (Information source)

    An information source X is a sequence of random variables X0, X1, X2, . . . such that:

    1. each Xt takes on values in {x1, x2, . . . , xM}; and
    2. the sequence is stationary, i.e.

    $$P\{X_{t_1} = x_{i_1}, \ldots, X_{t_k} = x_{i_k}\} = P\{X_{t_1+h} = x_{i_1}, \ldots, X_{t_k+h} = x_{i_k}\}$$

    for all nonnegative integers $t_1, \ldots, t_k$, $i_1, \ldots, i_k$, $k$ and $h$.

    Given a source X, we denote by Xt the generic random variable in the sequence. The simplest model of an information source is the memoryless model, in which the sequence consists of independent random variables.
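    For instance, a memoryless source over a four-character alphabet can be simulated in a few lines. A Python sketch (the alphabet and probabilities are made-up values for illustration):

        import random

        # Memoryless source: each character is drawn independently from
        # the same distribution over the source alphabet.
        alphabet = ["x1", "x2", "x3", "x4"]
        probs = [0.5, 0.25, 0.125, 0.125]      # made-up probabilities

        message = random.choices(alphabet, weights=probs, k=10)
        print(message)                          # e.g. ['x1', 'x2', 'x1', ...]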

    Given a source X such that Xt takes on values {x1, x2, . . . , xM} with probabilities {p1, p2, . . . , pM}, the efficiency of a code is then measured by the average code word length, which is given by:

    $$L = \sum_{i=1}^{M} p_i n_i$$

    where {n1, n2, . . . , nM} are the code word lengths assigned to {x1, x2, . . . , xM}.
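    As a numeric sketch (the probabilities are the made-up ones from the sampling example above, paired with the lengths of the instantaneous code from earlier):

        probs = [0.5, 0.25, 0.125, 0.125]   # made-up source probabilities
        lengths = [1, 3, 3, 2]              # lengths of the code {0, 100, 101, 11}

        L = sum(p * n for p, n in zip(probs, lengths))
        print(L)   # 0.5*1 + 0.25*3 + 0.125*3 + 0.125*2 = 1.875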

    Our objective now is to design codes which:

    (1) are instantaneously decipherable; and

    (2) have small average code word length.


    Definition of Entropy (Shannon)

    Given a random variable X, which takes on values {x1, x2, . . . , xM} with probabilities {p1, . . . , pM}, the quantity

    $$H(X) = -\sum_{i=1}^{M} p_i \log_2 p_i$$

    is the Entropy of X.
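    The definition translates directly into code. A Python sketch, evaluated on the same made-up distribution as above:

        from math import log2

        def entropy(probs):
            # H(X) = -sum_i p_i log2(p_i); terms with p_i = 0 contribute 0.
            return -sum(p * log2(p) for p in probs if p > 0)

        print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits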

    Theorem (lower bound on efficiency)

    Let X be an information source such that Xt takes on values {x1, x2, . . . , xM} with probabilities {p1, p2, . . . , pM}. Then the average code word length of any binary instantaneous code which encodes X satisfies:

    $$L \ge H(X_t)$$

    Proof

    $$H(X_t) - L = \sum_{i=1}^{M} p_i \left( \log_2 \frac{1}{p_i} - n_i \right) = \sum_{i=1}^{M} p_i \log_2 \frac{2^{-n_i}}{p_i}$$

    $$= \sum_{i=1}^{M} p_i \, \log_2(e) \ln \frac{2^{-n_i}}{p_i} \qquad \text{since } \log_a(x) = \log_a(b) \log_b(x)$$

    $$\le \sum_{i=1}^{M} p_i \, \log_2(e) \left( \frac{2^{-n_i}}{p_i} - 1 \right) \qquad \text{since } \ln(x) \le x - 1$$

    $$= \log_2(e) \left( \sum_{i=1}^{M} 2^{-n_i} - \sum_{i=1}^{M} p_i \right)$$

    Since $\sum_{i=1}^{M} 2^{-n_i} \le 1$ (for instantaneous decipherability) and $\sum_{i=1}^{M} p_i = 1$, the last expression on the right hand side is less than or equal to zero, hence $L \ge H(X_t)$.


    In general it is not possible to construct a code which attains the lower bound on the average code word length, but it is always possible to construct a code whose average code word length is less than the lower bound plus one bit.

    Theorem

    Let X be an information source. Then there exists a binary instantaneous code such that

    $$L < H(X_t) + 1$$

    Proof

    Choose code word lengths $n_i = \lceil -\log_2 p_i \rceil$. This means that

    $$-\log_2 p_i \le n_i < -\log_2 p_i + 1 \, .$$

    An instantaneous code with such code word lengths exists because the left hand side of the inequality implies that

    $$\sum_{i=1}^{M} 2^{-n_i} \le \sum_{i=1}^{M} 2^{\log_2 p_i} = \sum_{i=1}^{M} p_i = 1$$

    As for the average code word length, the right hand side of the inequality implies

    $$L = \sum_{i=1}^{M} p_i n_i < \sum_{i=1}^{M} p_i \left( -\log_2 p_i + 1 \right) = H(X_t) + 1 \, .$$
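    Both bounds are easy to check numerically for a given distribution. A Python sketch (my illustration, reusing the made-up distribution from above):

        from math import ceil, log2

        probs = [0.5, 0.25, 0.125, 0.125]     # made-up source probabilities

        lengths = [ceil(-log2(p)) for p in probs]
        H = -sum(p * log2(p) for p in probs)
        L = sum(p * n for p, n in zip(probs, lengths))

        print(lengths)                             # [1, 2, 3, 3]
        print(sum(2.0**-n for n in lengths) <= 1)  # Kraft holds: True
        print(H <= L < H + 1)                      # lower and upper bounds: True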