3F1 - Signals and Systems
Engineering Part IIA
Michaelmas 2005

Information Theory, Handout 1

Andrea Lecchini Visintini
30 November 2005
Source coding
Source characters: {x1, x2, ..., xM}
e.g. the English alphabet {A, B, C, ...}

A message is a sequence of source characters, e.g. HELLO.

Code characters: {1, ..., D}

A code is a rule which assigns a sequence of code characters to each element of the source alphabet. Each of these sequences is called a code word.

Here we will consider binary {0, 1} codes, e.g. H = 1011.
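As an illustration, encoding a message is simply concatenation of code words. A minimal Python sketch (the code words chosen here are illustrative, not a code from the handout):

```python
# A binary code: each source character maps to a code word (a string of 0s and 1s).
# These code words are illustrative choices, not a code defined in the handout.
code = {"H": "1011", "E": "0", "L": "110", "O": "100"}

def encode(message, code):
    """Encode a message by concatenating the code word of each source character."""
    return "".join(code[ch] for ch in message)

print(encode("HELLO", code))  # -> 10110110110100
```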
Design of a code
(1) Unique decipherability
Every finite sequence of code characters must correspond to at most one message.

e.g. if the code is

x1  0
x2  010
x3  01
x4  10

then the sequence 010 could be decoded as x2, as x3 x1, or as x1 x4: which message was sent?
Here we will consider a class of uniquely decipherable codes called instantaneous codes.
(2) Efficiency
The shortest code words should be assigned to the source symbols which are transmitted most frequently.

e.g. in Morse code E = .  while  Q = --.-

The efficiency of a given code for a given source depends on the frequencies of the symbols {x1, x2, ..., xM} produced by the source and on the lengths of the code words.
Instantaneous decipherability
Instantaneous code: a code which satisfies the prefix condition.

Prefix condition: no code word is a prefix of another code word.

e.g. if 01 is a code word then 01011 cannot be a code word.
e.g. an instantaneous code
x1 0
x2 100
x3 101
x4 11
Decoding Algorithm

Given a finite sequence of code characters:

1 Proceed from the left until a code word is found
2 Repeat until the end of the message

e.g. 101110100101 = 101 | 11 | 0 | 100 | 101 -> x3 x4 x1 x2 x3
Every instantaneous code is uniquely decipherable, but the converse is not true.

There is a relationship between instantaneous decipherability and the lengths of the code words. This relationship is important because the efficiency of a code will be evaluated on the basis of its code word lengths.
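The decoding algorithm above can be sketched in Python for the example code (a minimal sketch; the function name and error handling are our own choices):

```python
# Decode a bit string with an instantaneous (prefix-free) code by scanning
# from the left: as soon as the accumulated bits match a code word, emit the
# corresponding source character and start again. This works precisely
# because no code word is a prefix of another.
code = {"x1": "0", "x2": "100", "x3": "101", "x4": "11"}  # example from the handout
inverse = {w: s for s, w in code.items()}

def decode(bits, inverse):
    message, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:          # a complete code word has been read
            message.append(inverse[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a code word")
    return message

print(decode("101110100101", inverse))  # -> ['x3', 'x4', 'x1', 'x2', 'x3']
```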
Theorem (Kraft inequality)
Let {x1, ..., xM} be the source characters. Then a binary instantaneous code with word lengths {n1, n2, ..., nM} exists if and only if

\sum_{i=1}^{M} 2^{-n_i} \le 1
Some examples:

x1  0     n1 = 1
x2  1     n2 = 1

\sum_{i=1}^{2} 2^{-n_i} = 1/2 + 1/2 = 1

x1  0     n1 = 1
x2  100   n2 = 3
x3  101   n3 = 3
x4  11    n4 = 2

\sum_{i=1}^{4} 2^{-n_i} = 1/2 + 1/8 + 1/8 + 1/4 = 1

x1  00    n1 = 2
x2  100   n2 = 3
x3  1100  n3 = 4

\sum_{i=1}^{3} 2^{-n_i} = 1/4 + 1/8 + 1/16 = 7/16
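The Kraft sums in these examples can be checked with a short Python sketch (exact fractions are used to avoid rounding):

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Sum of 2^(-n_i) over the code word lengths; by the theorem, a binary
    instantaneous code with these lengths exists iff the sum is <= 1."""
    return sum(Fraction(1, 2**n) for n in lengths)

# The three examples above:
print(kraft_sum([1, 1]))        # -> 1
print(kraft_sum([1, 3, 3, 2]))  # -> 1
print(kraft_sum([2, 3, 4]))     # -> 7/16
```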
Proof
First we prove that the inequality is a necessary condition, i.e. that any instantaneous code must satisfy it.

Let n be the maximum of {n1, n2, ..., nM}. Consider the table of all the 2^n possible words of length n, e.g. for n = 4:

0 0 0 0
0 0 0 1
0 0 1 0
0 0 1 1
0 1 0 0
0 1 0 1
0 1 1 0
0 1 1 1
1 0 0 0
1 0 0 1
1 0 1 0
1 0 1 1
1 1 0 0
1 1 0 1
1 1 1 0
1 1 1 1
Each of the code words with lengths {n1, n2, ..., nM} is a prefix of some lines of the table.

However, since no code word can be a prefix of another code word, each line in the table must correspond to at most one code word. Indeed, if two code words were both a prefix of the same line, then one of them would be a prefix of the other.

A code word of length n_i occupies 2^{n - n_i} lines of the table.
e.g. 01 occupies the lines

0 1 0 0
0 1 0 1
0 1 1 0
0 1 1 1
Since the table has 2^n lines, the total number of lines occupied by the code words can be at most 2^n. Hence

\sum_{i=1}^{M} 2^{n - n_i} \le 2^n    i.e.    \sum_{i=1}^{M} 2^{-n_i} \le 1
Therefore any instantaneous code must satisfy the inequality.

To prove sufficiency we need to show that if lengths {n1, n2, ..., nM} are given which satisfy the inequality, then a set of code words having these lengths and satisfying the prefix condition can be constructed.

The code words can be generated using the same table. Let n again be the maximum length. Construct the table of all the possible words of that length in ascending binary order. Take the code word lengths in ascending order, assign 2^{n - n_i} consecutive lines to each word length, and record the common prefix (of length n_i) of those lines as the code word.
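This table construction can be sketched in Python (a minimal sketch; the function name and the bit-shift bookkeeping are our own choices):

```python
def construct_code(lengths):
    """Build prefix-free code words from lengths satisfying the Kraft
    inequality, following the table construction in the proof: walk the 2^n
    words of length n (the maximum length) in ascending binary order, give
    each length n_i (taken in ascending order) the next block of 2^(n - n_i)
    lines, and record the common prefix of length n_i as the code word."""
    n = max(lengths)
    assert sum(2**(n - ni) for ni in lengths) <= 2**n, "Kraft inequality violated"
    words, next_line = [], 0
    for ni in sorted(lengths):
        # common prefix (first n_i bits) of the block starting at next_line
        words.append(format(next_line >> (n - ni), f"0{ni}b"))
        next_line += 2**(n - ni)
    return words

print(construct_code([1, 3, 3, 2]))  # -> ['0', '10', '110', '111']
```

Note the lengths {1, 3, 3, 2} from the earlier example yield a valid instantaneous code, though not necessarily the same code words as the handout's example.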
Efficiency of a code
To measure the efficiency of a code for a given source we need to know the frequencies of the symbols {x1, ..., xM} produced by the source. This can be done a priori by introducing a probabilistic model of the source.
Definition (Information source)
An information source X is a sequence of random variables X0, X1, X2, ... such that:

1 each Xt takes on values {x1, x2, ..., xM}; and
2 the sequence is stationary, i.e.

P{X_{t_1} = x_{i_1}, ..., X_{t_k} = x_{i_k}} = P{X_{t_1+h} = x_{i_1}, ..., X_{t_k+h} = x_{i_k}}

for all nonnegative integers t_1, ..., t_k, i_1, ..., i_k, k and h.
Given a source X, we denote by Xt the generic random variable in the sequence. The simplest model of an information source is the memoryless model, which is a sequence of independent random variables.

Given a source X such that Xt takes on values {x1, x2, ..., xM} with probabilities {p1, p2, ..., pM}, the efficiency of a code is measured by the average code word length, which is given by:
L = \sum_{i=1}^{M} p_i n_i

where {n1, n2, ..., nM} are the code word lengths assigned to {x1, x2, ..., xM}.
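As a quick illustration, computing L for the instantaneous code given earlier, under assumed (illustrative) symbol probabilities:

```python
# Average code word length L = sum_i p_i * n_i for the instantaneous code
# 0, 100, 101, 11, under illustrative probabilities (not from the handout).
p = [0.5, 0.125, 0.125, 0.25]   # assumed probabilities for x1..x4
n = [1, 3, 3, 2]                # code word lengths

L = sum(pi * ni for pi, ni in zip(p, n))
print(L)  # -> 1.75
```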
Our objective now is to design codes which:
(1) are instantaneously decipherable; and
(2) have small average code word length.
Definition of Entropy (Shannon)
Given a random variable X, which takes on values {x1, x2, ..., xM} with probabilities {p1, ..., pM}, the quantity

H(X) = -\sum_{i=1}^{M} p_i \log_2 p_i

is the Entropy of X.
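A minimal Python sketch of the entropy computation (the function name is our own; terms with p_i = 0 are skipped, by the usual convention 0 log 0 = 0):

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum_i p_i log2 p_i, in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))                 # -> 1.0
print(entropy([0.5, 0.125, 0.125, 0.25]))  # -> 1.75
```

Note that for the dyadic probabilities {1/2, 1/8, 1/8, 1/4} the entropy equals the average code word length 1.75 of the code 0, 100, 101, 11: the lower bound below is attained.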
Theorem (lower bound on efficiency)
Let X be an information source such that Xt takes on values {x1, x2, ..., xM} with probabilities {p1, p2, ..., pM}. Then the average code word length of any binary instantaneous code which encodes X satisfies:

L \ge H(X_t)
Proof
H(X_t) - L = \sum_{i=1}^{M} p_i \left( \log_2 \frac{1}{p_i} - n_i \right)

= \sum_{i=1}^{M} p_i \log_2 \frac{2^{-n_i}}{p_i}

= \sum_{i=1}^{M} p_i \log_2(e) \, \ln \frac{2^{-n_i}}{p_i}        since \log_a(x) = \log_a(b) \log_b(x)

\le \sum_{i=1}^{M} p_i \log_2(e) \left( \frac{2^{-n_i}}{p_i} - 1 \right)        since \ln(x) \le x - 1

= \log_2(e) \left( \sum_{i=1}^{M} 2^{-n_i} - \sum_{i=1}^{M} p_i \right)

Since \sum_{i=1}^{M} 2^{-n_i} \le 1 (by the Kraft inequality, for instantaneous decipherability) and \sum_{i=1}^{M} p_i = 1, the last expression on the right-hand side is less than or equal to zero. Hence H(X_t) - L \le 0, i.e. L \ge H(X_t).
In general it is not possible to construct a code which attains the lower bound on the average code word length, but it is always possible to construct a code whose average code word length is less than the lower bound plus one bit.
Theorem
Let X be an information source. Then there exists a binary instantaneous code such that

L < H(X_t) + 1
Proof
Choose code word lengths n_i = \lceil -\log_2 p_i \rceil. This means that

-\log_2 p_i \le n_i < -\log_2 p_i + 1 .

An instantaneous code with such code word lengths exists because the left-hand side of the inequality implies that

\sum_{i=1}^{M} 2^{-n_i} \le \sum_{i=1}^{M} 2^{\log_2 p_i} = \sum_{i=1}^{M} p_i = 1

so the Kraft inequality is satisfied. As for the average code word length, the right-hand side of the inequality implies

L = \sum_{i=1}^{M} p_i n_i < \sum_{i=1}^{M} p_i \left( -\log_2 p_i + 1 \right) = H(X_t) + 1 .
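The construction in the proof can be checked numerically; a Python sketch with illustrative probabilities (not from the handout):

```python
import math

def shannon_lengths(p):
    """Code word lengths n_i = ceil(-log2 p_i), as in the proof above."""
    return [math.ceil(-math.log2(pi)) for pi in p]

p = [0.4, 0.3, 0.2, 0.1]  # illustrative probabilities
n = shannon_lengths(p)
H = -sum(pi * math.log2(pi) for pi in p)
L = sum(pi * ni for pi, ni in zip(p, n))

print(n)                                # the chosen lengths
print(sum(2**(-ni) for ni in n) <= 1)   # Kraft inequality holds -> True
print(H <= L < H + 1)                   # L within one bit of the entropy -> True
```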