
符号理論 ...coding theory


Page 1: 符号理論 ...coding theory

符号理論 ...coding theory

The official language of this course is English.
slides and talks are (basically) in English
I will accept questions and comments in Japanese also.

omnibus-style lecture ... a collection of several subjects
“take-home” test ... questions are given; solve them at home

Page 2: 符号理論 ...coding theory

The Name of the Class

Coding Theory?
a branch of Information Theory
properties/constructions of “codes”

source codes (for data compression)
channel codes (for error correction)
and various codes for various purposes

this class = an intermediate-level course on Information Theory, with some emphasis on the techniques of coding

Page 3: 符号理論 ...coding theory

relation to Information Theory

measuring of information: entropy, mutual information

source coding: Kraft’s inequality, Huffman code; source coding theorem, universal code

channel coding: linear code, Hamming code; analysis of codes; channel coding theorem, convolutional code; Turbo & LDPC codes

codes for data recording, network coding, and more...

Page 4: 符号理論 ...coding theory

class plan

seven classes, one test
Oct. 8 review: brief review of information theory
Oct. 15 compress: arithmetic code, universal codes
Oct. 22 analyze: analysis of codes, weight distribution
Oct. 29 struggle: cyclic code, convolutional code
Nov. 5 Shannon: channel coding theorem
Nov. 12 frontier: Turbo code, LDPC code
Nov. 19 unique: coding for various purposes

take-home test
Nov. 26 no class
slides ... http://isw3.naist.jp/~kaji/lecture/

Page 5: 符号理論 ...coding theory

Information Theory

Information Theory (情報理論)
was founded by C. E. Shannon in 1948
focuses on the mathematical theory of communication
has had an essential impact on today’s digital technology

wired/wireless communication/broadcasting
CD/DVD/HDD
data compression
cryptography, linguistics, bioinformatics, games, ...

Claude E. Shannon, 1916-2001

Page 6: 符号理論 ...coding theory

the model of communication

A communication system can be modeled as follows:

[figure: Shannon’s model of a communication system; some components are marked as “engineering artifacts”]

C. E. Shannon, A Mathematical Theory of Communication, The Bell System Technical Journal, 27, pp. 379–423, 623–656, 1948.

Other components are “given” and not controllable.

Page 7: 符号理論 ...coding theory

the first step

precise measurement is essential in engineering
vs.
information cannot be measured

To handle information by engineering means, we need to develop a quantitative measure of information.

Entropy makes it!

Page 8: 符号理論 ...coding theory

the model of information source

information source = a mechanism that produces symbols.
The symbol produced is determined probabilistically.
Use a random variable $X$ to represent the produced symbol.

$X$ takes one of the values in $D(X)$.
$P_X(x)$ denotes the probability that $X = x$.

example (a fair die): $D(X) = \{1, 2, 3, 4, 5, 6\}$, $P_X(1) = P_X(2) = \dots = P_X(6) = 1/6$

(We mainly focus on memoryless & stationary sources.)

Page 9: 符号理論 ...coding theory

entropy

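the entropy of $X$:

$$H(X) = \sum_{x \in D(X)} -P_X(x) \log_2 P_X(x) \quad \text{(bit)}$$

the expected value of $-\log_2 P_X(x)$ over all $x \in D(X)$
$-\log_2 P_X(x)$ is sometimes called the self-information of $x$.
$H(X)$ is sometimes called the expected information of $X$.

example (a fair die): $P_X(1) = P_X(2) = \dots = P_X(6) = 1/6$, so $H(X) = \log_2 6 \approx 2.585$ bit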

Page 10: 符号理論 ...coding theory

entropy and uncertainty ( 不確実さ)

example (a loaded die): $P_X(1) = 0.9$, $P_X(2) = \dots = P_X(6) = 0.02$

$$H(X) = -0.9 \log_2 0.9 - 0.02 \log_2 0.02 - \dots - 0.02 \log_2 0.02 \approx 0.701 \text{ bit}$$

loaded die ... easier to guess

The more difficult it is to guess the value of $X$ correctly, the larger the entropy is.

entropy = the size of uncertainty
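The two dice can be compared numerically. The following is a minimal sketch, not part of the original slides; the function name `entropy` and the variable names are chosen here for illustration only.

```python
import math

def entropy(probs):
    """Entropy in bits of a discrete distribution given as a list of probabilities."""
    # Terms with probability 0 contribute nothing (0 * log 0 is treated as 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_die = [1 / 6] * 6              # every face equally likely
loaded_die = [0.9] + [0.02] * 5     # face 1 dominates, so it is easier to guess

print(f"fair die:   {entropy(fair_die):.3f} bit")
print(f"loaded die: {entropy(loaded_die):.3f} bit")
```

The fair die gives $\log_2 6 \approx 2.585$ bit and the loaded die gives a much smaller value, in line with "entropy = the size of uncertainty".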

Page 11: 符号理論 ...coding theory

basic properties of entropy

$H(X) \ge 0$ ... 【nonnegative】

$H(X) = 0$ ... 【smallest value】 when $P_X(x) = 1$ for one particular value $x$ in $D(X)$

$H(X) = \log_2 |D(X)|$ ... 【largest value】 when $P_X(x) = 1/|D(X)|$ for all $x \in D(X)$

Page 12: 符号理論 ...coding theory

some more entropies

joint entropy

$$H(X, Y) = \sum_{x \in D(X)} \sum_{y \in D(Y)} -P_{X,Y}(x, y) \log_2 P_{X,Y}(x, y)$$

conditional entropy

$$H(X|Y) = \sum_{y \in D(Y)} P_Y(y) \sum_{x \in D(X)} -P_{X|Y}(x|y) \log_2 P_{X|Y}(x|y)$$

if $X$ and $Y$ are independent, then $H(X, Y) = H(X) + H(Y)$ and $H(X|Y) = H(X)$

Page 13: 符号理論 ...coding theory

mutual information

mutual information between $X$ and $Y$

$$I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X, Y)$$

[figure: Venn diagram; the regions $H(X)$ and $H(Y)$ overlap in $I(X;Y)$, the non-overlapping parts are $H(X|Y)$ and $H(Y|X)$, and the union is $H(X, Y)$]

if $X$ and $Y$ are independent: $I(X;Y) = 0$ (the two regions do not overlap)

Page 14: 符号理論 ...coding theory

example

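binary symmetric channel (BSC)
$X$ is transmitted, $Y$ is received

[channel diagram: input $X \in \{0, 1\}$, output $Y \in \{0, 1\}$; each symbol is received correctly with probability $1-p$ and flipped with probability $p$]

compute $I(X;Y)$, assuming $P_X(0) = q$ and $P_X(1) = 1 - q$

for simplicity, define a binary entropy function $\mathcal{H}(x) = -x \log_2 x - (1-x) \log_2 (1-x)$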

Page 15: 符号理論 ...coding theory

example solved

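compute $I(X;Y)$, assuming $P_X(0) = q$ and $P_X(1) = 1 - q$

$P_{X,Y}(0, 0) = (1-p)q$, $P_{X,Y}(0, 1) = pq$, $P_{X,Y}(1, 0) = p(1-q)$, $P_{X,Y}(1, 1) = (1-p)(1-q)$
$P_X(0) = q$, $P_X(1) = 1-q$
$P_Y(0) = p + q - 2pq$, $P_Y(1) = 1 - p - q + 2pq$

With the binary entropy function $\mathcal{H}$:

$H(X) = \mathcal{H}(q)$
$H(Y) = \mathcal{H}(p + q - 2pq)$

$$H(X, Y) = -(1-p)q \log_2 (1-p)q - pq \log_2 pq - p(1-q) \log_2 p(1-q) - (1-p)(1-q) \log_2 (1-p)(1-q) = \mathcal{H}(p) + \mathcal{H}(q)$$

$$I(X;Y) = H(X) + H(Y) - H(X, Y) = \mathcal{H}(p + q - 2pq) - \mathcal{H}(p)$$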
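The closed form for the BSC can be checked numerically against the definition $I(X;Y) = H(X) + H(Y) - H(X,Y)$. This is a minimal sketch, not part of the original slides. It assumes $P_X(0) = q$ and crossover probability $p$; by symmetry the result is the same if $q$ is attached to the other input symbol. The function names and the values $p = 0.1$, $q = 0.3$ are illustrative choices.

```python
import math

def h2(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def entropy(probs):
    """Entropy in bits of a discrete distribution."""
    return -sum(v * math.log2(v) for v in probs if v > 0)

def bsc_mutual_information(p, q):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a BSC, assuming P_X(0) = q, crossover p."""
    joint = {
        (0, 0): q * (1 - p), (0, 1): q * p,
        (1, 0): (1 - q) * p, (1, 1): (1 - q) * (1 - p),
    }
    p_x = [q, 1 - q]
    p_y = [joint[0, 0] + joint[1, 0], joint[0, 1] + joint[1, 1]]
    return entropy(p_x) + entropy(p_y) - entropy(joint.values())

p, q = 0.1, 0.3                                # illustrative values only
from_definition = bsc_mutual_information(p, q)
closed_form = h2(p + q - 2 * p * q) - h2(p)
print(from_definition, closed_form)            # the two values should agree
```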

Page 16: 符号理論 ...coding theory

good input and bad input

$p$ is a channel-specific constant
$q$ is a controllable parameter

$I(X;Y) = 0$ ... the channel works poorly for input with $q = 0$ or $q = 1$

$I(X;Y) = 1 - \mathcal{H}(p)$ ... the channel works well for input with $q = 0.5$

[plot: $I(X;Y)$ as a function of $q$ for the BSC; the curve is 0 at $q = 0$ and $q = 1$ and reaches its maximum $1 - \mathcal{H}(p)$ at $q = 0.5$]

Page 17: 符号理論 ...coding theory

channel capacity

channel capacity = maximum of $I(X;Y)$ over the input distribution, with
$X$ = input to the channel
$Y$ = output from the channel

the channel capacity of BSC = $1 - \mathcal{H}(p)$

the channel capacity of a binary erasure channel = $1 - \varepsilon$, where $\varepsilon$ is the probability of erasure

Channel capacities of some practical channels are also studied.
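For the BSC, the capacity can be approximated by sweeping the input parameter and keeping the largest mutual information. This is a minimal sketch, not part of the original slides. It uses the BSC formula $I(X;Y) = \mathcal{H}(p + q - 2pq) - \mathcal{H}(p)$; the grid search, the function names, and the value $p = 0.1$ are illustrative assumptions.

```python
import math

def h2(x):
    """Binary entropy function in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def bsc_information(p, q):
    """Mutual information of a BSC with crossover p and input parameter q."""
    return h2(p + q - 2 * p * q) - h2(p)

def bsc_capacity_numeric(p, steps=1000):
    """Approximate the capacity by a grid search over q in [0, 1]."""
    return max(bsc_information(p, i / steps) for i in range(steps + 1))

p = 0.1  # illustrative crossover probability
print(bsc_capacity_numeric(p), 1 - h2(p))  # both should be about the same
```

The grid search is only a sanity check; for the BSC the maximum is attained at $q = 0.5$, so the closed form is all that is needed in practice.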


Page 18: 符号理論 ...coding theory

source coding

Source coding gives a representation of information.
The representation should be as small (short) as possible.

121464253… → encoder → 00110101… (codewords)

Page 19: 符号理論 ...coding theory

problem formulation

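source symbol $x \in D(X)$
construct a code $C = \{c(x) \mid x \in D(X)\}$,
where $c(x)$ is a sequence (over $\{0, 1\}$) that is called a codeword for $x$

our goal is to construct $C$ so that
$C$ is immediately decodable, and
the average codeword length of $C$,

$$L = \sum_{x \in D(X)} P_X(x) \, |c(x)|,$$

is as small as possible.

[figure: the code $C$ maps source symbols to codewords]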

Page 20: 符号理論 ...coding theory

Huffman code

Code construction by iterative tree operations

1. prepare isolated nodes, each attached with a probability of a symbol (node = size-one tree)

2. repeat the following operation until all trees are joined to one
a. select two trees $T_1$ and $T_2$ having the smallest probabilities
b. join $T_1$ and $T_2$ by introducing a new parent node
c. the sum of probabilities of $T_1$ and $T_2$ is given to the new tree

David Huffman, 1925-1999

Page 21: 符号理論 ...coding theory

construction example

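symbol  prob.
A       0.2
B       0.1
C       0.3
D       0.3
E       0.1

[figure: code tree built step by step from these probabilities; the codewords are read off from the tree]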
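The construction can be sketched in code for the five-symbol example. This is a minimal sketch, not the lecturer's implementation. The heap-based bookkeeping, the function name `huffman_code`, and the choice of which branch gets "0" or "1" are assumptions made here. Huffman codes are not unique, so the slide's codewords may differ while the lengths agree.

```python
import heapq
import itertools
import math

def huffman_code(probs):
    """Return {symbol: codeword} by repeatedly joining the two least probable trees."""
    counter = itertools.count()  # tie-breaker so the heap never compares dicts
    # Each tree is stored as (probability, tie-breaker, {symbol: codeword so far}).
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)   # smallest probability
        p2, _, t2 = heapq.heappop(heap)   # second smallest
        # A new parent node: prefix one subtree with 0, the other with 1.
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

probs = {"A": 0.2, "B": 0.1, "C": 0.3, "D": 0.3, "E": 0.1}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
source_entropy = -sum(p * math.log2(p) for p in probs.values())

for sym in sorted(code):
    print(sym, code[sym])
print(f"average length = {avg_len:.3f} bit, entropy = {source_entropy:.3f} bit")
```

For these probabilities the codeword lengths are 2 for A, C, D and 3 for B, E, giving an average length of 2.2 bit, slightly above the entropy of about 2.171 bit.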

Page 22: 符号理論 ...coding theory

source coding theorem

Shannon’s source coding theorem:
There is no immediately decodable code whose average codeword length $L$ (per source symbol) satisfies $L < H(X)$.

proof by Kraft’s inequality and Shannon’s lemma

We can construct an immediately decodable code with $L < H(X) + \varepsilon$ for any small $\varepsilon > 0$.

construction of a block Huffman code

... two faces of source coding

Page 23: 符号理論 ...coding theory

“block” coding

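single symbols:

symbol  prob.
A       0.2
B       0.1
C       0.3
D       0.3
E       0.1

blocks of two symbols (probabilities are products, since the source is memoryless):

block  prob.
AA     0.04
AB     0.02
AC     0.06
AD     0.06
AE     0.02
BA     0.02
BB     0.01
BC     0.03
BD     0.03
BE     0.01
CA     0.06
:      :

codewords are assigned to blocks rather than to single symbols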
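The effect of block coding on the average length per source symbol can be explored numerically. This is a minimal sketch, not part of the original slides. It assumes a memoryless source (block probabilities are products) and uses the fact that the average codeword length of a Huffman code equals the sum of the probabilities of all merged (internal) nodes. All names are illustrative.

```python
import heapq
import itertools
import math

def huffman_average_length(probs):
    """Average codeword length of a Huffman code for the given probabilities.

    Every merge adds one bit to all symbols below the new node, so the
    average length is the sum of the probabilities of the merged nodes.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

single = {"A": 0.2, "B": 0.1, "C": 0.3, "D": 0.3, "E": 0.1}
source_entropy = -sum(p * math.log2(p) for p in single.values())

def block_probs(n):
    """Probabilities of all blocks of n symbols from a memoryless source."""
    return [math.prod(single[s] for s in block)
            for block in itertools.product(single, repeat=n)]

per_symbol = {n: huffman_average_length(block_probs(n)) / n for n in (1, 2, 3)}
for n, length in per_symbol.items():
    print(f"block size {n}: {length:.4f} bit per symbol (entropy {source_entropy:.4f})")
```

The number of blocks grows as $5^n$, which is the storage problem that block Huffman coding runs into for large $n$.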

Page 24: 符号理論 ...coding theory

problems of block Huffman code

The optimum code is obtained by
grouping several symbols into one, and
applying Huffman code construction

practical problems arise:
we need much storage
we need to know the probability distribution in advance

...solutions to these problems are discussed in this class.

Page 25: 符号理論 ...coding theory

channel coding

Errors are unavoidable in communication.

ABCABC → (channel) → ABCADC

Some errors are correctable by adding some redundancy.

ABC → Alpha, Bravo, Charlie

Channel coding gives a clever way to introduce the redundancy.

Page 26: 符号理論 ...coding theory

linear code

linear code: practical class of channel codes

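the encoding is made by using a generator matrix $G$:
codeword $c = uG$, where $u$ is the vector of information symbols

the decoding is made by using a parity check matrix $H$:
syndrome $s = H r^T$, where $r$ is the received vector
The syndrome indicates the position of errors.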

Page 27: 符号理論 ...coding theory

Hamming code

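To construct a one-bit error correcting code, let the column vectors of the parity check matrix $H$ be nonzero and all different.

Hamming code
determine a parameter $m$
enumerate all nonzero vectors with length $m$
use the vectors as columns of $H$

Richard Hamming, 1915-1998

example ($m = 3$):

$$H = \begin{pmatrix} 1&1&1&0&1&0&0 \\ 1&1&0&1&0&1&0 \\ 1&0&1&1&0&0&1 \end{pmatrix}, \qquad G = \begin{pmatrix} 1&0&0&0&1&1&1 \\ 0&1&0&0&1&1&0 \\ 0&0&1&0&1&0&1 \\ 0&0&0&1&0&1&1 \end{pmatrix}$$

(the right three columns of $G$ are the transpose of the left four columns of $H$)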
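Encoding and single-error correction with these two matrices can be sketched as follows. This is a minimal sketch, not part of the original slides. The function names and the example information word are illustrative, and all arithmetic is over GF(2).

```python
# Generator and parity check matrices of the (7,4) Hamming code from the slide.
G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

def encode(info):
    """Codeword = info * G over GF(2)."""
    return [sum(info[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

def syndrome(word):
    """Syndrome = H * word^T over GF(2)."""
    return [sum(H[r][j] * word[j] for j in range(7)) % 2 for r in range(3)]

def correct(word):
    """Correct at most one bit error: the syndrome equals the column of H at the error position."""
    s = syndrome(word)
    if any(s):
        columns = [[H[r][j] for r in range(3)] for j in range(7)]
        j = columns.index(s)  # columns are nonzero and all different
        word = word[:j] + [word[j] ^ 1] + word[j + 1:]
    return word

codeword = encode([1, 0, 1, 1])   # illustrative information word
received = codeword.copy()
received[2] ^= 1                  # flip one bit to simulate a channel error
print(codeword, syndrome(received), correct(received))
```

Because the columns of $H$ are all different, each single-bit error produces a distinct nonzero syndrome, which is why the lookup in `correct` is unambiguous.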

Page 28: 符号理論 ...coding theory

parameters of Hamming code

Hamming code
determine $m$
design $H$ to have $2^m - 1$ different column vectors

$H$ has $m$ rows and $2^m - 1$ columns
code length $n = 2^m - 1$
# of information symbols $k = 2^m - 1 - m$
# of parity symbols $m$

m    n     k
2    3     1
3    7     4
4    15    11
5    31    26
6    63    57
7    127   120

code rate = $k/n$
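The parameter table follows directly from the number of nonzero vectors of a given length. This is a minimal sketch, not part of the original slides; the symbol names $m$, $n$, $k$ and the function name are assumptions made for illustration.

```python
def hamming_parameters(m):
    """Return (code length, number of information symbols, code rate) for parameter m."""
    n = 2 ** m - 1   # number of nonzero binary vectors of length m
    k = n - m        # m of the n symbols are parity symbols
    return n, k, k / n

for m in range(2, 8):
    n, k, rate = hamming_parameters(m)
    print(f"m={m}: n={n}, k={k}, rate={rate:.3f}")
```

The code rate approaches 1 as $m$ grows, while the code still corrects only one error per codeword.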

Page 29: 符号理論 ...coding theory

code rate and performance

If code rate = $k/n$ is large...
more information in one codeword
fewer symbols for error correction
The error-correcting capability is weak in general.

[plot: error-correcting capability (weak to strong) versus code rate (small to large); the capability weakens as the code rate grows]

To have good error-correcting capability, we need to sacrifice the code rate...

Page 30: 符号理論 ...coding theory

channel coding theorem

Shannon’s channel coding theorem:
Let $C$ be the capacity of the communication channel.

Among channel codes with rate $< C$, there exists a code which can correct almost all errors.

There are no such codes in the class of codes with rate $> C$.

... two faces of channel coding

Page 31: 符号理論 ...coding theory

two coding theorems

source coding theorem:
constructive solution given by Huffman code
almost finished work

channel coding theorem:
no constructive solution
a number of studies have been made
still under investigation

remarkable classes of channel codes
proof of the theorem

Page 32: 符号理論 ...coding theory

summary

today’s talk ... a (not self-contained) summary of Information Theory

measuring of information
source coding
channel coding

Students are encouraged to review basics of Information Theory.