
Lecture 11: Tutorial

Werayut Saesue & Tuan Hue Thi

A/Prof. Jian Zhang

NICTA & CSE UNSW

Dr. Reji Mathew

EE&T UNSW

COMP9519 Multimedia Systems S2 2010

[email protected]

Part I: Video Coding Technology

Werayut Saesue

A/Prof. Jian Zhang

NICTA & CSE UNSW

Dr. Reji Mathew

EE&T UNSW

Video Coding Technology

Question 1

• An RGB image is converted to YUV 4:2:2 format. “The YUV 4:2:2 version of the image is of lower quality than the RGB version of the image”. Is this statement TRUE or FALSE? Give reasons for your answer.

COMP9519 Multimedia Systems – Lecture 7 – Slide 3 – J Zhang

Video Coding Technology

Answer to Question 1

• True. Converting to YUV 4:2:2 sub-samples the chrominance components, so some information is discarded. The loss in quality may not be visually obvious, however, because only the chrominance components are sub-sampled and the human visual system (HVS) is less sensitive to chrominance than to luminance.
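The loss described in the answer to Question 1 can be illustrated with a minimal sketch. It assumes that 4:2:2 sub-sampling keeps one chroma sample per horizontal pair (here the pair average) and that the decoder rebuilds the row by sample repetition; real systems may use other filters, and the sample values are hypothetical.

```python
def subsample_422(chroma_row):
    """Halve horizontal chroma resolution by averaging each pair of samples."""
    return [(chroma_row[i] + chroma_row[i + 1]) / 2
            for i in range(0, len(chroma_row) - 1, 2)]


def upsample_422(sub_row):
    """Rebuild a full-width row by repeating each retained sample twice."""
    return [value for value in sub_row for _ in range(2)]


# Hypothetical chroma (U) samples for four neighbouring pixels.
u_row = [100, 140, 90, 90]
restored = upsample_422(subsample_422(u_row))

# The first pair differed, so its detail cannot be recovered after sub-sampling.
print(u_row, "->", restored)
```

The luminance row is untouched in this scheme, which is why the degradation is usually hard to see.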

Video Coding Technology

Question 2

• Shown below are two 8x8 image blocks.

• Answer the following:

  • Calculate the entropy for each block.

Video Coding Technology

Answer to Question 2
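The two 8x8 blocks are not reproduced in this text, so the following is only a sketch of how the entropy of a block could be computed, not the tutorial's worked answer. It assumes first-order entropy in bits per pixel, H = Σ p(v) log2(1/p(v)), and uses two hypothetical blocks.

```python
from collections import Counter
from math import log2


def block_entropy(block):
    """First-order entropy (bits per pixel) of a 2D block of pixel values."""
    pixels = [p for row in block for p in row]
    n = len(pixels)
    return sum((count / n) * log2(n / count) for count in Counter(pixels).values())


# Hypothetical blocks: one flat, one with two equally likely values.
flat_block = [[128] * 8 for _ in range(8)]
two_level_block = [[0] * 4 + [255] * 4 for _ in range(8)]

print(block_entropy(flat_block))       # a single value carries no information
print(block_entropy(two_level_block))  # two equiprobable values need one bit each
```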

Video Coding Technology

Question 3

• Based on 2x2 macroblocks (MB), the X motion estimation algorithm (X-MEN) searches for the best matching motion vector in the following locations: {0,0}, {-2,-2}, {+2,-2}, {-2,+2}, {+2,+2}. These locations are relative to the current MB and correspond to the center, top-left, top-right, bottom-left and bottom-right areas. Assume a search range of {-2,+2}, with the current MB in the current frame (frame[N]) as:

and the reference frame (shown on the next slide),

• where the shaded MB is the position of the current MB of frame[N] in the reference frame (also called the co-located MB). Answer the following questions:

  • Calculate the sum of absolute differences at each search location of X-MEN.

  • What will be the best motion vector given by X-MEN? Justify your answer.

Video Coding Technology

Question 3

• The reference frame [N-1]

Video Coding Technology

Answer to Question 3
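The frame data for Question 3 is not reproduced in this text. The sketch below shows how the five X-MEN candidates could be evaluated, assuming motion vectors are written as (dx, dy) with negative dy pointing up, as the question's top-left / top-right labelling suggests. The reference frame, the current MB and the function names are hypothetical.

```python
def sad(cur_mb, ref, top, left):
    """Sum of absolute differences between cur_mb and the block of ref at (top, left)."""
    return sum(
        abs(cur_mb[r][c] - ref[top + r][left + c])
        for r in range(len(cur_mb))
        for c in range(len(cur_mb[0]))
    )


def x_men_search(cur_mb, ref, mb_top, mb_left):
    """Evaluate the five candidate vectors and return (best vector, all costs)."""
    candidates = [(0, 0), (-2, -2), (2, -2), (-2, 2), (2, 2)]
    costs = {
        (dx, dy): sad(cur_mb, ref, mb_top + dy, mb_left + dx)
        for dx, dy in candidates
    }
    # On a tie, min() keeps the earlier candidate, so (0, 0) is preferred.
    best = min(candidates, key=lambda mv: costs[mv])
    return best, costs


# Hypothetical 6x6 reference frame: flat background with one textured 2x2 patch
# in the top-right corner. The co-located MB sits at row 2, column 2.
ref_frame = [[1] * 6 for _ in range(6)]
ref_frame[0][4:6] = [9, 8]
ref_frame[1][4:6] = [7, 9]
current_mb = [[9, 8], [7, 9]]

best_mv, all_costs = x_men_search(current_mb, ref_frame, 2, 2)
print(best_mv, all_costs)
```

The vector with the smallest SAD is chosen because it points to the reference block that leaves the least residual energy to code.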

Video Coding Technology

Question 4

• The video encoding process blocks are shown below

• In which block(s) will information loss occur?

• Which block(s) contain the decoded version of the previous frame (Frame[N-1])?

• Which block(s) contain the motion compensated version of the current frame (Frame[N])?

Video Coding Technology

Answer to Question 4

Video Coding Technology

Question 5

• An 8x8 image block is given below. (i) Transform this block using the 2D-DCT, (ii) perform quantization using a step size of 8 for all transformed coefficients, (iii) perform zig-zag scanning of the quantized coefficients to obtain (run, level) pairs, (iv) perform inverse quantization, (v) perform 2D-IDCT, (vi) calculate the MSE of the final inverse quantized, inverse transformed block.

• Answer:

  • Steps discussed in class during the tutorial lecture
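The six steps of Question 5 can be sketched end to end as follows. This is not the worked answer from the tutorial: the input block is hypothetical, the DCT is the orthonormal DCT-II evaluated directly, quantization is assumed to round to the nearest integer (Python's round), and the zig-zag order follows the usual JPEG-style scan.

```python
from math import cos, pi, sqrt

N = 8


def _alpha(k):
    """Normalisation factor of the orthonormal DCT-II."""
    return sqrt(1 / N) if k == 0 else sqrt(2 / N)


def _basis(x, u):
    return cos((2 * x + 1) * u * pi / (2 * N))


def dct2(block):
    """Step (i): 2D DCT of an N x N block (direct evaluation, not a fast transform)."""
    return [[_alpha(u) * _alpha(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Step (v): inverse 2D DCT."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def zigzag_order():
    """(row, col) indices in zig-zag scan order, one anti-diagonal at a time."""
    return sorted(((r, c) for r in range(N) for c in range(N)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def run_level_pairs(quantized):
    """Step (iii): (run of zeros, non-zero level) pairs along the zig-zag scan."""
    pairs, run = [], 0
    for r, c in zigzag_order():
        if quantized[r][c] == 0:
            run += 1
        else:
            pairs.append((run, quantized[r][c]))
            run = 0
    return pairs  # trailing zeros are left to an end-of-block marker


def code_block(block, step=8):
    coeffs = dct2(block)
    quantized = [[round(c / step) for c in row] for row in coeffs]    # step (ii)
    pairs = run_level_pairs(quantized)
    dequantized = [[q * step for q in row] for row in quantized]      # step (iv)
    recon = idct2(dequantized)
    mse = sum((block[x][y] - recon[x][y]) ** 2
              for x in range(N) for y in range(N)) / N ** 2           # step (vi)
    return pairs, recon, mse


# Hypothetical flat block: only the DC coefficient survives quantization.
flat = [[16] * N for _ in range(N)]
pairs, recon, mse = code_block(flat)
print(pairs, mse)
```

A flat block is the easy case: all its energy sits in the DC coefficient, so the quantizer introduces essentially no error. A textured block would yield more (run, level) pairs and a non-zero MSE.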

Video Coding Technology

Question 6

• Assume you have a video sequence coded in the following pattern: IBBPBBPBBPBBIBBPBBPBBPBBI

• (i) If the second I frame is corrupted with error (i.e. the I frame in the middle of the above sequence), how many other frames can be degraded due to error propagation?

• You can assume, for example, that a portion of the data for the second I frame is missing (i.e. due to lost packets in a streaming application).

• Answer:

  • IBBPBBPBBPBB[I]BBPBBPBBPBBI

  • All the frames in BOLD will be affected by the corrupt I frame (shown in square brackets).

  • Explanation given during tutorial lecture.
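The propagation rule can be sketched in code. The sketch makes three assumptions: the pattern is given in display order, each P frame is predicted from the previous I or P frame, and each B frame is predicted from the nearest I or P frame on either side and is never used as a reference. The function name is hypothetical.

```python
def affected_frames(gop, corrupt_index):
    """Display-order indices of the other frames degraded by an error in gop[corrupt_index]."""
    anchors = [i for i, frame_type in enumerate(gop) if frame_type in "IP"]
    degraded = {corrupt_index}

    # P frames inherit errors from the previous anchor; an I frame stops the chain.
    for prev, cur in zip(anchors, anchors[1:]):
        if gop[cur] == "P" and prev in degraded:
            degraded.add(cur)

    # B frames inherit errors from either of their two surrounding anchors.
    for i, frame_type in enumerate(gop):
        if frame_type == "B":
            before = max((a for a in anchors if a < i), default=None)
            after = min((a for a in anchors if a > i), default=None)
            if before in degraded or after in degraded:
                degraded.add(i)

    return sorted(degraded - {corrupt_index})


pattern = "IBBPBBPBBPBBIBBPBBPBBPBBI"
hit_by_second_i = affected_frames(pattern, pattern.index("I", 1))
hit_by_first_b = affected_frames(pattern, pattern.index("B"))
print(len(hit_by_second_i), hit_by_second_i)
print(hit_by_first_b)
```

Under these assumptions the two B frames just before the corrupt I frame are affected as well as every frame after it up to, but not including, the next I frame. A corrupt B frame affects nothing else.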

Video Coding Technology

Question 7

• Similarly, what would be the effect of error propagation if the first B frame is corrupted with error?

• Answer:

  • I[B]BPBBPBBPBBIBBPBBPBBPBBI

  • No other frames are affected.

  • B frames are not used as a reference for predictive coding.

Video Coding Technology

Question 8

• How can such error propagation be limited for an MPEG-4 coded bit stream?

• Answer:

  • More I frames

  • Intra block refresh schemes

  • Other options?

  • Refer to discussion during the tutorial lecture

Video Coding Technology

Question 9

• How can scalable coding help with error resilience in a video streaming application?

• Assume you have spatial scalable coding with two layers (base layer and enhancement layer) and that video is being streamed live (i.e. IPTV).

• Answer:

  • A base layer provides a basic (low quality) service.

  • With the enhancement layer, full resolution can be achieved.

  • Unequal error protection schemes for the two layers, ensuring a basic service.

  • Other options?

  • Refer to discussions during the tutorial lecture

Part II: Media Streaming

Werayut Saesue

A/Prof. Jian Zhang

NICTA & CSE UNSW

Dr. Reji Mathew

EE&T UNSW

Media Streaming

Question 1

• Consider a video-on-demand system designed for streaming video and audio content over the internet. What problems would you encounter if you tried to stream the video and audio content directly over UDP/IP without using RTP?

Media Streaming

Answer to Question 1

• Cannot detect lost packets

• Unable to reorder received packets

• Unable to do audio-video stream synchronization

RTP provides:

• Payload identification, and

• Frame indication to support applications / decoders
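The fields that plain UDP lacks are carried in the 12-byte fixed RTP header. The sketch below decodes that header using the field layout of RFC 3550 and shows how sequence numbers expose loss and allow reordering. The function names and packet contents are hypothetical, and sequence-number wrap-around is ignored for brevity.

```python
import struct


def parse_rtp_header(packet):
    """Decode the 12-byte fixed RTP header at the start of a UDP payload."""
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "marker": b1 >> 7,            # frame indication in many video payload formats
        "payload_type": b1 & 0x7F,    # payload identification
        "sequence_number": seq,       # loss detection and reordering
        "timestamp": timestamp,       # playout timing and audio-video synchronization
        "ssrc": ssrc,
    }


def missing_sequence_numbers(received):
    """Sequence numbers absent between the lowest and highest one received."""
    seen = set(received)
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


# Hypothetical packet: version 2, marker set, dynamic payload type 96.
example = struct.pack("!BBHII", 0x80, 0x80 | 96, 7, 90000, 0x1234) + b"payload"
header = parse_rtp_header(example)

arrived = [5, 7, 6, 10]               # out of order, with 8 and 9 lost
print(header)
print(sorted(arrived), missing_sequence_numbers(arrived))
```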

Media Streaming

Question 2

• Consider streaming of a live event from a video source (i.e. sender) to a client (i.e. receiver) using RTP/RTCP. After some time of successful operation, the video source ignores all RTCP receiver reports (RR) from the client. What problems could arise if all RR are ignored by the video source?

Media Streaming

Answer to Question 2

• Not enough information to adapt to network conditions

RTCP Receiver Report (RR)

• Sent by receivers (not active senders)

• Provides a report of reception statistics
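As an illustration of what a sender loses by ignoring receiver reports, here is a sketch of a rate controller driven by the 8-bit "fraction lost" field of an RR (a fixed-point fraction out of 256, per RFC 3550). The thresholds, scaling factors and function name are hypothetical and are not taken from the lecture.

```python
def adapt_bitrate(current_kbps, fraction_lost_byte, min_kbps=100, max_kbps=4000):
    """Hypothetical sender policy: back off on heavy loss, probe upwards on light loss."""
    loss = fraction_lost_byte / 256          # convert the RR field to a fraction
    if loss > 0.10:
        new_rate = current_kbps * 0.75       # assumed back-off factor
    elif loss < 0.02:
        new_rate = current_kbps * 1.05       # assumed probing factor
    else:
        new_rate = current_kbps              # moderate loss: hold the rate
    return max(min_kbps, min(max_kbps, new_rate))


print(adapt_bitrate(1000, 64))   # 25% loss reported
print(adapt_bitrate(1000, 13))   # about 5% loss reported
print(adapt_bitrate(1000, 0))    # no loss reported
```

A sender that discards every RR never enters the first branch, so it keeps transmitting at a rate the network can no longer carry.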

Media Streaming

Question 3

• What functionality does RTSP provide?

• What transport protocol is specified (if any) for RTSP?

Media Streaming

Answer to Question 3

• Establishes and controls one or more continuous media streams

• Provides “Internet VCR controls”

• Transport-independent
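The "VCR controls" are text requests such as SETUP, PLAY, PAUSE and TEARDOWN. The sketch below only formats such requests in RTSP/1.0 syntax; it does not open a connection, and the URL, session identifier and function name are hypothetical.

```python
def rtsp_request(method, url, cseq, session=None):
    """Format a minimal RTSP/1.0 request as text (no transport is involved here)."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session is not None:
        lines.append(f"Session: {session}")
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical control sequence for one stream.
url = "rtsp://example.com/demo"
play = rtsp_request("PLAY", url, 3, session="12345678")
pause = rtsp_request("PAUSE", url, 4, session="12345678")
print(play)
print(pause)
```

Because the request is plain text, the same message can be carried over whichever transport the client and server agree on, which is what transport independence means in the answer above.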

Media Streaming

Question 4

• In a peer-to-peer video conferencing situation, what tools or techniques are available to achieve better resilience to packet loss?

Media Streaming

Answer to Question 4

• Receiver

  • Error concealment

  • Fill in missing blocks

Media Streaming

Answer to Question 4

• Sender

  • Adaptive rate control

  • Packetization strategy

  • Error resilience

    • increase intra-frames

    • intra-block refresh

    • introduce redundancies

Part III: Multimedia Information Retrieval

Tuan Hue Thi

A/Prof. Jian Zhang

NICTA & CSE UNSW

Multimedia Information Retrieval

Tutorial – Question 1

Consider an image with 2 distinct grey-levels:

1 0 1 0 1
1 0 1 0 1
1 1 1 1 1
1 0 1 0 1
1 0 1 0 1

Calculate the co-occurrence matrix if the position operator is defined as “one pixel to the left”.

Revision on Texture

• The Grey-level Co-occurrence Matrix (GLCM) can be used to calculate the second-order statistics.

• Given the following 2x2 pixel image with 2 distinct grey-levels:

1 0
0 1

and the position operator defined as “one pixel to the left”

Revision on co-occurrence matrix

• The 2x2 co-occurrence matrix can be calculated as follows:

                       Grey-level at current pel
                       0      1
Grey-level at left  0  N00    N01
                    1  N10    N11

• N00 = the number of pixels with grey-level 0 that have a grey-level of 0 one pixel to the left

• N01 = the number of pixels with grey-level 1 that have a grey-level of 0 one pixel to the left

• N10 = the number of pixels with grey-level 0 that have a grey-level of 1 one pixel to the left

• N11 = the number of pixels with grey-level 1 that have a grey-level of 1 one pixel to the left

Revision on co-occurrence matrix

1 0
0 1

• For the pixels in the first column, the position operator (one pixel to the left) cannot be defined.

• Current pixel in row 1, column 2: grey-level of 0 with a grey-level of 1 one pixel to the left.

Revision on co-occurrence matrix

1 0
0 1

• Current pixel in row 2, column 2: grey-level of 1 with a grey-level of 0 one pixel to the left.

Hence, the final co-occurrence matrix will be equal to

                       Grey-level at current pel
                       0      1
Grey-level at left  0  0      1
                    1  1      0

Multimedia Information Retrieval

Answer to Question 1 – Step 1

• Step 1 – The size of the co-occurrence matrix is equal to the number of distinct grey-levels x the number of distinct grey-levels (L x L).

• Since the image has 2 distinct grey-levels, the size of the co-occurrence matrix is 2 x 2.

                       Grey-level at current pel
                       0      1
Grey-level at left  0
                    1

Multimedia Information Retrieval

Answer to Question 1 – Step 2

• Step 2 – Count the number of pixels in the image that have the relationship with their neighbour specified by the position operator.

• The position operator in our question is defined as “one pixel to the left”.

1 0 1 0 1
1 0 1 0 1
1 1 1 1 1
1 0 1 0 1
1 0 1 0 1

N00 = the number of pixels with grey-level 0 that have a grey-level of 0 one pixel to the left = 0

N01 = the number of pixels with grey-level 1 that have a grey-level of 0 one pixel to the left = 8

N10 = the number of pixels with grey-level 0 that have a grey-level of 1 one pixel to the left = 8

N11 = the number of pixels with grey-level 1 that have a grey-level of 1 one pixel to the left = 4

Multimedia Information Retrieval

Answer to Question 1 – Step 3

• Step 3 – Fill in the answers calculated from Step 2.

• Hence, the final co-occurrence matrix will be:

                       Grey-level at current pel
                       0      1
Grey-level at left  0  0      8
                    1  8      4
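The counting in Steps 1 to 3 can be checked with a short sketch. It assumes the same layout as the tables above (rows index the grey-level one pixel to the left, columns index the grey-level at the current pel) and skips first-column pixels, for which the position operator is undefined. The function name is hypothetical.

```python
def cooccurrence_left(image, levels):
    """Co-occurrence matrix for the position operator "one pixel to the left"."""
    glcm = [[0] * levels for _ in range(levels)]
    for row in image:
        # Pair every pixel with its left neighbour; the first column has none.
        for left, current in zip(row, row[1:]):
            glcm[left][current] += 1
    return glcm


revision_image = [[1, 0],
                  [0, 1]]

question_image = [[1, 0, 1, 0, 1],
                  [1, 0, 1, 0, 1],
                  [1, 1, 1, 1, 1],
                  [1, 0, 1, 0, 1],
                  [1, 0, 1, 0, 1]]

print(cooccurrence_left(revision_image, 2))
print(cooccurrence_left(question_image, 2))
```

The entries of each matrix sum to rows x (columns - 1), here 2 and 20, which is a quick sanity check on a hand count.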

Multimedia Information Retrieval

Question 2

Consider two images A and B, and histogram similarity matrix C

Image A:

1 2 1 3 3
1 2 1 2 3
1 1 1 2 2
1 3 1 3 2
1 3 1 2 2

Image B:

1 2 1 2 1
2 2 1 2 1
3 1 1 1 1
3 3 3 3 3
2 2 2 3 3

Matrix C:

1    0.5  0
0.5  1    0.5
0    0.5  1

• For each image, calculate the histogram, cumulative histogram and CCV.

• For the two images, calculate the L1 histogram distance, L1 cumulative histogram distance, histogram intersection, normalized CCV distance and Niblack’s histogram similarity value.

• Assume that average filtering has already been applied to the image.

• Suppose that the threshold for the size of the connected component is 3.

Multimedia Information Retrieval

Revision - Histogram

• A histogram is a graphical display of tabulated frequencies.

1 2 1 3 3
1 2 1 2 3
1 1 1 2 2
1 3 1 3 2
1 3 1 2 2

Grey-Level   Frequency
1            11
2            8
3            6

[Bar chart: frequency (0 to 12) for grey-levels 1, 2 and 3]

Multimedia Information Retrieval

Answer to Question 2 - Histogram

Image A:

1 2 1 3 3
1 2 1 2 3
1 1 1 2 2
1 3 1 3 2
1 3 1 2 2

Image B:

1 2 1 2 1
2 2 1 2 1
3 1 1 1 1
3 3 3 3 3
2 2 2 3 3

Histogram of Image A

Grey-Level   Frequency
1            11
2            8
3            6

Histogram of Image B

Grey-Level   Frequency
1            9
2            8
3            8

Revision – Cumulative Histogram

• A cumulative histogram is a mapping that counts the cumulative number of observations in all of the bins up to the specified bin.

Histogram

Grey-Level   No. of Observations
1            11
2            8
3            6

Cumulative Histogram

Grey-Level   No. of Observations
1            11
2            11+8 = 19
3            19+6 = 25

Multimedia Information Retrieval

Answer to Question 2 : Cumulative Histogram

Image A

Grey-Level   Histogram   Cumulative Histogram
1            11          11
2            8           19
3            6           25

Image B

Grey-Level   Histogram   Cumulative Histogram
1            9           9
2            8           17
3            8           25
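The histogram and cumulative histogram tables can be reproduced with a short sketch, using images A and B from Question 2. The function names are hypothetical.

```python
from collections import Counter
from itertools import accumulate

IMAGE_A = [[1, 2, 1, 3, 3],
           [1, 2, 1, 2, 3],
           [1, 1, 1, 2, 2],
           [1, 3, 1, 3, 2],
           [1, 3, 1, 2, 2]]

IMAGE_B = [[1, 2, 1, 2, 1],
           [2, 2, 1, 2, 1],
           [3, 1, 1, 1, 1],
           [3, 3, 3, 3, 3],
           [2, 2, 2, 3, 3]]


def histogram(image, bins):
    """Number of pixels at each grey-level listed in bins."""
    counts = Counter(pixel for row in image for pixel in row)
    return [counts[b] for b in bins]


def cumulative(hist):
    """Running total of the histogram, bin by bin."""
    return list(accumulate(hist))


bins = [1, 2, 3]
hist_a, hist_b = histogram(IMAGE_A, bins), histogram(IMAGE_B, bins)
print(hist_a, cumulative(hist_a))
print(hist_b, cumulative(hist_b))
```

The last cumulative bin always equals the number of pixels (25 here), which is a quick check on a hand count.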

Multimedia Information Retrieval

Revision: Similarity Measurement-1

• Example 2 – Niblack’s similarity measurement

X – the query histogram; Y – the histogram of an image in the database

The similarity between X and Y:

$d_A(H, I) = (H - I)^T A (H - I)$

where A is a symmetric color similarity matrix with a(i,j) for the similarity between the j-th and i-th color bins in the color histogram.

• The similarity matrix A accounts for the perceptual similarity between different pairs of colors.

A can be diagonalised. Thus, there exists a discrete color space in which they are not correlated at all for $h_l$ and $i_l$.

The similarity calculation becomes a weighted L2 metric in a suitably chosen color space:

$d_A(H, I) = (H - I)^T A (H - I) = \sum_{l=1}^{n} w_l (h_l - i_l)^2$

where $w_l$ are the eigenvalues of the matrix A.

Multimedia Information Retrieval

Revision: Similarity Measurement-2

• We can also take an alternative approach.

X – the query histogram; Y – the histogram of an image in the database

Z – the bin-to-bin similarity histogram, with Transpose(Z) = {|I-H|}

The similarity between X and Y: $\|Z\| = Z^T A Z$

where A is a symmetric color similarity matrix with a(i,j) = 1 - d(ci,cj)/dmax, ci and cj are the i-th and j-th color bins in the color histogram, d(ci,cj) is the color distance in the mathematical transform to Munsell color space, and dmax is the maximum distance between any two colors in the color space.

Multimedia Information Retrieval

Revision – Color Coherence Vector

• Color coherence vector (CCV) is a tool to distinguish images whose color histograms are indistinguishable.

• The coherence measure classifies pixels as either coherent or incoherent.

• 4 steps to compute CCV

  • Step1: Conduct average filtering.

  • Step2: Discretize the image into n distinct colors.

  • Step3: Classify the pixels within a given color bucket as either coherent or incoherent. A pixel is coherent if the size of its connected component reaches a fixed value.

  • Step4: Obtain the CCV by collecting the counts of both coherent and incoherent pixels.

Multimedia Information Retrieval

Revision – CCV (Step 1,2)

• Step1 – Since the average filter has already been applied, this step is skipped.

• Step2 – Discretize the image

  • We discretize both images into three distinct colors.

Multimedia Information Retrieval

Revision – CCV (Step 3)

• Step3 – Classify the pixels as either coherent or incoherent. A pixel is coherent if the size of its connected component is at least the threshold of 3 (component D, of size 3, is counted as coherent in the table below); otherwise, the pixel is incoherent.

Discretized Image

1 2 1 3 3
1 2 1 2 3
1 1 1 2 2
1 3 1 3 2
1 3 1 2 2

Connected Component

A B A D D
A B A E D
A A A E E
A C A F E
A C A E E

Connected Table

Label   A    B   C   D   E   F
Color   1    2   3   3   2   3
Size    11   2   2   3   6   1

Color Coherence Vector

Color   1    2   3
α       11   6   3
β       0    2   3

Multimedia Information Retrieval

Answer to Question 2 - CCV

Image A:

1 2 1 3 3
1 2 1 2 3
1 1 1 2 2
1 3 1 3 2
1 3 1 2 2

Image B:

1 2 1 2 1
2 2 1 2 1
3 1 1 1 1
3 3 3 3 3
2 2 2 3 3

Color Coherence Vector of Image A

Color   1    2   3
α       11   6   3
β       0    2   3

Color Coherence Vector of Image B

Color   1    2   3
α       8    6   8
β       1    2   0
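The two CCV tables can be reproduced with a flood-fill sketch. It assumes 4-connected components and counts a component as coherent when its size is at least the threshold of 3, which is the rule the tables above imply (the size-3 component of color 3 in image A is counted under α). The function name is hypothetical.

```python
def ccv(image, colors, threshold=3):
    """Return [(alpha, beta), ...] per color: coherent and incoherent pixel counts."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    alpha = {c: 0 for c in colors}
    beta = {c: 0 for c in colors}
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Flood fill one connected component of equal color.
            color, size, stack = image[r][c], 0, [(r, c)]
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] == color):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            target = alpha if size >= threshold else beta
            target[color] += size
    return [(alpha[c], beta[c]) for c in colors]


image_a = [[1, 2, 1, 3, 3], [1, 2, 1, 2, 3], [1, 1, 1, 2, 2],
           [1, 3, 1, 3, 2], [1, 3, 1, 2, 2]]
image_b = [[1, 2, 1, 2, 1], [2, 2, 1, 2, 1], [3, 1, 1, 1, 1],
           [3, 3, 3, 3, 3], [2, 2, 2, 3, 3]]

ccv_a = ccv(image_a, [1, 2, 3])
ccv_b = ccv(image_b, [1, 2, 3])
print(ccv_a)
print(ccv_b)
```

For each color, α + β equals the histogram count, so the CCV refines the histogram rather than replacing it.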


Multimedia Information Retrieval

Answer to Question 2

Image A

Bin                     1   2   3
Histogram              11   8   6
Cumulative Histogram   11  19  25
CCV α                  11   6   3
CCV β                   0   2   3

Image B

Bin                     1   2   3
Histogram               9   8   8
Cumulative Histogram    9  17  25
CCV α                   8   6   8
CCV β                   1   2   0


Multimedia Information Retrieval

Answer to Question 2

• L1 Histogram Distance
D = |11-9| + |8-8| + |6-8| = 4

• L1 Cumulative Histogram Distance
D = |11-9| + |19-17| + |25-25| = 4

• L2 Histogram Distance
$D = \sqrt{(11-9)^2 + (8-8)^2 + (6-8)^2} = \sqrt{8} \approx 2.83$

• Histogram Intersection
D = [min(11,9) + min(8,8) + min(6,8)] / (11 + 8 + 6)
  = [9 + 8 + 6] / min(25, 25)
  = 23 / 25
  = 0.92
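The four histogram measures can be checked numerically. The following Python sketch uses the two histograms from the tables for Image A and Image B. The function names are illustrative only, and the intersection is normalized by the smaller histogram total, as in the worked answer.

```python
import math
from itertools import accumulate

hist_a = [11, 8, 6]  # Image A, bins 1..3
hist_b = [9, 8, 8]   # Image B, bins 1..3


def l1_distance(h1, h2):
    """Sum of absolute bin differences."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def l2_distance(h1, h2):
    """Euclidean distance between the bin vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def histogram_intersection(h1, h2):
    """Shared mass divided by the smaller histogram total."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    return overlap / min(sum(h1), sum(h2))


cum_a = list(accumulate(hist_a))  # [11, 19, 25]
cum_b = list(accumulate(hist_b))  # [9, 17, 25]

print(l1_distance(hist_a, hist_b))             # 4
print(l1_distance(cum_a, cum_b))               # 4
print(round(l2_distance(hist_a, hist_b), 2))   # 2.83
print(histogram_intersection(hist_a, hist_b))  # 0.92
```

Note that histogram intersection is a similarity (1 means identical histograms), whereas the other three are distances (0 means identical).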


Multimedia Information Retrieval

Answer to Question 2

• Normalized CCV

D = |(11-8)/(11+8+1)| + |(0-1)/(0+1+1)|
  + |(6-6)/(6+6+1)| + |(2-2)/(2+2+1)|
  + |(3-8)/(3+8+1)| + |(3-0)/(3+0+1)|
  = 3/20 + 1/2 + 5/12 + 3/4
  = 1.8167
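The normalized CCV distance sums one α term and one β term per color, each of the form |x_A - x_B| / (x_A + x_B + 1). The Python sketch below evaluates that sum from the CCV tables of the two images (α = 11, 6, 3 and β = 0, 2, 3 for Image A; α = 8, 6, 8 and β = 1, 2, 0 for Image B). The function name and the dictionary layout are assumptions for illustration.

```python
# CCVs as {color: (alpha, beta)}, copied from the tables for Image A and Image B.
ccv_a = {1: (11, 0), 2: (6, 2), 3: (3, 3)}
ccv_b = {1: (8, 1), 2: (6, 2), 3: (8, 0)}


def normalized_ccv_distance(ccv1, ccv2):
    """Sum of |x1 - x2| / (x1 + x2 + 1) over the alpha and beta of every color."""
    total = 0.0
    for color in sorted(set(ccv1) | set(ccv2)):
        a1, b1 = ccv1.get(color, (0, 0))
        a2, b2 = ccv2.get(color, (0, 0))
        total += abs(a1 - a2) / (a1 + a2 + 1)  # coherent term
        total += abs(b1 - b2) / (b1 + b2 + 1)  # incoherent term
    return total


distance = normalized_ccv_distance(ccv_a, ccv_b)
print(round(distance, 4))
```

The "+ 1" in each denominator avoids division by zero when a color has no coherent or no incoherent pixels in either image.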


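• Niblack’s similarity measure

Transpose(Z) = [|11-9|, |8-8|, |6-8|] = [2, 0, 2]

$$
D = Z^T A Z
= \begin{bmatrix} 2 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0.5 & 0 \\ 0.5 & 1 & 0.5 \\ 0 & 0.5 & 1 \end{bmatrix}
\begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix}
= 8
$$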
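The quadratic form can be evaluated in two steps: first multiply A by Z, then take the dot product with Z. The sketch below does this in plain Python. The function name is an assumption, and the matrix entries are those given on the slide (1 on the diagonal, 0.5 for neighbouring bins, 0 otherwise).

```python
hist_a = [11, 8, 6]
hist_b = [9, 8, 8]

# Bin-similarity matrix from the slide: identical bins score 1,
# neighbouring bins 0.5, and bins two apart 0.
A = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]


def quadratic_form_distance(h1, h2, similarity):
    """Compute Z^T A Z, where Z holds the absolute bin differences."""
    z = [abs(a - b) for a, b in zip(h1, h2)]
    az = [sum(row[j] * z[j] for j in range(len(z))) for row in similarity]
    return sum(zi * azi for zi, azi in zip(z, az))


print(quadratic_form_distance(hist_a, hist_b, A))  # 8.0
```

Here A·Z = [2, 2, 2], so the result is 2·2 + 0·2 + 2·2 = 8. The off-diagonal entries let differences in similar bins contribute jointly, which a plain L2 distance ignores.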


Multimedia Information Retrieval

Question 3

Consider the boundary and the numbering schemes

• Digitize the boundary

• Select the red node as the starting point and calculate the chain code in the counter-clockwise direction

• Calculate the normalized chain code by using the first difference of the chain code


Multimedia Information Retrieval

Revision – Chain Code

• Revision on Chain Code

  • Represent a boundary by a connected sequence of straight-line segments of specified length and direction

  • Based on 4- or 8-connectivity

  • Depends on the starting point and the spacing of the sampling grid


Multimedia Information Retrieval

Revision – Chain Code

• Revision on Chain Code

• Steps to calculate chain code

  • Digitize image

    • Decide on a sampling grid; the denser the grid, the more accurate the result

    • Assign a boundary point to a grid node based on the distance d from the grid node to the boundary: if d < threshold, assign the boundary point to the node

  • Select a numbering scheme and define a starting point

  • Follow the boundary in a specified direction (clockwise or counter-clockwise) and assign a direction to the segments connecting neighboring boundary points

[Figure: distance d from a grid node to the boundary, and the 4-direction and 8-direction numbering schemes]


Multimedia Information Retrieval

Answer to Question 3

• Steps to solve question 3

1. Digitize input boundary

  • Straightforward, since the sampling grid and distance threshold are given as the rectangular grid and the dotted circles


Multimedia Information Retrieval

Answer to Question 3

• Steps to solve question 3

2. Boundary following is a bit more complex for 4-connectivity (starting point given as the red point, direction given as counter-clockwise)

Some ancillary boundary points have to be added

The chain code is 233300011122

[Figure: digitized boundary with the 4-direction code of each segment labelled]
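Once the boundary points are in visiting order, the chain code follows mechanically from the step between each pair of neighbours. The Python sketch below illustrates this. The figure's actual coordinates are not available in the text, so the `boundary` list is a hypothetical 3x3 square chosen only because it produces the same code as the answer. The numbering (0 = right, 1 = up, 2 = left, 3 = down) is the common 4-direction scheme and is likewise assumed.

```python
# Assumed 4-direction numbering: 0 = +x, 1 = +y, 2 = -x, 3 = -y.
DIRECTION_CODE = {(1, 0): 0, (0, 1): 1, (-1, 0): 2, (0, -1): 3}


def chain_code(points):
    """Chain code of a closed boundary given as grid points in visiting order."""
    codes = []
    # Pair every point with its successor; the last point wraps to the first.
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTION_CODE:
            raise ValueError(f"{(x0, y0)} and {(x1, y1)} are not 4-neighbours")
        codes.append(DIRECTION_CODE[step])
    return "".join(str(c) for c in codes)


# Hypothetical boundary: a 3x3 square traversed counter-clockwise,
# starting one grid step to the right of its top-left corner.
boundary = [
    (1, 3), (0, 3),          # left along the top edge
    (0, 2), (0, 1), (0, 0),  # down the left edge
    (1, 0), (2, 0), (3, 0),  # right along the bottom edge
    (3, 1), (3, 2), (3, 3),  # up the right edge
    (2, 3),                  # left along the top edge, back toward the start
]
print(chain_code(boundary))  # 233300011122
```

The `ValueError` branch is where ancillary points matter: a diagonal step between two boundary points has no 4-direction code, so an intermediate grid node must be inserted first.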


Multimedia Information Retrieval

Answer to Question 3

• Steps to solve question 3

  • The addition of ancillary points should be consistent with the direction of boundary following

[Figure: digitized boundary with ancillary points added and the 4-direction code of each segment labelled]


Multimedia Information Retrieval

Answer to Question 3

• Steps to solve question 3

3. Normalize for rotation by using the first difference of the 4-direction chain code

  • The difference is obtained by counting the number of direction changes (in the counter-clockwise direction) that separate two adjacent elements of the code.

The chain code is 233300011122

The first difference of the chain code is 10010010010

[Figure: digitized boundary with the 4-direction code of each segment labelled]
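The first difference is simply (next - current) mod 4 for each adjacent pair of codes. The Python sketch below computes it and checks the rotation property. The function names are illustrative, and, as in the answer, the code is treated as an open sequence, so 12 codes give 11 differences (a circular variant would also compare the last code with the first).

```python
def first_difference(code, directions=4):
    """Counter-clockwise direction changes between adjacent code elements."""
    digits = [int(ch) for ch in code]
    return "".join(str((b - a) % directions) for a, b in zip(digits, digits[1:]))


def rotate(code, quarter_turns, directions=4):
    """Rotate the boundary by multiples of 90 degrees (shifts every code)."""
    return "".join(str((int(ch) + quarter_turns) % directions) for ch in code)


code = "233300011122"
print(first_difference(code))             # 10010010010
print(first_difference(rotate(code, 1)))  # 10010010010 (unchanged by rotation)
```

Rotating the shape by 90 degrees adds the same constant to every code, and that constant cancels in the difference, which is why the first difference normalizes for rotation.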