Query by Tapping (敲擊選歌, "song selection by tapping")

Page 1: Query by Tapping 敲擊選歌

Query by Tapping (敲擊選歌)

J.-S. Roger Jang (張智星)
Multimedia Information Retrieval Lab
CS Dept., Tsing Hua Univ., Taiwan
http://mirlab.org/jang

Page 2: Query by Tapping 敲擊選歌

Query by Tapping

Goal: music search based on users' tapping (at note onsets) over a microphone or keyboard

Characteristics:
  • Only note durations are used for comparison; note pitch is discarded.
  • Identifying a song from tapping alone is hard even for a human listener, which makes this task different from query by singing/humming. Try this…

Page 4: Query by Tapping 敲擊選歌

Query by Tapping

Challenges:
  • Users are unlikely to tap at the same tempo as the intended song.
  • Users tend to drop notes rather than add extra ones.
  • The database holds about 13,000 songs.

Major approach:
  • A distance measure based on dynamic programming

Page 5: Query by Tapping 敲擊選歌

Feature Extraction via Microphone

[Figure: microphone input waveform]

[Figure: the same signal after frame blocking, energy computation, and thresholding]
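The three steps named on this slide (frame blocking, energy computation, thresholding) can be sketched roughly as follows. This is an illustrative sketch only, not the system's actual code: the function names, the frame size, the threshold value, and the use of non-overlapping frames with mean-square energy are all assumptions.

```python
def detect_onsets(samples, sample_rate, frame_size=256, threshold=0.1):
    """Return onset times in seconds (hypothetical helper, not from the slides).

    An onset is reported at the start of each frame whose energy rises
    above the threshold after a frame that was below it.
    """
    onsets = []
    above = False
    # Frame blocking: non-overlapping frames (an assumption).
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        # Energy computation: mean-square energy of the frame (an assumption).
        energy = sum(x * x for x in frame) / frame_size
        # Thresholding: keep only the rising edge.
        if energy > threshold and not above:
            onsets.append(start / sample_rate)
        above = energy > threshold
    return onsets


def inter_onset_intervals(onsets):
    """Durations between consecutive onsets; pitch plays no role here."""
    return [b - a for a, b in zip(onsets, onsets[1:])]


# Synthetic one-second signal with two "taps" (made-up data).
signal = [0.0] * 1000
signal[200:300] = [1.0] * 100
signal[700:800] = [1.0] * 100
taps = detect_onsets(signal, sample_rate=1000, frame_size=100)
print(taps, inter_onset_intervals(taps))
```

The resulting inter-onset intervals are the only feature that the later comparison steps use.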

Page 6: Query by Tapping 敲擊選歌

Performance Evaluation of Onset Detection

[Figure (simSequence.m): computed onsets plotted against ground-truth (GT) onsets; horizontal axis 0 to 6, vertical axis -1 to 1]

precision = 3/6 = 0.5
recall = 3/5 = 0.6
F-measure = 2pr/(p+r) = 0.5455
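The arithmetic above can be reproduced with a small scoring routine. The sketch below is not simSequence.m; the function name, the greedy first-match strategy, and the 0.05 tolerance are assumptions, and the onset times are made-up values chosen only so that 3 of 6 computed onsets match 3 of 5 ground-truth onsets.

```python
def onset_scores(computed, ground_truth, tolerance=0.05):
    """Precision, recall, and F-measure for detected onsets (hypothetical helper).

    A computed onset counts as a hit if an unused ground-truth onset lies
    within `tolerance`; each ground-truth onset can be matched only once.
    """
    unmatched = list(ground_truth)
    hits = 0
    for c in computed:
        match = next((g for g in unmatched if abs(g - c) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    precision = hits / len(computed) if computed else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Made-up onset times: 6 computed, 5 ground truth, 3 in agreement.
computed = [0.5, 1.2, 2.0, 3.1, 4.4, 5.5]
ground_truth = [0.5, 1.5, 2.0, 3.1, 5.0]
p, r, f = onset_scores(computed, ground_truth)
print(p, r, round(f, 4))
```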

Page 7: Query by Tapping 敲擊選歌

Similarity Comparison with Songs in Database

A fast method based on IOI (inter-onset interval) ratios:
  • Compute the IOI ratios for both the query and the database IOI vectors.
  • Compute the Euclidean distance between these two ratio vectors.
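A minimal sketch of the two steps above, under stated assumptions: the slide does not define "IOI ratio", so the code takes it to be the ratio of each IOI to the one before it, and it truncates the longer ratio vector so the two can be compared. Both choices and all function names are hypothetical.

```python
import math


def ioi_ratios(ioi):
    """Ratio of each IOI to its predecessor (assumed definition)."""
    return [b / a for a, b in zip(ioi, ioi[1:])]


def ratio_distance(query_ioi, db_ioi):
    """Euclidean distance between the two ratio vectors.

    The longer vector is truncated to the length of the shorter one,
    which is an assumption made for this sketch.
    """
    q = ioi_ratios(query_ioi)
    d = ioi_ratios(db_ioi)
    n = min(len(q), len(d))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q[:n], d[:n])))


# The same rhythm tapped at half speed yields identical ratios.
print(ratio_distance([0.5, 0.5, 1.0], [0.25, 0.25, 0.5, 0.5]))
```

Because a uniform tempo change scales every IOI by the same factor, the ratios cancel it out, which is one way to address the tempo mismatch listed among the challenges.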

Page 8: Query by Tapping 敲擊選歌

Music Note Alignment

t: test (input) IOI vector
r: reference IOI vector

[Figure: the test IOIs t(1), t(2), t(3) and the reference IOIs r(1), r(2), r(3) are first normalized and then aligned by DP]

Page 9: Query by Tapping 敲擊選歌

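Normalization

Normalization to have

$$\sum_{i=1}^{p} \tilde{t}(i) = \sum_{j=1}^{q} \tilde{r}_q(j) = 1000$$

(Multiplication by 1000 guarantees high resolution in fixed-point computation.)

$$\tilde{t} = \mathrm{round}\bigl(1000 \cdot t / \mathrm{sum}(t)\bigr), \qquad \tilde{r}_q = \mathrm{round}\bigl(1000 \cdot r(1{:}q) / \mathrm{sum}(r(1{:}q))\bigr)$$

$$\mathrm{dist}(t, r) = \min_{p-2 \le q \le p+2} D(\tilde{t}, \tilde{r}_q)$$

Here p is the number of IOIs in t, and r(1:q) denotes the first q IOIs of r.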
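The normalization step might look like the sketch below. It assumes that the reference prefix length q ranges over p-2 to p+2 (clipped to the available reference length) and that rounding is half-up as in MATLAB's round; the function names and the `slack` parameter are hypothetical.

```python
import math


def normalize(ioi, total=1000):
    """Scale an IOI vector so it sums to roughly `total`, as integers.

    math.floor(x + 0.5) rounds halves upward, unlike Python's built-in
    round(), which rounds halves to the nearest even integer.
    """
    s = sum(ioi)
    return [math.floor(total * x / s + 0.5) for x in ioi]


def normalized_candidates(t, r, slack=2):
    """Normalize t and every reference prefix r(1:q) for q near len(t).

    The range p - slack .. p + slack is an assumption, clipped to 1..len(r).
    """
    p = len(t)
    t_norm = normalize(t)
    candidates = {}
    for q in range(max(1, p - slack), min(len(r), p + slack) + 1):
        candidates[q] = normalize(r[:q])
    return t_norm, candidates


t_norm, cands = normalized_candidates([1, 1, 2], [2, 2, 4, 2, 2, 4, 2])
print(t_norm)          # normalized test vector
print(sorted(cands))   # prefix lengths q that were tried
print(cands[3])        # normalized r(1:3)
```

Each candidate would then be scored against the normalized test vector with the DP distance D, and the smallest score kept as dist(t, r).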

Page 10: Query by Tapping 敲擊選歌

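Dynamic-programming-based Distance

t: test IOI vector of length m
r: reference IOI vector of length n

Recurrent relation:

$$D(i,j) = \min \begin{cases} D(i-1,\,j-2) + \lvert r(j-2) + r(j-1) - t(i-1) \rvert \\ D(i-1,\,j-1) + \lvert t(i-1) - r(j-1) \rvert \\ D(i-2,\,j-1) + \lvert t(i-2) + t(i-1) - r(j-1) \rvert \end{cases}$$

Boundary conditions:

$$D(i,1) = 0,\ i = 1 \sim m+1; \qquad D(1,j) = 0,\ j = 1 \sim n+1$$

(Two further subscripted constants, indexed 1 and 2, appear in the source relation, but their symbols and positions are not recoverable.)

[Figure: DP grid over (i, j), with the test IOIs t(1), t(2), ..., t(i-2), t(i-1) on one axis and the reference IOIs r(1), r(2), ..., r(j-2), r(j-1) on the other]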
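A DP distance of this kind could be sketched as below. The sketch makes several assumptions that the slide text does not confirm: the local cost is an absolute difference, grid point (i, j) sits after IOIs t(i-1) and r(j-1), the first row and column are zero, and no extra penalty constants are added. The function name is hypothetical.

```python
def dp_distance(t, r):
    """DP distance between a test and a reference IOI vector (sketch).

    Three local moves are allowed into grid point (i, j):
      one test IOI against one reference IOI,
      one test IOI against two merged reference IOIs (user dropped a note),
      two merged test IOIs against one reference IOI (user added a note).
    """
    m, n = len(t), len(r)
    inf = float("inf")
    # Grid indices run 1..m+1 and 1..n+1; index 0 is unused padding.
    D = [[inf] * (n + 2) for _ in range(m + 2)]
    for i in range(1, m + 2):
        D[i][1] = 0
    for j in range(1, n + 2):
        D[1][j] = 0

    def tt(i):  # 1-based access to t
        return t[i - 1]

    def rr(j):  # 1-based access to r
        return r[j - 1]

    for i in range(2, m + 2):
        for j in range(2, n + 2):
            best = D[i - 1][j - 1] + abs(tt(i - 1) - rr(j - 1))
            if j >= 3:
                best = min(best, D[i - 1][j - 2]
                           + abs(rr(j - 2) + rr(j - 1) - tt(i - 1)))
            if i >= 3:
                best = min(best, D[i - 2][j - 1]
                           + abs(tt(i - 2) + tt(i - 1) - rr(j - 1)))
            D[i][j] = best
    return D[m + 1][n + 1]


print(dp_distance([300, 300, 400], [300, 300, 400]))  # identical rhythms
print(dp_distance([600, 400], [300, 300, 400]))       # one tap was skipped
print(dp_distance([500, 500], [300, 700]))            # rhythms differ
```

The second call shows why the merge moves matter: a query that skips one tap still matches the reference at zero cost, which fits the observation that users tend to drop notes.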

Page 11: Query by Tapping 敲擊選歌

Experimental Environment

269 test wave files of tapping clips:
  • 9 contributors (7 males, 2 females)
  • Wave length: 15 seconds
  • Wave format: PCM, 11025 Hz, 8 bits, mono
  • Start position: beginning of a song

Environment:
  • Pentium III 800, 256 MB RAM

Database:
  • 11,744 MIDI files

Page 12: Query by Tapping 敲擊選歌

Test Results Using Clips of 15 Seconds

Average response time: 3.42 seconds (29.98 notes)

Recognition rates:
  • Top-1 (top 0.0085%): 15%
  • Top-10 (top 0.085%): 51%
  • Top-100 (top 0.85%): 80%
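The percentages in parentheses follow from the database size of 11,744 MIDI files, and a top-N recognition rate is the share of queries whose intended song ranks within the first N results. The sketch below illustrates both calculations; the function names are hypothetical and the list of ranks is made-up data, not the experiment's results.

```python
def top_n_fraction(n, db_size):
    """Percentage of the database covered by the top n results."""
    return 100.0 * n / db_size


def top_n_rate(ranks, n):
    """Share of queries whose correct song is ranked n or better."""
    return sum(1 for rank in ranks if rank <= n) / len(ranks)


for n in (1, 10, 100):
    print(n, top_n_fraction(n, 11744))

# Made-up ranks of the correct song for five hypothetical queries.
ranks = [1, 4, 12, 150, 7]
print(top_n_rate(ranks, 1), top_n_rate(ranks, 10))
```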

Page 13: Query by Tapping 敲擊選歌

Error Analysis

Error analysis of low-ranked clips:
  • Some users cannot tap consistently through 15 seconds.
  • Feature extraction is not robust enough to handle noisy input.
  • Some MIDI files are not faithful renditions of the original tunes.
  • Users cannot keep up with short consecutive notes.

Page 14: Query by Tapping 敲擊選歌

Recog. Rates w.r.t. Tapping Duration

  • The Top-100 and Top-1000 curves level off after 10 seconds.
  • The Top-100 curve does not go up monotonically.

[Figure: recognition rate versus tapping duration, with curves for Top-10, Top-100, and Top-1000]

Page 15: Query by Tapping 敲擊選歌

Demo

No. of MIDI files: 12982

Page 16: Query by Tapping 敲擊選歌

Partial List of Songs

  • All I have to do is dream
  • You are my sunshine
  • Beautiful Sunday
  • Do Re Mi
  • Feelings
  • A time for us
  • Love is blue
  • Let it be me
  • My way
  • Love story
  • More than I can say
  • Only you
  • Rain and tears
  • Rhythm of the rain
  • Rose Rose I love you
  • The sound of silence
  • Unchained melody
  • We are the world
  • Yesterday
  • I just called to say I love you
  • Close to you
  • Mr. Lonely
  • Ben
  • Hey Jude
  • Donna Donna
  • Sealed with a kiss

Page 17: Query by Tapping 敲擊選歌

Potential Applications

  • Interactive toys
  • Beat-tracking training and games
  • Song retrieval in noisy karaoke bars

Page 18: Query by Tapping 敲擊選歌

Conclusions

Our MIR system is the first one with query-by-tapping capability.

Rhythm-based search can be used in conjunction with pitch-contour-based search to achieve a better recognition rate.

Page 19: Query by Tapping 敲擊選歌

Future Work

Search scope expansion:
  • How to retrieve MP3 or CD music directly?

Scale-up by hierarchical filtering method:
  • How to deal with a database of 100,000 songs?
  • What if the user taps from anywhere in the middle of a song?