Birdsong Recognition 鳥類鳴聲辨識

Page 1

Birdsong Recognition (鳥類鳴聲辨識)

Chang-Hsing Lee (李建興), Professor, Department of Computer Science and Information Engineering, Chung Hua University

Page 2

Automatic Classification of Bird Species From Their Sounds Using Two-Dimensional Cepstral Coefficients

Chang-Hsing Lee, Chin-Chuan Han, and Ching-Chien Chuang, IEEE Trans. on Audio, Speech, and Language Processing, Vol. 16, No. 8, Nov. 2008, pp. 1541-1550.

Page 3

System Framework

Training: Training syllable → Feature Extraction → PCA → LDA → Prototype Vectors Generation → Feature Database

Classification: Test syllable → Feature Extraction → PCA Transformation → LDA Transformation → Classification → Classified Bird Species s_c

Page 4

Feature Extraction

Two-dimensional Mel-frequency cepstral coefficient (TDMFCC)

[Figure: matrix of MFCCs over time → DCT → TDMFCC]
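The slide only shows an MFCC-over-time matrix passing through a DCT. The following is a minimal sketch of that idea, assuming the DCT is an unnormalized DCT-II applied along the time axis of each MFCC trajectory and that only the first few time coefficients are kept. The function names `dct_ii` and `tdmfcc` and the parameter `n_time_coeffs` are hypothetical, and the paper's exact definition may differ.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II of a sequence x."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def tdmfcc(mfcc_frames, n_time_coeffs):
    """Apply a DCT along time to every MFCC trajectory.

    mfcc_frames[t][q] is the q-th MFCC of frame t (assumed layout).
    Returns out[q][k]: k-th time-DCT coefficient of the q-th MFCC.
    """
    n_ceps = len(mfcc_frames[0])
    out = []
    for q in range(n_ceps):
        trajectory = [frame[q] for frame in mfcc_frames]
        out.append(dct_ii(trajectory)[:n_time_coeffs])
    return out
```

A syllable of any length is thereby summarized by a fixed-size matrix, which is what makes the representation convenient for a syllable-level classifier.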

Page 5

Feature Extraction (cont.)

Dynamic Two-dimensional MFCC (DTDMFCC)

\[
a_i(j) = \frac{\sum_{n=1}^{n_0} n \left( E_{i+n}(j) - E_{i-n}(j) \right)}{\sum_{n=-n_0}^{n_0} n^2}
\]
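The regression formula on this slide can be sketched in a few lines. This assumes the first index is the frame index, the second is the band or coefficient index, and frames beyond the edges are replaced by the nearest valid frame. The edge handling and the name `delta_coefficients` are assumptions, not taken from the slides.

```python
def delta_coefficients(frames, n0=2):
    """Regression (delta) coefficients across frames.

    frames[i][j] is the value for frame i and index j (assumed layout).
    Frames outside the sequence are clamped to the first or last frame.
    """
    num_frames = len(frames)
    # Denominator: sum of n^2 for n = -n0 .. n0.
    denom = sum(n * n for n in range(-n0, n0 + 1))
    deltas = []
    for i in range(num_frames):
        row = []
        for j in range(len(frames[i])):
            acc = 0.0
            for n in range(1, n0 + 1):
                ahead = frames[min(i + n, num_frames - 1)][j]
                behind = frames[max(i - n, 0)][j]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas
```

For a sequence that rises by one unit per frame, the interior deltas equal 1, which is a quick sanity check on the normalization.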

Page 6

Prototype Vector Generation

Gaussian mixture model (GMM) vs. Vector quantization (VQ)

Acoustic Model Selection – Bayesian information criterion (BIC)

Component Number Selection – self-splitting Gaussian mixture learning (SGML)
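As an illustration of BIC-based model selection, the sketch below scores each candidate with the common Schwarz form, k ln N − 2 ln L, and keeps the candidate with the lowest score. The sign convention, the candidate names, and the likelihood and parameter counts in the usage example are all assumptions for illustration; the slides do not give the exact criterion used.

```python
import math


def bic_score(log_likelihood, n_params, n_samples):
    """Schwarz-form BIC: lower is better under this convention."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_model(candidates, n_samples):
    """Pick the candidate with the smallest BIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    return min(
        candidates,
        key=lambda name: bic_score(*candidates[name], n_samples),
    )


# Hypothetical numbers, for illustration only.
example = {"GMM": (-120.0, 13), "VQ": (-125.0, 6)}
chosen = select_model(example, n_samples=100)
```

Here the simpler model wins because its slightly worse fit is outweighed by the smaller complexity penalty.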

Page 7

Experimental Results

28 bird species

Training set – 3143 syllables
  • Yushan National Park, CD Sound of the Mountain IV: The songs of Wild Birds
  • Yushan National Park, CD Sound of the Mountain V: The songs of Wild Birds

Test set – 646 syllables
  • Downloaded from the website of National Fonghuanggu Bird Park

Page 8

Experimental Results (cont.)

Comparison of classification results for different PCA thresholds
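The slides do not define the PCA threshold. A common reading, assumed here, is the fraction of total variance that the retained principal components must explain. Under that assumption, the number of components follows from the sorted eigenvalues of the covariance matrix. The function name and the eigenvalues in the usage example are hypothetical.

```python
def components_for_threshold(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative
    eigenvalue share reaches the given threshold (e.g. 0.97)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count
    return len(ordered)


# Hypothetical eigenvalue spectrum, for illustration only.
kept = components_for_threshold([5.0, 3.0, 1.0, 0.5, 0.5], 0.85)
```

Raising the threshold keeps more components, which trades a higher feature dimension against less discarded variance.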

Page 9

Experimental Results (cont.)

Summary of classification accuracy (CA), selected model (EVQ or GMM), and cluster number (Ns) for each bird species using SDTDMFCC when the PCA threshold = 0.97

Subject Code  Bird Name                         CA (%)  Ns  Selected Model
1             Crested Serpent Eagle             100.00  2   EVQ
2             Bronzed Drongo                     86.49  5   EVQ
3             Gray-headed Pygmy Woodpecker        0.00  1   EVQ
4             Blue Shortwing                     72.41  4   EVQ
5             Streak-breasted Scimitar Babbler   54.55  3   GMM
6             Taiwan Firecrest                  100.00  3   EVQ
7             Taiwan Sibia                      100.00  6   EVQ
8             White-throated Laughing Thrush     94.59  3   EVQ
9             White-breasted Water Hen          100.00  4   EVQ
10            Beavan's Bullfinch                100.00  3   EVQ
11            Gray-sided Laughing Thrush        100.00  3   EVQ
12            Alpine Accentor                    71.70  1   EVQ
13            Green-backed Tit                    7.14  5   EVQ
14            Taiwan Yuhina                     100.00  3   EVQ

Page 10

Experimental Results (cont.)

Summary of classification accuracy (CA), selected model (EVQ or GMM), and cluster number (Ns) for each bird species using SDTDMFCC when the PCA threshold = 0.97 (cont.)

Subject Code  Bird Name                         CA (%)  Ns  Selected Model
15            Red-headed Tit                    100.00  2   EVQ
16            Collared Bush Robin                94.44  9   EVQ
17            Taiwan Bulbul                      83.33  5   EVQ
18            Taiwan Hill Partridge              88.89  6   EVQ
19            Verreaux's Bush Warbler           100.00  4   EVQ
20            Oriental Cuckoo                    95.56  3   GMM
21            Taiwan Tit                         96.30  7   EVQ
22            Vivid Niltava                     100.00  5   EVQ
23            Coal Tit                          100.00  4   EVQ
24            Crested Goshawk                   100.00  3   EVQ
25            Gould's Fulvetta                   33.33  1   EVQ
26            Collared Pigmy Owlet              100.00  1   EVQ
27            Swinhoe's Pheasant                100.00  3   EVQ
28            Steere's Liocichla                 80.00  3   EVQ

Page 11

Continuous Birdsong Recognition Using Gaussian Mixture Modeling of Image Shape Features

Chang-Hsing Lee, Sheng-Bin Hsu, Jau-Ling Shih, and Chih-Hsun Chou, IEEE Trans. on Multimedia, Vol. 15, No. 2, Feb. 2013, pp. 454-463.

Page 12

System Framework

Page 13

Feature Extraction

• Angular Radial Transformation (ART) Feature

Page 14

Feature Extraction (cont.)

Step 1: Spectrogram Generation

[Figure: music waveform, zoomed in, divided into overlapping frames (labels: Frame, Overlap)]

Page 15

Feature Extraction (cont.)

Step 1: Spectrogram Generation (cont.)

[Figure: frame decomposition followed by spectrum analysis of each frame (axis: frequency)]
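Frame decomposition followed by spectrum analysis can be sketched as below. The frame length, hop size, Hamming window, and the naive DFT are all assumptions chosen to keep the example dependency-free; the slides do not state the actual analysis parameters, and a real system would use an FFT.

```python
import cmath
import math


def split_into_frames(samples, frame_length, hop_length):
    """Cut the signal into frames; consecutive frames overlap by
    frame_length - hop_length samples."""
    return [
        samples[start:start + frame_length]
        for start in range(0, len(samples) - frame_length + 1, hop_length)
    ]


def magnitude_spectrum(frame):
    """Hamming-windowed DFT magnitudes for bins 0 .. N/2."""
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * t / (n - 1)))
        for t, x in enumerate(frame)
    ]
    return [
        abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_length=64, hop_length=32):
    """One magnitude spectrum per frame: spec[frame][frequency_bin]."""
    frames = split_into_frames(samples, frame_length, hop_length)
    return [magnitude_spectrum(frame) for frame in frames]
```

Each row of the result is one column of the spectrogram image, so stacking the rows over time gives the time-frequency picture used in the later steps.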

Page 16

Feature Extraction (cont.)

Step 1: Spectrogram Generation (cont.)

[Figure: a waveform and the corresponding spectrogram]

Page 17

Feature Extraction (cont.)

Step 1: Spectrogram Generation (cont.)

[Figure: spectrogram examples for four bird species]
  • Crested Goshawk (鳳頭蒼鷹)
  • Taiwan Firecrest (火冠戴菊鳥)
  • Taiwan Sibia (白耳畫眉)
  • Vivid Niltava (黃腹琉璃)

Page 18

Feature Extraction (cont.)

Step 2: Recognition window segmentation

Page 19

Feature Extraction (cont.)

Step 3: Sector image generation

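Page 20

Feature Extraction (cont.)

Step 3: Sector image generation (cont.)

[Equations: coordinate mapping between the spectrogram window (t, f) and the sector image (u, v), using centred coordinates u − 256 and v − 256. The forward mapping is expressed with sin and cos; the inverse recovers f from the square root of (u − 256)² + (v − 256)² and t from the arctangent of the ratio of the centred coordinates.]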
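To make the idea of a sector image concrete, here is a rough sketch that warps a rectangular spectrogram window onto a circular sector. The specific mapping is an assumption, not the slide's equations: radius is taken from the frequency index, angle from the time index, the sector spans a quarter circle, and the centre sits at the middle of the image. The function name `sector_image` and its parameters `size` and `angle_span` are hypothetical.

```python
import math


def sector_image(window, size=64, angle_span=math.pi / 2):
    """Warp window[f][t] onto a sector inside a size x size image.

    Assumed mapping: radius <- frequency index, angle <- time index,
    centre at (size / 2, size / 2). Pixels outside the sector stay 0.
    """
    n_freq = len(window)
    n_time = len(window[0])
    half = size / 2.0
    image = [[0.0] * size for _ in range(size)]
    for v in range(size):
        for u in range(size):
            du, dv = u - half, v - half          # centred coordinates
            radius = math.hypot(du, dv)
            angle = math.atan2(du, dv)
            if radius >= half or angle < 0.0 or angle >= angle_span:
                continue
            f = int(radius / half * n_freq)      # inverse map to frequency
            t = int(angle / angle_span * n_time) # inverse map to time
            image[v][u] = window[f][t]
    return image
```

Filling the image by looping over output pixels and mapping back to (t, f) avoids holes in the sector, which a forward mapping from the window would leave.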

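Page 21

Feature Extraction (cont.)

Step 4: ART feature extraction

\[
F(n, m) = \langle V_{n,m}(\rho, \theta),\, I_S(\rho, \theta) \rangle
        = \int_0^{2\pi} \int_0^1 V_{n,m}^{*}(\rho, \theta)\, I_S(\rho, \theta)\, \rho \, d\rho \, d\theta
\]

Vn,m(ρ, θ): the ART basis function of order n and m, which is separable along the angular and radial directions:

\[
V_{n,m}(\rho, \theta) = A_m(\theta)\, R_n(\rho)
\]

where

\[
A_m(\theta) = \frac{1}{2\pi} e^{jm\theta}, \qquad
R_n(\rho) = \begin{cases} 1, & n = 0 \\ 2\cos(\pi n \rho), & n \neq 0 \end{cases}
\]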
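The ART definitions on this slide translate fairly directly into code. The sketch below approximates the double integral with a midpoint rule on a polar grid. Passing the image as a callable `image(rho, theta)`, the grid resolution, and the function names are assumptions made for illustration; a practical implementation would sample a pixel image instead.

```python
import cmath
import math


def art_basis(n, m, rho, theta):
    """ART basis V_{n,m}(rho, theta) = A_m(theta) * R_n(rho)."""
    angular = cmath.exp(1j * m * theta) / (2.0 * math.pi)
    radial = 1.0 if n == 0 else 2.0 * math.cos(math.pi * n * rho)
    return angular * radial


def art_coefficient(image, n, m, n_rho=64, n_theta=128):
    """Midpoint-rule approximation of F(n, m) over the unit disk.

    image is a hypothetical callable image(rho, theta) -> float.
    """
    d_rho = 1.0 / n_rho
    d_theta = 2.0 * math.pi / n_theta
    total = 0j
    for a in range(n_rho):
        rho = (a + 0.5) * d_rho
        for b in range(n_theta):
            theta = (b + 0.5) * d_theta
            basis = art_basis(n, m, rho, theta)
            # Conjugated basis times image value times polar area element.
            total += basis.conjugate() * image(rho, theta) * rho * d_rho * d_theta
    return total


def art_magnitudes(image, n_max, m_max):
    """|F(n, m)| for n < n_max and m < m_max, as a nested list."""
    return [[abs(art_coefficient(image, n, m)) for m in range(m_max)]
            for n in range(n_max)]
```

For a constant image of value 1, F(0, 0) evaluates to 1/2 and every F(0, m) with m ≠ 0 vanishes, which gives a simple check on the discretization.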

Page 22

Feature Extraction (cont.)

Step 4: ART feature extraction (cont.)

The 12 × 12 (N = 12 and M = 12) complex ART basis functions: (a) real parts of the ART basis functions; (b) imaginary parts of the ART basis functions

Page 23

Feature Extraction (cont.)

Step 4: ART feature extraction (cont.)

Page 24

Feature Extraction (cont.)

Step 4: ART feature extraction (cont.)

Page 25

Experimental Results

Common and Latin names of bird species in the birdsong database and the number of birdsong segments in the training set (NTr) and test set (NTe) for birdsong segments of different durations (D)

                                                             D = 3 seconds   D = 5 seconds
Common Name                       Latin Name                    NTr     NTe     NTr     NTe
Crested Serpent Eagle             Spilornis cheela              107       5     105       3
Bronzed Drongo                    Dicrurus aeneus               128      10     126       8
Gray-headed Pygmy Woodpecker      Dendrocopos canicapillus       50       9      48       7
Blue Shortwing                    Brachypteryx montana          172       6     170       4
Streak-breasted Scimitar Babbler  Pomatorhinus ruficollis       147      16     145       4
Taiwan Firecrest                  Regulus goodfellowi            92      10      90       8
Taiwan Sibia                      Heterophasia auricularis       97       5      95       3
White-throated Laughing Thrush    Garrulax albogularis           61       8      59       6
White-breasted Water Hen          Amaurornis phoenicurus         83       6      81       4
Beavan's Bullfinch                Pyrrhula erythaca             104       3     102       1
Gray-sided Laughing Thrush        Garrulax caerulatus            77      79      75      77
Alpine Accentor                   Prunella collaris              62       9      60       7
Green-backed Tit                  Parus monticolus              127       4     125       2
Taiwan Yuhina                     Yuhina brunneiceps             62       6      60       4

Page 26

Experimental Results (cont.)

                                                             D = 3 seconds   D = 5 seconds
Common Name                       Latin Name                    NTr     NTe     NTr     NTe
Red-headed Tit                    Aegithalos concinnus           98       9      96       7
Collared Bush Robin               Erithacus johnstoniae         147       5     145       3
Taiwan Bulbul                     Pycnonotus taivanus Styan      58       8      56       6
Taiwan Hill Partridge             Arborophila crudigularis      141      10     139       8
Verreaux's Bush Warbler           Cettia acanthizoides           72       8      70       6
Oriental Cuckoo                   Cuculus saturatus             124      10     122       8
Taiwan Tit                        Parus holsti                  116       7     114       5
Vivid Niltava                     Niltava vivida                 91       8      89       6
Coal Tit                          Parus ater                    105      10     103       8
Crested Goshawk                   Accipiter trivirgatus         113      11     111       9
Gould's Fulvetta                  Alcippe brunnea                41       7      39       5
Collared Pigmy Owlet              Glaucidium brodiei             59      16      57       9
Swinhoe's Pheasant                Lophura swinhoii               92       5      90       3
Steere's Liocichla                Liocichla steerii              57       6      55       4
Total number of birdsong segments                              2683     296    2627     225

Page 27

Experimental Results (cont.)

Comparison of classification accuracy for different numbers of GMM Gaussian components (G) and distinct PCA thresholds using 6 × 24 ART basis functions for the recognition of birdsong segments having distinct durations (D)

Page 28

Experimental Results (cont.)

Comparison of classification accuracy for distinct numbers of ART basis functions (N × M) in the classification of birdsong segments having different durations (D), with a fixed number of GMM components (G = 5)

Page 29

Experimental Results (cont.)

Comparison of various feature descriptors in terms of classification accuracy (CA)

            D = 3                           D = 5
Descriptor  CA (%)  (G, PCA threshold)      CA (%)  (G, PCA threshold)
LPCC         30.41  (50, 0.98/0.99)          40.00  (30, 0.99)
MFCC         46.62  (35, 0.98/0.99)          56.89  (45, 0.95/0.96/0.97)
TDMFCC       69.86  (10, 0.96)               77.13  (5, 0.95)
DTDMFCC      76.03  (5, 0.99)                83.86  (10, 0.99)
SDTDMFCC     73.63  (10, 0.95)               79.82  (10, 0.95/0.96)
ART          86.30  (5, 0.97/0.98)           94.62  (5, 0.95/0.97)

Page 30

Thanks!