8/11/2019 MFCC Feature Extraction Algorithm
1.1 Conventional MFCC feature extraction algorithm
The block diagram of the conventional MFCC feature extraction algorithm is shown in Figure 1.1-1. The name comes from the nonlinear warping of the frequency axis according to the Mel frequency scale. First, the speech is sampled by an A/D converter. Although the human ear can hear sound from 20 Hz to 20 kHz, ordinary speech lies mostly below 5 kHz, and telephone-quality audio is band-limited to 4 kHz. For this reason we use a 4 kHz bandwidth in this work, with a sampling frequency of 8 kHz.
Figure 1.1-1 Block diagram of the conventional MFCC feature extraction algorithm. Blocks shown: Speech samples, Pre-emphasis, Frame Blocking, Windowing, |FFT|, Mel Frequency Filter bank, Cepstrum, Logged energy, Delta, MFCC.
1.1.1 Pre-emphasis
After digitization, the speech is pre-emphasized with a first-order finite impulse response (FIR) filter, chosen because its phase is linear and it is simple to implement. In speech, the lower-frequency components usually carry more energy, so they are weighted more heavily in modeling than the higher-frequency components. A pre-emphasis filter is therefore used to boost the signal at higher frequencies. The transfer function of the filter is given by equation (1), where the parameter $a$ typically lies between 0.9 and 1. In the time domain, the relationship between output and input is given by equation (2), where $s_i$ is the i-th sample of the speech signal before filtering and $s'_i$ is the i-th sample after pre-emphasis.
$$H(z) = 1 - a z^{-1} \tag{1}$$
$$s'_i = s_i - a\, s_{i-1} \tag{2}$$
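With $a = 0.97$, the value used in software speech recognition systems, the frequency response of the filter is shown in Figure 1.1-2. At $\omega = \pi$ (half the sampling frequency) the magnitude is roughly 35 dB higher than at $\omega = 0$.
Figure 1.1-2 Frequency response of the pre-emphasis filter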
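The following is a minimal sketch of the pre-emphasis step, assuming the first-order difference form $s'_i = s_i - a\,s_{i-1}$ with the sample before the first one taken as zero (that initial condition is an assumption, not something the text states). The helper names `pre_emphasize` and `gain_db` are illustrative only. The second function evaluates the filter gain on the unit circle so the high-frequency boost can be checked numerically.

```python
import cmath
import math

def pre_emphasize(samples, a=0.97):
    """Apply s'[i] = s[i] - a * s[i-1]; the sample before the first is assumed 0."""
    out = []
    prev = 0.0
    for s in samples:
        out.append(s - a * prev)
        prev = s
    return out

def gain_db(omega, a=0.97):
    """Magnitude of H(z) = 1 - a z^-1 at z = exp(j*omega), in dB."""
    h = 1 - a * cmath.exp(-1j * omega)
    return 20 * math.log10(abs(h))

# Difference in gain between half the sampling frequency and DC.
boost_db = gain_db(math.pi) - gain_db(0.0)
```

A constant input is almost cancelled (each output sample is $1 - a$ times the input), while a signal alternating in sign at half the sampling frequency is nearly doubled.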
1.1.2 Frame Blocking
Because speech is a slowly time-varying signal, a speech recognition system segments the speech into intervals ...
$$ham(l) = 0.54 - 0.46 \cos\left(\frac{2\pi (l - 1)}{L - 1}\right), \quad l = 1, 2, \ldots, L, \quad L = 160 \tag{3}$$
Figure 1.1-4 160-point Hamming window.
Figure 1.1-5 Windowing in speech analysis (successive frames of 160 samples, each multiplied by the window).
Figure 1.1-5 illustrates how a Hamming window is applied to the speech signal in speech analysis. After the speech has been divided into frames of 160 samples with 50% overlap, a 160-point Hamming window is multiplied with each frame, sample by sample. The windowed frames are continuous at the first and last points of each frame. This step is described by equation (4), where $f_{n,l}$ is sample $l$ of the n-th pre-emphasized frame, $ham(l)$ is the Hamming window, and $wf_{n,l}$ is sample $l$ of the n-th frame after windowing.
$$wf_{n,l} = f_{n,l} \cdot ham(l), \quad l = 1, 2, \ldots, 160 \tag{4}$$
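Frame blocking and windowing can be sketched as below. This is an illustration under stated assumptions: the window uses the standard Hamming coefficients 0.54 and 0.46 with 0-based indexing, incomplete trailing frames are dropped, and the names `hamming` and `windowed_frames` are hypothetical.

```python
import math

def hamming(length):
    """Standard Hamming window, l = 0 .. length-1 (0-based indexing assumed)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * l / (length - 1))
            for l in range(length)]

def windowed_frames(samples, frame_len=160, hop=80):
    """Split into frames with 50% overlap and multiply each by the window."""
    window = hamming(frame_len)
    frames = []
    # Only complete frames are kept; a trailing partial frame is discarded.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([x * w for x, w in zip(frame, window)])
    return frames

# 400 samples give frames starting at 0, 80, 160 and 240.
frames = windowed_frames([1.0] * 400)
```

With a hop of 80 samples, every sample except those in the first and last half-frame appears in exactly two frames.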
1.1.4 FFT
The fast Fourier transform (FFT) is used to compute the spectrum of the speech signal. It is an efficient implementation of the discrete Fourier transform (DFT), with the constraint that the spectrum is evaluated at discrete frequencies that are integer multiples of $f_s / N$ (mutually orthogonal frequencies), where $f_s$ is the sampling frequency and $N$ is the DFT length. The FFT requires computation proportional to $N \log N$, whereas direct evaluation of the DFT requires computation proportional to $N^2$.
The frequency resolution of the DFT is limited by two factors: the length of the signal and the length of the DFT [14]. If a signal is the sum of two sinusoids whose frequencies are very close, a long segment of the signal must be observed to distinguish them. As for the DFT length, the spectrum produced by an N-point DFT contains N/2 equally spaced points between 0 and half the sampling frequency. To separate two closely spaced frequencies, the spacing between these points must therefore be smaller than the distance between the two peaks. Since the windowed frames are 160 points long, the DFT length is set to 256 points, which gives good frequency resolution at a computational cost that is acceptable in a practical implementation. After the 256-point FFT, only the magnitudes (obtained with a square root) of the first 128 points are used in the following steps, because of the symmetry of the transform.
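The zero-padded transform described above can be sketched with a small recursive radix-2 FFT. This is a teaching sketch rather than an optimized routine; the function names are illustrative, and the test signal (a 1000 Hz cosine sampled at 8 kHz) is an assumption chosen so that the peak falls exactly on a bin.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def magnitude_spectrum(frame, nfft=256):
    """Zero-pad the frame to nfft points and keep the first nfft/2 magnitudes."""
    padded = list(frame) + [0.0] * (nfft - len(frame))
    return [abs(c) for c in fft(padded)[: nfft // 2]]

# 160 samples of a 1000 Hz cosine at fs = 8000 Hz; bin spacing is 8000/256 Hz.
frame = [math.cos(2 * math.pi * 1000 * l / 8000) for l in range(160)]
mags = magnitude_spectrum(frame)
peak_bin = mags.index(max(mags))
```

With a bin spacing of 31.25 Hz, a 1000 Hz tone lands on bin 32, which is where `peak_bin` ends up.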
1.1.5 Mel Frequency Bank
A filter bank is used to model the initial transduction stages of the human auditory system, for two reasons. First, the position of maximum displacement along the basilar membrane of the ear is proportional to the logarithm of the frequency of the stimulating sound. Second, the frequencies of a complex sound that fall within a certain band around a nominal frequency cannot be individually resolved.
Because the human auditory system does not respond linearly to the frequency of a sound, the Mel scale is used to map perceived frequency onto a linear scale. This frequency scale is defined by equation (5) and illustrated in Figure 1.1-6. It is approximately linear from 0 to 1000 Hz and approximately logarithmic above 1000 Hz.
$$\mathrm{Mel}(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \tag{5}$$
Figure 1.1-6 The Mel frequency scale
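As a small illustration of the Mel mapping, the sketch below assumes the widely used form $\mathrm{Mel}(f) = 2595 \log_{10}(1 + f/700)$; the source equation did not survive extraction, so this exact form is an assumption. The function names are illustrative.

```python
import math

def hz_to_mel(f_hz):
    """Map a frequency in Hz to Mel (assumed form: 2595 * log10(1 + f/700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

mel_at_1k = hz_to_mel(1000.0)
mel_at_4k = hz_to_mel(4000.0)
```

Under this form 1000 Hz maps to roughly 1000 Mel, while the top of the 4 kHz band maps to a little over 2100 Mel, which shows the compression of the higher frequencies.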
The conventional Mel-scale filter bank used in speech recognition consists of a number of triangular bandpass filters distributed across the signal bandwidth. They are equally spaced on the Mel scale, and their bandwidths are designed so that the 3 dB point lies midway between two adjacent filters. Figures 1.1-7(a) and 1.1-7(b) show these filters on the Mel scale and on the ordinary frequency scale, respectively.
Figure 1.1-7 A Mel filter bank, on the Mel scale (a) and on the ordinary frequency scale (b)
The number of filters is one of the parameters that affect the recognition accuracy of the system.
The k-th power coefficient of the n-th frame, $S_{n,k}$, is computed by equation (6), where $S_n(j)$ is the j-th spectral point of the n-th frame and $FC_k(j)$ is the j-th coefficient of the k-th filter.
$$S_{n,k} = \sum_{j=1}^{128} FC_k(j)\, S_n(j), \quad k = 1, 2, \ldots, K \tag{6}$$
Here $K$ is the number of filters.
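A triangular Mel filter bank and its weighted sums can be sketched as follows. Several details here are assumptions rather than the author's design: the Mel formula $2595 \log_{10}(1 + f/700)$, triangles whose edges sit at the neighbouring centre frequencies, and 27 filters (a count suggested by the 12 × 27 DCT multiplications quoted in section 1.2.10). All function names are hypothetical.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_filter_bank(num_filters=27, nfft=256, fs=8000.0):
    """Rows of nfft//2 weights; triangle centres are equally spaced in Mel."""
    top = hz_to_mel(fs / 2)
    edges = [mel_to_hz(top * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bank = []
    for k in range(1, num_filters + 1):
        lo, centre, hi = edges[k - 1], edges[k], edges[k + 1]
        row = []
        for j in range(nfft // 2):
            f = j * fs / nfft
            if lo < f <= centre:
                row.append((f - lo) / (centre - lo))   # rising slope
            elif centre < f < hi:
                row.append((hi - f) / (hi - centre))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def filter_bank_outputs(spectrum, bank):
    """One weighted sum per filter: sum over j of weight[j] * spectrum[j]."""
    return [sum(w * s for w, s in zip(row, spectrum)) for row in bank]

bank = triangular_filter_bank()
outputs = filter_bank_outputs([1.0] * 128, bank)
```

Each output requires one multiplication per non-zero weight, which is the cost the modified algorithm in section 1.2.5 sets out to remove.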
1.1.6 Cepstral Analysis
The speech signal $s$ can be described as the convolution of an excitation signal with the impulse response of the vocal tract. It can be separated into these two parts through the equations below, where $g$ is the excitation signal and $v$ is the impulse response of the vocal tract [15].
$$s(t) = g(t) * v(t) \tag{7}$$
$$S(\omega) = G(\omega)\, V(\omega) \tag{8}$$
$$\log|S(\omega)| = \log|G(\omega)| + \log|V(\omega)| \tag{9}$$
Equation (7) expresses the relationship between $g$ and $v$ in the time domain, and equation (8) expresses it in the frequency domain. Taking the logarithm of both sides gives equation (9), in which the excitation and the vocal tract response are separated into a sum. The vocal tract response determines the spectral envelope, while the spectrum of the excitation represents the fine spectral detail of the speech. For speech recognition the spectral envelope is more useful than the fine detail, so the inverse Fourier transform can be used to find the envelope.
The cepstrum is defined as the inverse Fourier transform of the logarithm of the power coefficients. It can be simplified to a discrete cosine transform (DCT).
$$C_{n,p} = \sum_{k=1}^{K} \log(S_{n,k}) \cos\left[p\,(k - 0.5)\,\frac{\pi}{K}\right] \tag{10}$$
where $p$ is the order (index) of the cepstral coefficient. The coefficient $C_0$ of each frame is normally not used in the analysis because it is unreliable. The low-order cepstral coefficients reflect the vocal tract information in the speech signal. Spectral analysis for speech recognition typically uses only the 8 to 16 lowest-order cepstral coefficients, and most applications use 12.
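The DCT form of the cepstrum can be sketched as below, assuming the expression $C_p = \sum_{k=1}^{K} \log(S_k) \cos[p(k - 0.5)\pi/K]$ that appears again in section 1.2.7. The natural logarithm is an assumption (the text does not state the base), and the function name is illustrative.

```python
import math

def cepstral_coefficients(power_coeffs, num_ceps=12):
    """C_p = sum_k log(S_k) * cos(p * (k - 0.5) * pi / K) for p = 1..num_ceps."""
    K = len(power_coeffs)
    logs = [math.log(s) for s in power_coeffs]   # natural log assumed
    return [
        sum(logs[k - 1] * math.cos(p * (k - 0.5) * math.pi / K)
            for k in range(1, K + 1))
        for p in range(1, num_ceps + 1)
    ]

# A perfectly flat set of power coefficients carries no envelope shape.
flat = cepstral_coefficients([5.0] * 27)
```

For a flat input every coefficient with $p \ge 1$ is zero up to rounding, because the cosine terms cancel; only the excluded $C_0$ would carry the overall level.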
1.1.7 Energy Calculation
The energy of each frame is also a component of the MFCC feature vector. It is computed as the logarithm of the signal energy. For the n-th frame, which has 160 samples $s_{n,l}$, $l = 1, 2, \ldots, 160$:
$$E_n = \log \sum_{l=1}^{160} s_{n,l}^2 \tag{11}$$
In software speech recognition systems this energy is computed separately, before pre-emphasis and windowing.
1.1.8 Delta Coefficient
The performance of a speech recognition system can be improved considerably by adding time derivatives to the basic parameters. In digital signal processing, the first-order time derivative can be approximated by
$$\frac{dC_n}{dt} \approx C_n - C_{n-1} \tag{12}$$
$$\frac{dC_n}{dt} \approx C_{n+1} - C_n \tag{13}$$
Equation (12) is called the backward difference and equation (13) the forward difference. The delta coefficients can therefore be computed with the regression formula below, where $d_n$ is the delta coefficient vector of the n-th frame. To compute $d_n$, the static coefficient vectors from $C_{n-2}$ to $C_{n+2}$ are used, where $C_n$ is the vector containing the log energy and the 12 cepstral coefficients of the n-th frame.
$$d_n = \frac{2\,(C_{n+2} - C_{n-2}) + (C_{n+1} - C_{n-1})}{10} \tag{14}$$
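The regression over five frames can be sketched as follows, assuming the form $d_n = [2(C_{n+2} - C_{n-2}) + (C_{n+1} - C_{n-1})]/10$, which is consistent with the factor of 10 removed in section 1.2.9. The function name and the toy input are illustrative.

```python
def delta_vectors(static):
    """Delta vectors for every frame that has two neighbours on each side.

    static is a list of equal-length coefficient vectors; the result has
    len(static) - 4 vectors.
    """
    deltas = []
    for n in range(2, len(static) - 2):
        deltas.append([
            (2 * (static[n + 2][j] - static[n - 2][j])
             + (static[n + 1][j] - static[n - 1][j])) / 10
            for j in range(len(static[n]))
        ])
    return deltas

# Two coefficients that grow linearly with slopes 1 and 2 per frame.
static = [[float(n), 2.0 * n] for n in range(7)]
deltas = delta_vectors(static)
```

For linearly growing coefficients the regression returns the slope exactly, and 7 input frames give 3 delta vectors, matching the $n - 4$ count in the conclusion below.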
1.1.9 Conclusion
After the process described above, a 160-sample frame is converted into a vector of 26 elements: 1 energy coefficient, 12 cepstral coefficients, and their first-order time derivatives. $n$ frames yield $n - 4$ feature vectors, because each delta coefficient needs static information from frames $n - 2$ to $n + 2$. These feature vectors are used in training and recognition.
1.2 Modified MFCC feature extraction algorithm for hardware implementation
MFCC feature extraction requires a large number of operations, and most of the computing power is consumed by the Fourier transform. In this chapter we introduce a modified MFCC feature extraction algorithm. With the proposed method the computational load is halved. The block diagram of the new algorithm is shown in Figure 1.2-1. The main differences between the conventional and the modified algorithm are highlighted by dashed blocks.
Figure 1.2-1 Block diagram of the modified MFCC feature extraction algorithm. Blocks shown: Speech samples, Pre-emphasis, Sub-Frame, Windowing, |FFT|, Mel Frequency Filter bank, Overlap, Cepstrum, Logged energy, Delta, MFCC.
1.2.1 Pre-emphasis
The pre-emphasis filter is the same as the one used in the conventional algorithm described above. In speech recognition software the coefficient $a$ is set to 0.97, but for convenience of hardware implementation we use $a = 31/32$. Figure 1.2-2 shows that the frequency responses of the two filters differ only slightly, which is confirmed by the experimental results presented later.
The advantage of the filter coefficient $a = 31/32$ is explained by equation (15). In binary arithmetic, $\frac{1}{32} s_{i-1}$ is computed by shifting $s_{i-1}$ right by 5 bits. With this property the multiplication reduces to a shift and a subtraction, so both computation time and chip area are reduced.
$$\begin{aligned} s'_i &= s_i - a\, s_{i-1}, \quad a = \tfrac{31}{32} \\ &= s_i - \tfrac{31}{32}\, s_{i-1} = s_i - s_{i-1} + \tfrac{1}{32}\, s_{i-1} \end{aligned} \tag{15}$$
Figure 1.2-2 Frequency response of the pre-emphasis filters
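The multiplier-free form can be sketched on integer samples as below. The sketch assumes $s'_i = s_i - s_{i-1} + (s_{i-1} \gg 5)$ with a zero sample before the first one; the function name is illustrative. Python's `>>` on negative integers is an arithmetic shift (it rounds toward minus infinity), which is an assumption about how the hardware shifter would behave.

```python
def pre_emphasize_shift(samples):
    """Integer pre-emphasis with a = 31/32 using only subtract, add and shift."""
    out = []
    prev = 0  # assumed initial condition
    for s in samples:
        out.append(s - prev + (prev >> 5))  # prev >> 5 approximates prev / 32
        prev = s
    return out

result = pre_emphasize_shift([64, 64, -64])
```

For samples that are multiples of 32 the result is exact, for example $64 - \frac{31}{32} \cdot 64 = 2$; otherwise the shift drops the fractional part, an error of less than one least significant bit.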
1.2.2 Sub-Frame Blocking
This step introduces a new term, the sub-frame. A conventional frame consists of several sub-frames with no overlap between adjacent sub-frames, so the overlap between conventional frames can be viewed as reuse of the same sub-frame. As discussed in section 1.1.2, the pre-emphasized speech is divided into frames of 20 ms with 50% overlap. We propose dividing the speech signal into sub-frames of 10 ms instead, so that two consecutive sub-frames form one conventional frame. With a sampling frequency of 8 kHz, each sub-frame is now only 80 points long ($8000 \times 0.01 = 80$).
Figure 1.2-3 shows that 3 sub-frames form 2 frames, so $n$ sub-frames contain $n - 1$ frames.
Figure 1.2-3 Sub-frames and frames. Panels (a) and (b) show the frames $f_n$, $f_{n+1}$, $f_{n+2}$ (160 samples each) and the sub-frames $sf_n$, $sf_{n+1}$, $sf_{n+2}$, $sf_{n+3}$ (80 samples each).
1.2.3 Windowing
Once the speech signal has been divided into sub-frames of 80 points, the new algorithm applies an 80-point Hamming window to each sub-frame to reduce the edge effects of each segment. As mentioned in section 1.1.3, the main lobe and the side lobes of the window function affect the spectral analysis of speech signals. When the window size is halved, the main-lobe width doubles accordingly. Figure 1.2-4 shows the frequency responses of the 160-point and 80-point Hamming windows. Figure 1.2-5 illustrates the windowing scheme of the proposed algorithm.
Figure 1.2-4 Frequency responses of the different Hamming windows.
Figure 1.2-5 Windowing scheme of the modified algorithm.
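The doubling of the main-lobe width can be checked numerically. The sketch below uses the half-power (3 dB) width as a proxy for main-lobe width, which is an assumption made for ease of measurement, and evaluates the window's frequency response directly on a fine grid. The function names and the grid size are illustrative.

```python
import cmath
import math

def hamming(length):
    return [0.54 - 0.46 * math.cos(2 * math.pi * l / (length - 1))
            for l in range(length)]

def half_power_width(window, grid=16384):
    """Number of grid steps (pi/grid rad each) from DC until the response
    magnitude first drops more than 3 dB below its value at DC."""
    peak = abs(sum(window))
    count = 0
    for i in range(grid):
        omega = math.pi * i / grid
        value = abs(sum(w * cmath.exp(-1j * omega * l)
                        for l, w in enumerate(window)))
        if value < peak / math.sqrt(2):
            break
        count += 1
    return count

width_160 = half_power_width(hamming(160))
width_80 = half_power_width(hamming(80))
ratio = width_80 / width_160
```

The ratio comes out close to 2, in line with the statement that halving the window length doubles the main-lobe width.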
1.2.4 FFT
This step computes the spectrum of each sub-frame. Figure 1.2-6 shows a windowed 80-point sub-frame and its spectra obtained with a 256-point and a 128-point FFT, respectively. The frequency resolution of the 256-point FFT spectrum is only slightly better than that of the 128-point spectrum, but it requires nearly twice the computation of the 128-point FFT, because the computational load is proportional to $N \log N$, where $N$ is the number of FFT points.
$$\frac{N \log_2 N}{\frac{N}{2} \log_2 \frac{N}{2}} \approx 2 \tag{16}$$
Figure 1.2-6 Spectra of the signal obtained with the different FFTs
For these reasons we use the 128-point FFT in the new algorithm. Moreover, because of the symmetry of the spectrum, only the first 64 points are used in the following steps.
After the FFT, the magnitudes of the first 64 complex points are computed with an estimation algorithm. This algorithm computes the magnitude of a complex number very quickly and with nearly the same result as the exact square-root computation. For a complex number $I + jQ$, the magnitude is estimated as follows:
$$|M| \approx \alpha \max(|I|, |Q|) + \beta \min(|I|, |Q|) \tag{17}$$
Taking absolute values restricts the complex number to the range $0^\circ$ to $90^\circ$, and the max and min operations then restrict it to the range $0^\circ$ to $45^\circ$. Within this range, a linear combination of $I$ and $Q$ gives a good approximation of the magnitude. In this system we use $\alpha = 1$ and $\beta = 1/4$. This approximation reduces the computational load with an acceptable error.
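The estimator can be sketched as below with the coefficients 1 and 1/4 given in the text. The function name is illustrative, and the sketch works on floats for clarity; in hardware the factor 1/4 would presumably be a 2-bit right shift, which is an assumption. The sweep at the end measures the worst relative error over all angles for a unit-magnitude input.

```python
import math

def estimate_magnitude(i, q):
    """Approximate sqrt(i*i + q*q) as max(|i|, |q|) + min(|i|, |q|) / 4."""
    big = max(abs(i), abs(q))
    small = min(abs(i), abs(q))
    return big + 0.25 * small

# Worst relative error over a full circle of unit-magnitude inputs.
worst = max(
    abs(estimate_magnitude(math.cos(math.radians(d)),
                           math.sin(math.radians(d))) - 1.0)
    for d in range(360)
)
```

The largest deviation occurs at 45 degrees, where the estimate is $1.25/\sqrt{2} \approx 0.884$ of the true magnitude, so the error stays below 12 percent.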
1.2.5 Mel Frequency Filter Bank
We use the same approach as discussed in section 1.1.5, but the new algorithm uses rectangular filters instead of triangular ones. According to earlier studies, a filter bank is suitable for speech recognition if the combined frequency response of its component filters is flat over the whole frequency range of interest. A rectangular filter bank satisfies this requirement better than the conventional triangular filter bank. Figures 1.2-7 and 1.2-8 illustrate the frequency responses of the triangular and the rectangular filter bank, respectively.
Figure 1.2-7 Conventional triangular filter bank (a) and the combined frequency response of the filters (b).
Figure 1.2-8 Proposed rectangular filter bank (a) and the combined frequency response of the filters (b).
(All rectangular filters have an amplitude of 1; they are drawn with different amplitudes in the figure above only to make them easier to tell apart.)
In the conventional method, the FFT output is multiplied by the coefficients of the triangular filters to produce the filter output values. All output values of a filter are then summed to give the power coefficient of that filter. If rectangular filters replace the triangular ones, however, the multiply-and-add operation reduces to adding or not adding, because each coefficient of a rectangular filter is either 1 or 0. The new algorithm therefore requires no multiplication in this step.
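A rectangular filter bank reduces to assigning each FFT bin to one band and summing. In the sketch below the band edges are equal-width on the Mel scale and non-overlapping, and the count of 23 filters is taken from the 12 × 23 DCT multiplications quoted in section 1.2.10; both choices are assumptions, since the actual band edges are only shown in a figure. The function names are hypothetical.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def rectangular_bands(num_filters=23, nfft=128, fs=8000.0):
    """Assign each of the nfft//2 bins to exactly one Mel-spaced band."""
    top = hz_to_mel(fs / 2)
    bands = [[] for _ in range(num_filters)]
    for j in range(nfft // 2):
        mel = hz_to_mel(j * fs / nfft)
        k = min(int(num_filters * mel / top), num_filters - 1)
        bands[k].append(j)
    return bands

def band_sums(spectrum, bands):
    """Filter outputs with 0/1 weights: additions only, no multiplications."""
    return [sum(spectrum[j] for j in bins) for bins in bands]

bands = rectangular_bands()
sums = band_sums([1.0] * 64, bands)
```

Because every bin belongs to exactly one band, the combined response of the bank is flat across the whole band, which is the property the text asks for.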
1.2.6 Overlapping
The overlapping process in the new algorithm is illustrated in Figure 1.2-9. Here $f_n$ and $f_{n+1}$ are the n-th and (n+1)-th conventional frames (160 points each) with 50% overlap, and $sf_n$ and $sf_{n+1}$ are the n-th and (n+1)-th sub-frames of the new algorithm (80 points each). $FS_{n,k}$ is the power coefficient produced by the k-th filter for the n-th sub-frame. The filter bank outputs $FS_{n,k}$ and $FS_{n+1,k}$ are added to produce the power coefficient $S_{n,k}$, and the power coefficient $S_{n+1,k}$ is obtained by summing the outputs $FS_{n+1,k}$ and $FS_{n+2,k}$. In other words, $S_{n,k} = FS_{n,k} + FS_{n+1,k}$ and $S_{n+1,k} = FS_{n+1,k} + FS_{n+2,k}$. By adding the current filter bank output to the previous one, we reproduce the 50% overlap of the conventional algorithm.
Figure 1.2-9 Overlapping process in the proposed algorithm
A conventional frame $f_n$ consists of the two sub-frames $sf_n$ and $sf_{n+1}$, and its energy equals the sum of the energies of these two sub-frames, either over the whole bandwidth (0 to 4 kHz) or within a particular band (the k-th filter of the filter bank). In Figure 1.2-9, $S_{n,k}$ is therefore the k-th power coefficient of the conventional frame $f_n$, and $S_{n+1,k}$ is the k-th power coefficient of the conventional frame $f_{n+1}$.
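The overlap step is a sum of adjacent sub-frame outputs, which can be sketched as follows. The function name and the small numeric input are illustrative.

```python
def overlap_power(sub_frame_outputs):
    """S[n][k] = FS[n][k] + FS[n+1][k]; n sub-frames give n - 1 frames."""
    return [
        [a + b for a, b in zip(current, following)]
        for current, following in zip(sub_frame_outputs, sub_frame_outputs[1:])
    ]

# Three sub-frames, two filters each.
fs_outputs = [[1, 2], [3, 4], [5, 6]]
frame_power = overlap_power(fs_outputs)
```

Each sub-frame output is computed once and used in two frames, so the 50% overlap costs one addition per filter instead of a second FFT.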
1.2.7 Cepstral Analysis
The cepstral coefficients are computed in the same way as in the conventional algorithm:
$$C_{n,p} = \sum_{k=1}^{K} \log(S_{n,k}) \cos\left[p\,(k - 0.5)\,\frac{\pi}{K}\right], \quad p = 1, 2, \ldots, P \tag{18}$$
Computing a logarithm is expensive in both software and hardware. Because the value of $S_{n,k}$ cannot be known in advance and the logarithm must be computed quickly and approximately, we approximate it with Mitchell's algorithm [19]. Let $N$ be the number given by:
$$N = \sum_{i=0}^{k} z_i 2^i \tag{19}$$
where each $z_i$ is 0 or 1. Suppose $z_k = 1$ is the most significant bit (MSB) of $N$. Then $N$ can be rewritten as:
$$N = 2^k + \sum_{i=0}^{k-1} z_i 2^i \tag{20}$$
Factoring out $2^k$ gives:
$$N = 2^k \left(1 + \sum_{i=0}^{k-1} z_i 2^{i-k}\right) \tag{21}$$
Taking $\log_2$ of both sides gives:
$$\log_2 N = k + \log_2\left(1 + \sum_{i=0}^{k-1} z_i 2^{i-k}\right) \tag{22}$$
Since $k > i$, the sum $\sum_{i=0}^{k-1} z_i 2^{i-k}$ lies between 0 and 1; we denote its value by $m$. Equation (22) can then be rewritten as:
$$\log_2 N = k + \log_2(1 + m), \quad 0 \le m < 1 \tag{23}$$
Mitchell approximates the true value of $\log_2(1 + m)$ by the straight line $am + b$. For simplicity, Mitchell uses $a = 1$ and $b = 0$ in this linear approximation:
$$\log_2 N \approx k + m \tag{24}$$
For example, to approximate $\log_2 39$ we set $N = 39$. In binary, $N = 100111_2$, so $k = 5$ and $m = 0.00111_2$. Writing $N = 2^5 (1 + 0.00111_2) = 39$ and applying equation (24) gives $\log_2 39 \approx 5 + 0.00111_2 = 101.00111_2 = 5.21875$, whereas the true value is $\log_2 39 = 5.2854$.
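Mitchell's approximation can be sketched for positive integers as below. The function name is illustrative; `int.bit_length` is used to locate the most significant bit, which stands in for the priority encoder a hardware version would presumably use (an assumption). The last line scans a range of inputs to measure the largest gap to the exact logarithm.

```python
import math

def mitchell_log2(n):
    """Approximate log2(n) for a positive integer n as k + m."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    k = n.bit_length() - 1            # position of the most significant 1 bit
    m = (n - (1 << k)) / (1 << k)     # remaining bits read as a fraction in [0, 1)
    return k + m

approx_39 = mitchell_log2(39)

# Largest difference between the exact and the approximate value for 1..4096.
worst = max(math.log2(n) - mitchell_log2(n) for n in range(1, 4097))
```

For 39 the function returns 5.21875, the value worked out in the example above, and it is exact for powers of two, where $m = 0$.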
The error of the approximation is:
$$error = \log_2(1 + m) - m \tag{25}$$
The maximum error is 0.086071, reached at $m = 0.442695$, which is found by setting the first derivative with respect to $m$ to zero:
$$\frac{d(error)}{dm} = \frac{1}{(1 + m)\ln 2} - 1 = 0 \;\Rightarrow\; m = \frac{1}{\ln 2} - 1 \tag{26}$$
1.2.8 Energy Calculation
The energy of each frame is computed as the logarithm of the total energy of two consecutive sub-frames (each 80 points long). It is given by equation (27), where $sf_{n,l}$ is the l-th sample of the n-th sub-frame.
$$E_n = \log\left(\sum_{l=1}^{80} sf_{n,l}^2 + \sum_{l=1}^{80} sf_{n+1,l}^2\right) \tag{27}$$
As in the conventional algorithm, the energy in the new algorithm is computed separately, before pre-emphasis and windowing. The logarithm is approximated as discussed in section 1.2.7.
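The frame energy built from sub-frame energies can be sketched as follows. For clarity the sketch uses the exact natural logarithm from the standard library rather than an approximation, which is a deliberate substitution; the base of the logarithm and the function names are assumptions.

```python
import math

def sub_frame_energy(sub_frame):
    """Sum of squared samples of one sub-frame."""
    return sum(x * x for x in sub_frame)

def frame_log_energies(sub_frames):
    """E_n = log(energy of sub-frame n + energy of sub-frame n+1)."""
    energies = [sub_frame_energy(sf) for sf in sub_frames]
    return [math.log(a + b) for a, b in zip(energies, energies[1:])]

# Three constant sub-frames with energies 80, 320 and 20.
log_energies = frame_log_energies([[1.0] * 80, [2.0] * 80, [0.5] * 80])
```

Each sub-frame energy is accumulated once and reused for the two frames that share it.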
1.2.9 Delta Coefficient
When computing the delta coefficients, we multiply the expression in equation (14) by 10. All delta coefficients are scaled by 10, which does not affect recognition quality because the distribution is unchanged. Scaling the delta coefficients by 10 makes the hardware implementation much simpler, because the computation no longer contains a division.
$$d_n = 10 \cdot \frac{2\,(C_{n+2} - C_{n-2}) + (C_{n+1} - C_{n-1})}{10} = 2\,(C_{n+2} - C_{n-2}) + (C_{n+1} - C_{n-1}) \tag{28}$$
1.2.10 Advantages of the modified algorithm
A hardware multiplier consumes more area and power than the hardware for other operations. To save hardware resources, the algorithm should therefore be optimized by reducing the number of multiplications. In the conventional MFCC feature extraction algorithm presented in section 1.1, a frame of 160 points requires 160 multiplications for windowing, $128 \times \log_2 256$ multiplications for the FFT, about 256 multiplications for the power computation, and $12 \times 27$ multiplications for the DCT. In total, extracting the features of one frame takes 1764 multiplications.
With the new feature extraction algorithm proposed in section 1.2, extracting the features of one frame takes only 804 multiplications: 80 for windowing, $64 \times \log_2 128$ for the FFT, and $12 \times 23$ for the DCT.
The features extracted by the new method carry information similar to that of the conventional method, because they are likewise computed from the frame power coefficients at the output of the Mel-scale filter bank. The number of multiplications in the new algorithm, however, is halved, and most of the saving comes from the FFT step. Moreover, a multiplication in the FFT is four times as complex as a multiplication in the other blocks, because the FFT requires complex multiplications. As a result, a hardware implementation of the new algorithm uses less area and consumes less power, while its recognition accuracy is only slightly lower than that of the conventional algorithm.
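The multiplication counts quoted in this section can be reproduced with a few lines of arithmetic. The variable names are illustrative; the individual terms are the ones listed in the text.

```python
import math

# Conventional algorithm: windowing + FFT + power computation + DCT.
conventional = 160 + 128 * int(math.log2(256)) + 256 + 12 * 27

# Modified algorithm: windowing + FFT + DCT (no filter-bank multiplications).
modified = 80 + 64 * int(math.log2(128)) + 12 * 23

saving = 1 - modified / conventional
```

The totals are 1764 and 804, so the modified algorithm needs a little under half as many multiplications, before even counting the higher cost of the complex multiplications inside the FFT.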
1.2.11 Conclusion
With the new MFCC feature extraction algorithm, a frame made up of 2 sub-frames is converted into a 26-element MFCC vector consisting of 1 energy coefficient, 12 cepstral coefficients, and 13 delta coefficients (the first-order time derivatives of the energy and cepstral coefficients). $n$ sub-frames give $n - 1$ conventional frames and $n - 5$ feature vectors.