HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY
----------
GRADUATION PROJECT PROGRESS REPORT
Topic: Building a Vietnamese speech synthesis application on the Android operating system
Supervisor: Trịnh Văn Loan
Student: Phạm Bắc Anh
Class: KTMT-TT1 - K55
Student ID: 20101109


HANOI, 3/2015

Table of Contents
Chapter 1: Overview of speech synthesis
1.1 Introduction
1.2 Significance of TTS
1.3 Development of TTS worldwide
1.4 TTS in Vietnam
Chapter 2: Speech synthesis methods
2.1 General model of a TTS system
2.1.1 Text analysis
2.1.2 Pronunciation analysis
2.1.3 Generating the sound wave
2.2 Formant synthesis
2.3 Concatenative synthesis
2.3.1 Concatenating syllables
2.3.2 Concatenating phones
2.3.3 Concatenating diphones
2.4 Articulatory synthesis
2.5 Characteristics of Vietnamese
2.6 Conclusion
Chapter 3: Building the software
3.1 Tasks required to build the program
Chapter 4: Summary

Chapter 1: Overview of speech synthesis

1.1 Introduction

Speech synthesis is the artificial production of human speech. A computer system built for this purpose is called a speech synthesis system. Speech synthesis can be implemented in software or embedded in computer hardware.

Speech can be synthesized by many methods. The most common method today is concatenative synthesis, which joins shorter speech segments stored in a database. The size of that database strongly affects the quality of the result. Sometimes, to keep the database small, a bounded loss in the quality of the synthesized speech is accepted.

The quality of a speech synthesis system is judged by how closely its output resembles a real human voice and by how well listeners can understand the full meaning of the text.

A text-to-speech system (abbreviated TTS in this thesis) is a system whose input is a text and whose output is a sound wave.

1.2 Significance of TTS

This problem has many practical applications:
- Assisting people with disabilities: this is the most meaningful application of TTS. In the past, stories and books were recorded on tape for people with disabilities. The number of such recordings was small, however, because the work was done by hand and took a great deal of time. With TTS systems the work is automated and far more efficient.
- Multimedia devices: following the rapid progress and success of TTS for English, English-learning software and electronic dictionaries also use TTS systems. Video games now use this technology widely as well.
- Communications: one cause of many car accidents is drivers reading text messages while driving. With TTS, drivers can keep their attention on the road and still hear the messages they receive. In addition, before GPRS was widely available, checking email while travelling for work in an underdeveloped area was very difficult. Software that used TTS to check email over a mobile phone was developed for that situation.

At present TTS systems are not yet widely used in Vietnam. They are mainly used to read announcements at railway stations and airports, and at government offices that have queue-management systems.

1.3 Development of TTS worldwide

Artificial speech has been studied for a long time and by many scientists. The first people credited with the idea of a machine able to speak are Gerbert of Aurillac, Albertus Magnus (1198-1280) and Roger Bacon (1214-1294).

It was not until 1779, however, that the German-born Danish scientist Christian Kratzenstein successfully built a mechanical model that synthesized the five vowels /a/, /e/, /i/, /o/, /u/. This device could not yet synthesize a complete sentence.

The first device regarded as a speech synthesizer was the VODER (Voice Operating Demonstrator), presented by the American scientist Homer Dudley in New York in 1939. It could synthesize simple sentences but was extremely complicated to operate.

Over the past few decades TTS systems have advanced rapidly. Their quality keeps improving and they are used commercially. Most of these systems are for English. Systems exist for other languages such as Chinese and Spanish, but English remains the most studied language and therefore has the most mature TTS systems.

1.4 TTS in Vietnam

TTS has also been studied in Vietnam for quite some time. The two most successful programs to date are VnSpeech and VietSound.

VnSpeech is the first speech synthesis system for Vietnamese; it uses formant synthesis. It can read almost all Vietnamese syllables intelligibly, but the result is not very natural.

VietSound was developed at Ho Chi Minh City University of Technology. It uses the TD-PSOLA algorithm to synthesize single vowels and formant synthesis for consonants, vowels and simple rhymes. It also falls short of the naturalness of human speech.

Both programs share the drawback that the resulting audio sounds disjointed and unnatural.

Chapter 2: Speech synthesis methods

2.1 General model of a TTS system

A TTS system usually consists of 3 steps:
- Text analysis
- Pronunciation analysis
- Generating the sound wave

Figure 1: Model of a speech synthesis system

2.1.1 Text analysis

Text analysis converts symbols, numbers and abbreviations into fully written-out words. For example, the sentence "Phong trào sinh viên tình nguyện do TW Đoàn TNCS Hồ Chí Minh phát động được hưởng ứng của hơn 10000 sinh viên trên cả nước" must be converted to "Phong trào sinh viên tình nguyện do Trung ương Đoàn Thanh niên Cộng sản Hồ Chí Minh phát động được hưởng ứng của hơn mười nghìn sinh viên trên cả nước": the abbreviations "TW" and "TNCS" are expanded, and "10000" is written out as "mười nghìn" (ten thousand).

Good text analysis requires:
- A module that converts numbers to written words.
- A database of common abbreviations.
- A database of common formats such as dates, times, ...

Even so, ambiguous input causes many difficulties. For example, "1/2" can be read as "ngày mồng một tháng hai" (the first of February) or as "một phần hai" (one half). Likewise, for the digit string 38533580 we must decide whether it is a cardinal number ("ba mươi tám triệu năm trăm ba mươi ba ngàn năm trăm tám mươi") or a telephone number read digit by digit ("ba tám năm ba ba năm tám không"). Cases like these require determining the context of the input text.

2.1.2 Pronunciation analysis

Pronunciation analysis is in effect the preprocessing stage for speech synthesis. It therefore depends on which synthesis method will be used.

It is worth adding that Vietnamese has one major advantage: each spelling has exactly one pronunciation, unlike English, where one spelling can have several pronunciations depending on context.

If synthesis is done by concatenation, the sentence must be divided into the units available in the database. Consider synthesizing "Xin chào" by diphone concatenation. The diphones in the database are "silence - x", "x - i", "i - n", "n - silence", "silence - ch", "ch - à", "à - o" and "o - silence". The text "Xin chào" is then split into "silence - x - i - n - silence - ch - à - o - silence".

For the synthesized speech to be of good quality, prosody analysis is extremely important. Prosody consists of pitch, duration and intensity.

Pitch, that is, the frequency contour over a sentence, depends on many factors, including the sentence type (statement, question, exclamation) and the speaker (gender, emotional state). For example, a statement usually falls at the end of the sentence while a question rises. Male speakers usually speak at a lower pitch.

Duration is the time taken to pronounce a word or a phoneme.
Usually, when two consecutive syllables form a word, the pause between them is shorter than between two consecutive syllables that do not form a word. Duration is sometimes also used by the speaker to emphasize a word in the sentence.

Intensity is the loudness of the speech. At the syllable level, vowels are usually stronger than consonants. At the phrase level, the syllables at the end of an utterance may be weaker.

A TTS system should analyze pronunciation so that it is as close to real speech as possible. This is the goal of every TTS system for every language, yet no system achieves it perfectly.

2.1.3 Generating the sound wave

This is the stage that directly produces the audio signal. The quality of the synthesized speech depends heavily on it. Many methods have been proposed, such as formant synthesis and diphone concatenation. They can be divided into 4 groups:
- Rule-based synthesis: formant synthesis
- Concatenative synthesis:
  - phone concatenation
  - half-phone concatenation
  - diphone concatenation
- Model-based synthesis:
  - synthesis based on hidden Markov models (HMM)
  - synthesis based on the harmonic plus noise model (HNM)
- Articulatory synthesis

2.2 Formant synthesis

This method is also known as rule-based synthesis. It does not rely on pre-recorded human speech segments. Instead, the speech is generated from the acoustic theory of speech production. The most common approach today uses the source-filter model to produce the speech signal.

A formant is an acoustic resonance. The speech signal can be understood as the result of a voiced or unvoiced excitation source shaped by the resonances and anti-resonances of the vocal tract, and then by the radiation of the sound through the lips and nose.

The first formant synthesizer was introduced by Walter Lawrence in 1953; it used 3 formants connected in parallel. Three is also the minimum number of formants needed to produce intelligible speech. Below is a 3-formant model connected in series:

Figure 2: Three-formant cascade model

The input to this model is 12 parameters: the fundamental frequency (F0), the 3 formant frequencies and 3 formant amplitudes, the low-frequency intensity, the high-frequency intensity, ... Because there are so many parameters, control is very complicated. Even so, this method only produces speech of barely acceptable quality. The speech is still disjointed and not smooth or continuous; in other words, it is not natural.

In 1980 Dennis Klatt proposed a more complex model with 5 formants, requiring as many as 39 control parameters that are updated every 5 milliseconds. Below is the model proposed by Klatt:

Figure 3: Klatt's 5-formant model

To date this remains the best model for the method. Compared with other methods, formant synthesis stands out because it needs no stored database and synthesizes very quickly. It is therefore well suited to applications on devices such as PDAs and Pocket PCs, whose hardware is weak.

The speech quality of this method is still poor, however. The speech remains disjointed and unnatural. The method is also very hard to build: it requires a deep understanding of acoustics.

2.3 Concatenative synthesis

In these methods, speech is synthesized from smaller speech segments stored in advance in a database. For Vietnamese the units can be phones, diphones, syllables, ...

2.3.1 Concatenating syllables

This is clearly not a feasible option. Vietnamese has a very large number of syllables, which makes the database very large. Another drawback is that the joins between syllables are not smooth, because the syllables are recorded separately at different times.

2.3.2 Concatenating phones

Phone: a phoneme, that is, the smallest sound unit that makes up speech. In most languages each letter of the alphabet is usually a phone. In Vietnamese, besides the single-letter phones there are phones written as letter combinations such as th, gh, kh, gi, nh, ng, ngh, ... Silence can be treated as a special phone. Each phoneme usually has its own pronunciation, but a pronunciation does not always correspond to only one written phoneme: in Vietnamese some phonemes are pronounced the same although they are written differently (for example ng and ngh, i and y, g and gh, ...).

Vietnamese with tone marks has 95 phones, so the database only needs to contain these 95 phones. The small database is an obvious major advantage of this method.

The quality of the synthesized speech is not high, however. Consider synthesizing the sentence "Hai bạn hát hay". The word "hai" is assembled from the phones "h", "a", "i", and the word "hay" from the phones "h", "a", "y". In the database of phone pronunciations, the phones "i" and "y" are pronounced identically, yet when combined with other phones they must give two completely different pronunciations.
This approach fails the first requirement of a speech synthesis system, namely that the generated speech must be intelligible: it completely changes the meaning of the text.

2.3.3 Concatenating diphones

This method has been developed since the 1970s. It remains one of the most effective methods and is widely used for many languages.

Diphone: a diphone runs from the midpoint of one phone to the midpoint of the next phone in a pair of adjacent phones. A word can have one, two or more diphones. For example, the word "ba" has only one diphone, "b - a", whereas the word "ban" has two diphones, "b - a" and "a - n". A word with only one phone is treated as a diphone of that phone with silence; for example, the word "a" is treated as the diphone "a - silence".

Diphone synthesis is carried out in 4 steps:
1. List all the phones and their properties.
2. List all the ways of pairing phones into diphones. Because some phone pairs never occur, the number of diphones never exceeds the square of the number of phones (for 95 phones, at most 95 x 95 = 9025).
3. Build the pronunciation database for these diphones.
4. Concatenate the diphones. This is the most important stage of the method. The most common algorithm here is pitch-synchronous overlap-add.

2.3.3.1 Pitch marks

A pitch mark is a point where the sound wave reaches a local maximum of amplitude, one per pitch period. Below is an example image of the pitch marks in the phone /u/.

Figure 4: Pitch marks in a phone

Two diphones that are to be joined always share one phone, for example "a - b" and "b - c". What we need to do is adjust the waveform of phone b in the first diphone, in the second diphone, or in both, so that the two copies can be overlapped exactly.
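The shared-phone property described above can be illustrated with a small sketch that builds the diphone sequence for the "Xin chào" example of section 2.1.2. The label `sil` for silence and the function name `to_diphones` are assumptions made for this illustration; they are not taken from the project's code.

```python
SIL = "sil"  # hypothetical label for the silence phone


def to_diphones(words):
    """Turn a list of words (each a list of phones) into diphone pairs.

    Silence is placed before the first word, between words and after the
    last word, as in the "Xin chào" example.
    """
    phones = [SIL]
    for word in words:
        phones.extend(word)
        phones.append(SIL)
    # Each diphone pairs one phone with its right-hand neighbour.
    return list(zip(phones, phones[1:]))


diphones = to_diphones([["x", "i", "n"], ["ch", "à", "o"]])

# Adjacent diphones always share the phone at the join ("a - b", "b - c").
for left, right in zip(diphones, diphones[1:]):
    assert left[1] == right[0]

print(diphones)
```

The shared phone at each join is exactly the region whose waveform has to be adjusted before the two segments are overlapped.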

Figure 5: Joining two diphones

Two diphones can be joined with the PSOLA (pitch synchronous overlap add) algorithm.

2.3.3.2 Time-domain pitch-synchronous overlap-add (TD-PSOLA)

Several variants of PSOLA have been developed, such as TD-PSOLA (time domain pitch synchronous overlap add), MBROLA (multi-band resynthesis overlap add) and LP-PSOLA (linear prediction pitch synchronous overlap add). All of them share one idea: modify the sound wave directly, without using any parametric description of it.

TD-PSOLA was developed by the French telecommunications company France Telecom around 1990 and is based on the following idea: if $x(n)$ is periodic on $(-\infty, +\infty)$, we can create a new wave $s(n)$ from $x(n)$ whose pitch marks are moved from spacing $T_0$ to the desired spacing $T$. Each pitch period is first cut out with a window:

$s_i(n) = x(n)\, w(n - iT_0)$

The new wave is then obtained by moving each windowed segment to its new pitch mark and summing: $s(n) = \sum_i s_i(n - i(T - T_0))$.

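Here $w$ is a chosen window. In other words, the new value $s(P)$ depends only on the points of the old wave that lie inside the window around $x(P_0)$, where $P_0$ and $P$ are corresponding points of the old and new waves. It has been shown that this procedure leaves the amplitude of the wave unchanged during the transformation.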
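The transformation above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the project's implementation: the pitch period is constant (marks at multiples of `t0` samples), the window is a Hann window two periods long, and exactly one output segment is produced per input segment, so the duration scales by `t / t0`. A full TD-PSOLA would also repeat or drop segments to keep the duration. The names `hann` and `td_psola_shift` are hypothetical.

```python
import math


def hann(length):
    """Symmetric Hann window with `length` samples."""
    if length == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (length - 1))
            for k in range(length)]


def td_psola_shift(x, t0, t):
    """Move pitch marks from spacing t0 to spacing t by windowed overlap-add.

    Segment i is x(n) * w(n - i*t0); it is re-centred at i*t and summed.
    """
    n_marks = len(x) // t0
    if n_marks == 0:
        return []
    out = [0.0] * ((n_marks - 1) * t + t0 + 1)
    window = hann(2 * t0 + 1)  # spans one period on each side of the mark
    for i in range(n_marks):
        center_in, center_out = i * t0, i * t
        for k, w in enumerate(window):
            src = center_in - t0 + k
            dst = center_out - t0 + k
            if 0 <= src < len(x) and 0 <= dst < len(out):
                out[dst] += x[src] * w
    return out


# A cosine with a 20-sample period has its pitch marks at multiples of 20.
signal = [math.cos(2 * math.pi * n / 20) for n in range(200)]
lowered = td_psola_shift(signal, 20, 25)  # pitch marks now at multiples of 25
```

With `t == t0` the overlapping Hann windows sum to one, so the interior of the signal is reproduced unchanged; with `t > t0` the marks move apart and the pitch drops.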

Figure 6: The TD-PSOLA algorithm

In the figure above, the signal on the left is stretched to match the desired pitch marks. The plots on the right show the corresponding amplitude, which can be seen to remain unchanged.

2.4 Articulatory synthesis

Articulatory synthesis refers to speech synthesis techniques based on a computational model of the human vocal organs and of the articulation process that takes place in them. The first articulatory synthesizer was ASY, developed at Haskins Laboratories in the mid-1970s by Philip Rubin, Tom Baer and Paul Mermelstein. Until recent years articulatory synthesis was used only in scientific research, because very few models produced sound of sufficiently high quality or ran efficiently enough for commercial applications. One exception is the NeXT-based system developed and commercialized by Trillium Sound Research Inc. in Canada. It is a complete articulatory speech synthesizer built on a waveguide model equivalent to the human vocal tract. The model is controlled by Carré's Distinctive Region Model, which is itself based on work on formant synthesis by Gunnar Fant and others at the Stockholm Speech Technology Laboratory of the Royal Institute of Technology in Sweden. That work showed that the formants in a resonating tube can be controlled by varying eight parameters that correspond to the natural articulations of the human vocal organs. The system includes a pronunciation dictionary together with context-dependent pronunciation rules that help join the sounds and generate the articulatory parameters, and it models rhythm and intonation derived from phonetic research.

Implementing this method demands time, money and technology. It is difficult to apply in Vietnam at present.

2.5 Characteristics of Vietnamese

For the TTS problem, Vietnamese has many advantages over other languages: each spelling has exactly one pronunciation. One of the greatest difficulties of Vietnamese, however, is tone. Each vowel can carry 6 tones (ngang, sắc, huyền, ngã, hỏi, nặng), which makes synthesis harder. Text read without tones can mostly still be understood, but it can lead to misunderstandings.
However, if we can generate the sound wave for toneless Vietnamese, we can then transform that wave to obtain a wave that carries the tones.

A spoken Vietnamese syllable has 5 components: the initial (a consonant), the medial (a semivowel), the nucleus (a vowel or diphthong), the final (a consonant or semivowel) and the tone.

When different tones are applied to the same word, F0 changes as follows:
- Ngang: F0 starts at its highest value and stays there until the end of the syllable.
- Huyền: F0 starts lower than for ngang, sắc and ngã.
- Ngã: F0 starts high, drops in the middle of the syllable and rises to its highest value at the end. In most cases the F0 minimum of a ngã syllable is at or below about 2/3 of the initial F0.
- Hỏi: F0 falls gradually to about 2/3 of its initial value and then rises again.
- Sắc: F0 stays stable for about 2/3 of the syllable duration and then rises quickly.
- Nặng: F0 falls quickly, and the duration is usually only 2/3 of that of the other tones.
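The tone descriptions above can be turned into rough piecewise-linear F0 contours. The sketch below is only an illustration: the overall shapes and the 2/3 values follow the list above, while every other breakpoint (for example the 0.9 start of huyền or the 1.3 end of sắc) is an assumed number chosen to make the shape visible, not a measured value. The names `TONE_CONTOURS`, `DURATION_SCALE` and `f0_at` are hypothetical.

```python
# Breakpoints: (position in syllable from 0 to 1, F0 relative to the base F0).
# Shapes and the 2/3 values follow the text; all other numbers are assumptions.
TONE_CONTOURS = {
    "ngang": [(0.0, 1.0), (1.0, 1.0)],                        # flat
    "huyen": [(0.0, 0.9), (1.0, 0.8)],                        # starts lower
    "nga":   [(0.0, 1.0), (0.5, 2 / 3), (1.0, 1.2)],          # dip, then highest
    "hoi":   [(0.0, 0.95), (0.6, 0.95 * 2 / 3), (1.0, 0.85)], # fall, then rise
    "sac":   [(0.0, 1.0), (2 / 3, 1.0), (1.0, 1.3)],          # stable, then rise
    "nang":  [(0.0, 0.95), (1.0, 0.6)],                       # fast fall
}

# Nặng syllables last about 2/3 as long as the others (from the text).
DURATION_SCALE = {"nang": 2 / 3}


def f0_at(tone, position, base_f0):
    """F0 in Hz at a normalized position (0..1) inside the syllable."""
    points = TONE_CONTOURS[tone]
    for (t_a, r_a), (t_b, r_b) in zip(points, points[1:]):
        if t_a <= position <= t_b:
            frac = (position - t_a) / (t_b - t_a)
            return base_f0 * (r_a + frac * (r_b - r_a))
    raise ValueError("position must lie in [0, 1]")


def syllable_duration(tone, base_duration):
    """Duration of the syllable, shortened for the nặng tone."""
    return base_duration * DURATION_SCALE.get(tone, 1.0)
```

A contour like this could supply the target pitch period for TD-PSOLA: at a sample rate `fs`, the desired spacing of the pitch marks is roughly `fs / f0_at(tone, position, base_f0)` samples.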

Below are a graph describing how F0 varies for each tone, and an example from recordings of the syllable "chi" spoken with the 6 tones in the order ngang, huyền, ngã, hỏi, sắc, nặng.

Figure 7: Variation of F0 for each tone

Figure 8: Variation of F0 across the tones for the syllable "chi"

2.6 Conclusion

For building a speech synthesis system whose output is as smooth and natural as possible, diphone concatenation is the strongest option. Smoothness and naturalness are precisely the strengths of this method, and the other methods cannot match it on these criteria. It also has the advantage of being relatively simple to build. I have therefore chosen diphone concatenation to build the Vietnamese TTS system.

Given the characteristics of Vietnamese described above, I have decided to apply this method to toneless Vietnamese and to vary the fundamental frequency F0 to create the tones of the synthesized speech.

Chapter 3: Building the software

3.1 Tasks required to build the program

- Build the Vietnamese audio database
- Code the software:
  - Analyze the input text
  - Synthesize the diphones

Chapter 4: Summary

4.1 Work completed: code for analyzing the input text

The input text is split into diphones according to the following rule: each Vietnamese word (syllable) can be built from 2 diphones, one called the front diphone and the other the back diphone, marked with the character "_" to tell them apart. Example: sai = _sa + ai_, where _sa is the front diphone and ai_ is the back diphone.

Front diphones consist of a consonant followed by a vowel. Example: _cha

Back diphones consist of a group of (one or more) vowels followed by the final consonant, if any. Example: au_.

4.2 Next steps

- Record audio to create the database: record and cut out the diphones with Cool Edit Pro.
- Code the software: read the diphone wave files from the database and synthesize the diphones using the TD-PSOLA method.
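The splitting rule of section 4.1 can be sketched as follows. This is a minimal illustration, not the project's code: it assumes toneless, lower-case input, treats every occurrence of a, e, i, o, u, y as a vowel (so the onsets "gi" and "qu" are not handled specially), and the names `split_syllable` and `split_text` are hypothetical.

```python
VOWELS = set("aeiouy")  # toneless Vietnamese vowels; a simplifying assumption


def split_syllable(syllable):
    """Split a toneless syllable into (front, back) diphone labels."""
    first_vowel = next(
        (i for i, ch in enumerate(syllable) if ch in VOWELS), None
    )
    if first_vowel is None:
        raise ValueError(f"no vowel in {syllable!r}")
    # Front diphone: initial consonant(s) plus the first vowel.
    front = "_" + syllable[: first_vowel + 1]
    # Back diphone: the vowel group plus the final consonant, if any.
    back = syllable[first_vowel:] + "_"
    return front, back


def split_text(text):
    """Split a whitespace-separated, toneless text into diphone labels."""
    labels = []
    for word in text.lower().split():
        labels.extend(split_syllable(word))
    return labels


print(split_syllable("sai"))   # front and back diphone of "sai"
print(split_text("xin chao"))
```

Note that the first vowel appears in both labels, which gives the two recordings a shared region in which they can be overlapped when they are joined.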