JuliaTokyo #3 Speech Signal Processing in Julia

I Tried Speech Signal Processing in Julia for a While

Ryuichi Yamamoto (@r9y9)

2015/04/25 JuliaTokyo #3

Notebooks and other materials: https://github.com/r9y9/JuliaTokyo3

About me

• Ryuichi Yamamoto (@r9y9)

– I like speech/music signal processing and machine learning

– Computer vision (beginner)

– About 8 months of Julia experience

• Blog

– LESS IS MORE http://r9y9.github.io/

Today's talk

1. Packages that are handy for speech signal processing

2. Packages I wrote

dancasimiro/WAV.jl

https://github.com/dancasimiro/WAV.jl

Reading a WAV file

using WAV
x, fs = wavread("test16k.wav")

JuliaDSP/DSP.jl

https://github.com/JuliaDSP/DSP.jl

Spectrogram

Applying a band-pass filter
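The two DSP.jl operations named above can be sketched roughly as follows. This is a minimal sketch, not the code from the talk: the synthetic two-tone signal, the 16 kHz sampling rate, the frame length and overlap, and the 300-3400 Hz pass band are all assumptions chosen for illustration. Cutoffs are given as normalized frequencies (1 = Nyquist).

```julia
using DSP

fs = 16000                          # sampling rate in Hz (assumed)
t = (0:fs-1) ./ fs                  # one second of sample times
# Synthetic stand-in for speech: a 1 kHz tone plus a 6 kHz tone
x = sin.(2π .* 1000 .* t) .+ sin.(2π .* 6000 .* t)

# Spectrogram: 512-sample Hann-windowed frames with 256 samples of overlap
spec = spectrogram(x, 512, 256; fs=fs, window=hanning)
P = power(spec)                     # (frequency bins) x (frames) power matrix
f = freq(spec)                      # frequency axis in Hz

# 4th-order Butterworth band-pass filter, 300-3400 Hz (assumed band),
# with cutoffs normalized by the Nyquist frequency
nyquist = fs / 2
bpf = digitalfilter(Bandpass(300 / nyquist, 3400 / nyquist), Butterworth(4))
y = filt(bpf, x)                    # the 6 kHz tone is attenuated, the 1 kHz tone passes
```

For a real recording, x and fs would come from wavread instead of the synthetic signal; a multi-channel file would need one channel selected first.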

r9y9/WORLD.jl

https://github.com/r9y9/WORLD.jl

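Fundamental frequency (F0)

Spectral envelope

Aperiodicity ratio

Re-synthesis of the speech waveform

Note: the error depends on the analysis conditions and the analysis method.

What is good about WORLD

• High quality

– On par with STRAIGHT, the de facto standard in the field

• BSD license

• Fast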

Applications

Singing voice separation with Robust PCA

Spectrogram of the mixture

Low-rank matrix

Sparse matrix (singing voice)

Huang, Po-Sen, et al. "Singing-voice separation from monaural recordings using robust principal component analysis." ICASSP 2012.

https://github.com/r9y9/RobustPCA.jl
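The decomposition on this slide (mixture = low-rank part + sparse part) can be sketched with the standard library alone. This is a minimal sketch and not the code in RobustPCA.jl: the function name rpca_sketch is hypothetical, the solver is an inexact augmented Lagrange multiplier iteration (one common choice for this problem), the weight λ = 1/sqrt(max(m, n)) is the usual default, and the input is a small synthetic matrix standing in for a magnitude spectrogram.

```julia
using LinearAlgebra, Random

# Soft-thresholding (shrinkage) operator
soft(x, τ) = sign(x) * max(abs(x) - τ, zero(x))

# Hypothetical helper: split M into a low-rank L and a sparse S with M ≈ L + S
function rpca_sketch(M::AbstractMatrix{<:Real};
                     λ = 1 / sqrt(maximum(size(M))),
                     tol = 1e-7, maxiter = 500)
    X = float(M)
    normX = norm(X)                           # Frobenius norm
    σ1 = opnorm(X)                            # largest singular value
    Y = X ./ max(σ1, maximum(abs, X) / λ)     # initial Lagrange multiplier
    L = zero(X)
    S = zero(X)
    μ = 1.25 / σ1
    μmax = 1e7 * μ
    for _ in 1:maxiter
        # Low-rank update: shrink the singular values
        F = svd(X .- S .+ Y ./ μ)
        L = F.U * Diagonal(soft.(F.S, 1 / μ)) * F.Vt
        # Sparse update: shrink the entries
        S = soft.(X .- L .+ Y ./ μ, λ / μ)
        R = X .- L .- S                       # constraint residual
        Y .+= μ .* R
        μ = min(1.5μ, μmax)
        norm(R) <= tol * normX && break
    end
    return L, S
end

# Synthetic example: a rank-2 matrix plus 80 large sparse entries
rng = MersenneTwister(0)
L0 = randn(rng, 40, 2) * randn(rng, 2, 40)
S0 = zeros(40, 40)
S0[randperm(rng, length(S0))[1:80]] .= 5.0
M = L0 + S0
L, S = rpca_sketch(M)
```

In the singing voice setting described by Huang et al., M would be the magnitude spectrogram of the mixture, with the sparse part S associated with the singing voice and the low-rank part L with the accompaniment.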

Statistical voice conversion

http://r9y9.github.io/blog/2014/11/12/statistical-voice-conversion-code/

Image + spectrogram

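Impressions

• Write a wrapper when you want to make use of an existing C library

– ccall is easy: it feels simple once you get used to it (of course)

• Julia really was fast

– For example, on an algorithm involving iterative computation, at most about 1.3x the C implementation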

– https://github.com/r9y9/MelGeneralizedCepstrums.jl/blob/35feece580fb121803ed6ace7f80e6b694c9aa69/perf/mgcep.jl

• Speech signal processing is entirely doable!

• If you need a package, write it yourself!!
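As an illustration of the ccall point above, here is a minimal sketch of wrapping C functions. It does not come from any of the packages in this talk: the wrapper names c_strlen and c_copy! are hypothetical, and the wrapped functions (strlen and memcpy) are taken from the C standard library only so that the example runs without extra dependencies.

```julia
# size_t strlen(const char *s)
# Hypothetical wrapper: returns the byte length of a Julia string via C
c_strlen(s::String) = Int(ccall(:strlen, Csize_t, (Cstring,), s))

# void *memcpy(void *dest, const void *src, size_t n)
# Hypothetical wrapper: Julia arrays are passed directly and converted to pointers
function c_copy!(dst::Vector{Float64}, src::Vector{Float64})
    length(dst) == length(src) || throw(DimensionMismatch("lengths differ"))
    ccall(:memcpy, Ptr{Cvoid}, (Ptr{Cdouble}, Ptr{Cdouble}, Csize_t),
          dst, src, sizeof(src))
    return dst
end

n = c_strlen("Julia")            # 5
src = [1.0, 2.0, 3.0]
dst = c_copy!(zeros(3), src)     # dst now holds a copy of src
```

Wrapping a third-party library follows the same pattern, with the symbol given as a (function name, library name) tuple instead of a bare symbol.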

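Packages I use or wrote for speech work

• dancasimiro/WAV: reading WAV files

• JuliaDSP/DSP: window functions, spectrograms, STFT, digital filters

• r9y9/WORLD: speech analysis and synthesis framework

• r9y9/MelGeneralizedCepstrums: mel-generalized cepstrum analysis

• r9y9/SynthesisFilters: waveform synthesis from mel-generalized cepstra

• r9y9/SPTK: speech signal processing toolkit

• r9y9/RobustPCA: robust principal component analysis (applied to singing voice separation)

• r9y9/REAPER: fundamental frequency estimation

• r9y9/VoiceConversion: statistical voice conversion

Note: this list includes packages not introduced during the talk. The only official package I have written so far is WORLD. The list is ordered from most to least general-purpose (I think).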
