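Advanced Topics in Applied Mathematical Engineering, Lecture 9: The Fast Fourier Transform and Its Parallelization
June 29, 2006
Yusaku Yamamoto, Department of Computational Science and Engineering

Goals of this lecture (1)
Learn the principles and basic techniques of the FFT.
The FFT has a wide range of applications, including signal processing and the solution of partial differential equations, and many vendor-supplied libraries and freeware implementations exist.
However, the FFT comes in many variants depending on the use, such as FFTs for real data and FFTs for distributed memory, and the variant you need is not necessarily available in a library.
The aim is to understand the principles and basic techniques of the FFT well enough to adapt existing software when necessary.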
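Outline
I. Fundamentals of the FFT
II. High-performance FFT techniques
III. Parallelizing the FFT

I. Fundamentals of the FFT
Principle of the FFT; Cooley-Tukey FFT; Stockham FFT; FFT and inverse FFT; two-dimensional FFT; decomposing a one-dimensional FFT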
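1.1 Discrete Fourier transform (1)
The discrete Fourier transform (DFT) maps N complex numbers $a_0, a_1, \ldots, a_{N-1}$ to N complex numbers $c_0, c_1, \ldots, c_{N-1}$:
$$c_k = \sum_{j=0}^{N-1} a_j\,\omega_N^{jk}, \qquad k = 0, 1, \ldots, N-1, \qquad \omega_N = \exp(-2\pi i/N).$$
Evaluating the DFT directly from this definition costs $O(N^2)$ operations.
The inverse DFT recovers the input of the DFT:
$$a_j = \frac{1}{N} \sum_{k=0}^{N-1} c_k\,\omega_N^{-jk}, \qquad j = 0, 1, \ldots, N-1.$$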
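1.1 Discrete Fourier transform (2)
Meaning of the DFT
Divide the interval $[0, 2\pi]$ into N equal parts with sample points $x_n$, and let $f(x_n) = a_n$. Expanding f in the functions $\exp(ikx)$, $k = 0, 1, \ldots, N-1$, the expansion coefficients $c_n$ are obtained from the samples by a DFT, and the N sample values $a_n$ are obtained from the coefficients $c_n$ by an inverse DFT.
The FFT is a fast way of carrying out this computation.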
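1.2 Principle of the FFT (1)
Computing the DFT with the FFT
Let N be a power of two and split the N-point DFT by setting $N' = N/2$, $e_j = a_{2j}$, $o_j = a_{2j+1}$:
$$c_k = \sum_{j=0}^{N'-1} e_j \exp(-2\pi i jk/N') + \exp(-2\pi i k/N) \sum_{j=0}^{N'-1} o_j \exp(-2\pi i jk/N').$$
The two sums are N'-point DFTs; call them $E_k$ and $O_k$. For the upper half of the outputs, $\exp(-2\pi i (k+N/2)/N) = -\exp(-2\pi i k/N)$, so for $k = 0, 1, \ldots, N'-1$
$$c_k = E_k + \exp(-2\pi i k/N)\,O_k, \qquad c_{k+N/2} = E_k - \exp(-2\pi i k/N)\,O_k. \qquad (*)$$
An N-point DFT is therefore obtained from two N/2-point DFTs, multiplications by $\exp(-2\pi i k/N)$, and additions. Applying this splitting recursively gives the FFT (Fast Fourier Transform).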
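1.2 Principle of the FFT (2)
Operation count of the FFT
Let T(N) be the number of real floating-point operations for an N-point DFT computed by the FFT. Combining the two N/2-point DFTs takes 2N operations for the complex additions and subtractions and 3N operations for the multiplications by $\exp(-2\pi i k/N)$, 5N in total. Hence
$$T(N) = 2\,T(N/2) + 5N, \qquad T(1) = 0,$$
which gives $T(N) = 5N \log_2 N$.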
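The splitting into even- and odd-indexed halves and the recurrence for T(N) can be checked with a short sketch. It assumes the sign convention $\exp(-2\pi i/N)$ and a power-of-two length; the function names `dft`, `fft` and `flops` are illustrative and not part of the lecture.

```python
import cmath

def dft(a):
    """Direct O(N^2) DFT: c_k = sum_j a_j * exp(-2*pi*i*j*k/N)."""
    n = len(a)
    return [sum(a[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def fft(a):
    """Recursive radix-2 FFT; len(a) is assumed to be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    e = fft(a[0::2])  # N/2-point DFT of e_j = a_{2j}
    o = fft(a[1::2])  # N/2-point DFT of o_j = a_{2j+1}
    c = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * o[k]
        c[k] = e[k] + t            # lower half of the outputs
        c[k + n // 2] = e[k] - t   # upper half uses the sign flip of the twiddle
    return c

def flops(n):
    """Operation count from T(N) = 2 T(N/2) + 5N with T(1) = 0."""
    return 0 if n == 1 else 2 * flops(n // 2) + 5 * n

a = [complex(j % 3, -j) for j in range(8)]
err = max(abs(x - y) for x, y in zip(fft(a), dft(a)))
print(err < 1e-9, flops(8), flops(1024))
```

For N = 8 and N = 1024 the recurrence reproduces $5N \log_2 N$, that is 120 and 51200.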
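1.3 Cooley-Tukey FFT (1)
Formula (*) expresses the outputs $c_0, c_1, \ldots, c_{N-1}$ of an N-point DFT in terms of two N/2-point DFTs.

1.3 Cooley-Tukey FFT (2)
Applying (*) recursively to each N/2-point DFT gives the Cooley-Tukey FFT (Cooley & Tukey, 1965).
[Figure: data flow of the Cooley-Tukey FFT for N = 8, stages 0 to 3; inputs in the order a0, a4, a2, a6, a1, a5, a3, a7, outputs c0, c1, ..., c7.]

1.3 Cooley-Tukey FFT (3)
Features of the Cooley-Tukey FFT: the results of stage L+1 can be written over the data of stage L, so no work array is needed (in-place FFT).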
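1.3 Cooley-Tukey FFT (4)
Bit reversal
In the Cooley-Tukey FFT the input $\{a_j\}$ must first be reordered. Write the index j of $a_j$ in binary as $j_{p-1} \cdots j_1 j_0$, where $p = \log_2 N$, and let $i_{p-1} \cdots i_1 i_0$ be the binary representation of the position at which $a_j$ is stored. Then
$$i_{p-1} = j_0, \quad i_{p-2} = j_1, \quad \ldots, \quad i_0 = j_{p-1},$$
that is, $\{a_j\}$ is stored in bit-reversed order.
[Figure: Cooley-Tukey FFT data flow for N = 8 annotated with the bit-reversal rule: $j_0 = 0$ goes to $i_{p-1} = 0$ and $j_0 = 1$ to $i_{p-1} = 1$; $j_1 = 0$ goes to $i_{p-2} = 0$ and $j_1 = 1$ to $i_{p-2} = 1$; finally $j_{p-1} = 0$ goes to $i_0 = 0$ and $j_{p-1} = 1$ to $i_0 = 1$.]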
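The bit-reversal rule and the in-place property can be illustrated as follows. This is a minimal sketch assuming the sign convention $\exp(-2\pi i/N)$ and a power-of-two length; `bit_reverse` and `fft_inplace` are illustrative names.

```python
import cmath

def bit_reverse(j, p):
    """Reverse the p-bit binary representation of j."""
    i = 0
    for _ in range(p):
        i = (i << 1) | (j & 1)
        j >>= 1
    return i

def fft_inplace(a):
    """In-place radix-2 Cooley-Tukey FFT on a list of length 2**p."""
    n = len(a)
    p = n.bit_length() - 1
    # Bit-reversal permutation: a_j is moved to position i = reverse(j).
    for j in range(n):
        i = bit_reverse(j, p)
        if i > j:
            a[i], a[j] = a[j], a[i]
    # p butterfly stages; each stage overwrites its own input.
    half = 1
    while half < n:
        w_step = cmath.exp(-1j * cmath.pi / half)  # exp(-2*pi*i / (2*half))
        for start in range(0, n, 2 * half):
            w = 1 + 0j
            for k in range(half):
                t = w * a[start + k + half]
                a[start + k + half] = a[start + k] - t
                a[start + k] = a[start + k] + t
                w *= w_step
        half *= 2
    return a

print([bit_reverse(j, 3) for j in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The printed permutation matches the input order a0, a4, a2, a6, a1, a5, a3, a7 of the N = 8 diagram.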
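1.4 Stockham FFT (1)
The Stockham FFT is a self-sorting FFT: it needs no bit-reversal permutation.
Define the array $X_L(j, k)$ with $\alpha_L = 2^L$ and $\beta_L = 2^{p-L-1}$, where $p = \log_2 N$. $X_L$ is a $2\beta_L \times \alpha_L$ array, and $X_L(j, *)$ is the $\alpha_L$-point DFT of $a_j, a_{j+2\beta_L}, \ldots, a_{j+2(\alpha_L-1)\beta_L}$.
For $L = 0$, $X_0(j, 0) = a_j$; for $L = p$, $X_p(0, k) = c_k$. Computing $X_0 \to X_1 \to X_2 \to \cdots \to X_p$ therefore turns the input $X_0(j, 0)$ into the output $X_p(0, k)$ in natural order (self-sorting FFT).

1.4 Stockham FFT (2)
Since $X_L(j, *)$ is a DFT, (*) gives the recurrence from $X_L$ to $X_{L+1}$:
$$X_{L+1}(j, k) = X_L(j, k) + X_L(j+\beta_L, k)\,\omega_N^{k\beta_L}$$
$$X_{L+1}(j, k+\alpha_L) = X_L(j, k) - X_L(j+\beta_L, k)\,\omega_N^{k\beta_L}$$
for $j = 0, 1, \ldots, \beta_L-1$ and $k = 0, 1, \ldots, \alpha_L-1$.
The Stockham FFT computes $X_0 \to X_1 \to \cdots \to X_{p-1} \to X_p$ with this recurrence.
Features of the Stockham FFT: it is self-sorting, but it is not in-place and needs a work array of size N.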
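1.4 Stockham FFT (3)
Algorithm of the Stockham FFT
The iterations of the DO 20 and DO 30 loops are independent of one another.

DO 10 L = 0, p-1
  alpha_L = 2**L
  beta_L = 2**(p-L-1)
  DO 20 k = 0, alpha_L-1
    DO 30 j = 0, beta_L-1
      X_{L+1}(j, k) = X_L(j, k) + X_L(j+beta_L, k) * omega_N**(k*beta_L)
      X_{L+1}(j, k+alpha_L) = X_L(j, k) - X_L(j+beta_L, k) * omega_N**(k*beta_L)
30  CONTINUE
20  CONTINUE
10 CONTINUE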
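The triple loop above can be turned into a runnable sketch. It assumes $\omega_N = \exp(-2\pi i/N)$, a power-of-two length, and a column-major layout in which $X_L(j, k)$ is stored at index $j + 2\beta_L k$; the layout and the name `stockham_fft` are choices made for this illustration.

```python
import cmath

def stockham_fft(a):
    """Self-sorting radix-2 Stockham FFT; len(a) is assumed to be 2**p."""
    n = len(a)
    p = n.bit_length() - 1
    x = list(a)      # X_0(j, 0) = a_j
    y = [0j] * n     # work array that receives X_{L+1}
    for L in range(p):
        alpha = 1 << L
        beta = 1 << (p - L - 1)
        for k in range(alpha):
            w = cmath.exp(-2j * cmath.pi * k * beta / n)  # omega_N^(k*beta_L)
            for j in range(beta):
                u = x[j + k * 2 * beta]             # X_L(j, k)
                v = x[j + beta + k * 2 * beta] * w  # X_L(j+beta_L, k) * twiddle
                y[j + k * beta] = u + v             # X_{L+1}(j, k)
                y[j + (k + alpha) * beta] = u - v   # X_{L+1}(j, k+alpha_L)
        x, y = y, x  # the two arrays swap roles after every stage
    return x         # X_p(0, k) = c_k, already in natural order

print(stockham_fft([1, 2, 3, 4]))
```

No bit-reversal step appears; the price is the second array `y` of size N.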
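1.4 Stockham FFT (4)
Computation in stage L+1
[Figure: computation of stage L+1, with the labels N = 128, L = 3 and N = 8. The $2\beta_L \times \alpha_L$ array $X_L$ is transformed into the $\beta_L \times 2\alpha_L$ array $X_{L+1}$: the elements $X_L(j, k)$ and $X_L(j+\beta_L, k)$ are combined into $X_{L+1}(j, k)$ and $X_{L+1}(j, k+\alpha_L)$.]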
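1.5 FFT and inverse FFT (1)
Computing the inverse FFT (I)
By the definitions in 1.1 (1), the inverse DFT is the DFT with $\omega_N = \exp(-2\pi i/N)$ replaced by its complex conjugate $\omega_N^*$, followed by division by N.
An inverse FFT can therefore be computed with a forward FFT routine:
(1) Take the complex conjugate $c_k^*$ of the input $c_k$.
(2) Apply the FFT to $c_k^*$.
(3) Take the complex conjugate of the result.
(4) Multiply by 1/N.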
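The four steps can be sketched as below. The forward routine `fft` is a plain recursive radix-2 FFT written only for this illustration, assuming the convention $\exp(-2\pi i/N)$ and a power-of-two length; any forward FFT with the same convention could take its place.

```python
import cmath

def fft(a):
    """Recursive radix-2 forward FFT (illustrative stand-in for a library routine)."""
    n = len(a)
    if n == 1:
        return list(a)
    e, o = fft(a[0::2]), fft(a[1::2])
    t = [cmath.exp(-2j * cmath.pi * k / n) * o[k] for k in range(n // 2)]
    return [e[k] + t[k] for k in range(n // 2)] + [e[k] - t[k] for k in range(n // 2)]

def ifft_via_fft(c):
    """Inverse FFT computed with the forward FFT only."""
    n = len(c)
    conj_in = [z.conjugate() for z in c]     # (1) conjugate the input
    y = fft(conj_in)                         # (2) forward FFT
    return [z.conjugate() / n for z in y]    # (3) conjugate, (4) scale by 1/N

a = [1 + 2j, -3j, 4, 0.5 - 1j]
back = ifft_via_fft(fft(a))
print(max(abs(x - y) for x, y in zip(a, back)) < 1e-12)
```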
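1.5 FFT and inverse FFT (2)
Computing the inverse FFT (II)
In the Stockham FFT, the recurrence from $X_L$ to $X_{L+1}$ can be solved for $X_L$. Running the stages in decreasing order of L then gives the inverse FFT.

Algorithm of the inverse FFT
DO 10 L = p-1, 0, -1
  alpha_L = 2**L
  beta_L = 2**(p-L-1)
  DO 20 k = 0, alpha_L-1
    DO 30 j = 0, beta_L-1
      X_L(j, k) = (X_{L+1}(j, k) + X_{L+1}(j, k+alpha_L)) / 2
      X_L(j+beta_L, k) = (X_{L+1}(j, k) - X_{L+1}(j, k+alpha_L)) * omega_N**(-k*beta_L) / 2
30  CONTINUE
20  CONTINUE
10 CONTINUE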
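A runnable sketch of the reversed stages follows, under the same assumptions as before: $\omega_N = \exp(-2\pi i/N)$, a power-of-two length, and $X_L(j, k)$ stored at index $j + 2\beta_L k$. The name `stockham_ifft` and the storage layout are illustrative choices.

```python
import cmath

def stockham_ifft(c):
    """Inverse FFT obtained by undoing the Stockham stages from L = p-1 down to 0."""
    n = len(c)
    p = n.bit_length() - 1
    x = list(c)      # X_p(0, k) = c_k
    y = [0j] * n     # work array that receives X_L
    for L in range(p - 1, -1, -1):
        alpha = 1 << L
        beta = 1 << (p - L - 1)
        for k in range(alpha):
            w_inv = cmath.exp(2j * cmath.pi * k * beta / n)  # omega_N^(-k*beta_L)
            for j in range(beta):
                s = x[j + k * beta]             # X_{L+1}(j, k)
                d = x[j + (k + alpha) * beta]   # X_{L+1}(j, k+alpha_L)
                y[j + k * 2 * beta] = (s + d) / 2                 # X_L(j, k)
                y[j + beta + k * 2 * beta] = (s - d) * w_inv / 2  # X_L(j+beta_L, k)
        x, y = y, x
    return x         # X_0(j, 0) = a_j

# The DFT of [1, 2, 3, 4] with this sign convention is [10, -2+2i, -2, -2-2i].
print(stockham_ifft([10, -2 + 2j, -2, -2 - 2j]))
```

The division by 2 in each of the p stages accounts for the overall factor 1/N.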
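1.5 FFT and inverse FFT (3)
Relationship between the FFT and the inverse FFT
The inverse DFT differs from the DFT only in that $\omega_N$ is replaced by $\omega_N^*$ and the result is divided by N. In method (II) the same relationship appears as the factor $\omega_N^{-k\beta_L} = (\omega_N^*)^{k\beta_L}$ and the division by 2 in each of the p stages.
A forward FFT and an inverse FFT can therefore each be obtained from the other.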
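1.6 Two-dimensional FFT
The two-dimensional DFT maps $N_x \times N_y$ data $\{a_{j_x, j_y}\}$ to $\{c_{k_x, k_y}\}$:
$$c_{k_x, k_y} = \sum_{j_x=0}^{N_x-1} \sum_{j_y=0}^{N_y-1} a_{j_x, j_y}\,\omega_{N_x}^{j_x k_x}\,\omega_{N_y}^{j_y k_y}.$$
It is computed as $N_y$ FFTs of length $N_x$ followed by $N_x$ FFTs of length $N_y$. The operation count is $5 N_x N_y \log_2 (N_x N_y)$.
A two-dimensional FFT thus consists of one-dimensional FFTs applied in two steps, first in the x direction and then in the y direction.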
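The two sweeps of one-dimensional FFTs can be sketched as follows. The data are assumed to be stored as a nested list `a[jx][jy]` with power-of-two sizes, and the helper `fft` is an illustrative recursive radix-2 routine with the convention $\exp(-2\pi i/N)$.

```python
import cmath

def fft(a):
    """Recursive radix-2 forward FFT (illustrative helper)."""
    n = len(a)
    if n == 1:
        return list(a)
    e, o = fft(a[0::2]), fft(a[1::2])
    t = [cmath.exp(-2j * cmath.pi * k / n) * o[k] for k in range(n // 2)]
    return [e[k] + t[k] for k in range(n // 2)] + [e[k] - t[k] for k in range(n // 2)]

def fft2(a):
    """2D FFT of a[jx][jy]; returns c[kx][ky]."""
    nx, ny = len(a), len(a[0])
    # Ny FFTs of length Nx in the x direction, one for each jy.
    along_x = [fft([a[jx][jy] for jx in range(nx)]) for jy in range(ny)]
    # Nx FFTs of length Ny in the y direction, one for each kx.
    return [fft([along_x[jy][kx] for jy in range(ny)]) for kx in range(nx)]

a = [[1, 2, 3, 4], [5, 6, 7, 8]]
c = fft2(a)
print(abs(c[0][0] - 36) < 1e-12)  # c[0][0] is the sum of all entries
```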
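1.7 Decomposing a one-dimensional FFT (1)
Consider an N-point DFT with $N = lm$ and write the indices j and k as
$$j = rl + s \quad (s = 0, \ldots, l-1,\ r = 0, \ldots, m-1), \qquad k = pm + q \quad (q = 0, \ldots, m-1,\ p = 0, \ldots, l-1).$$
Then the N-point DFT becomes
$$c_{pm+q} = \sum_{s=0}^{l-1} \left[ \left( \sum_{r=0}^{m-1} a_{rl+s}\,\omega_m^{rq} \right) \omega_N^{sq} \right] \omega_l^{sp},$$
that is, l DFTs of length m followed by m DFTs of length l.

1.7 Decomposing a one-dimensional FFT (2)
An N-point DFT with $N = lm$ is computed in three steps:
(1) Perform l FFTs of length m.
(2) Multiply element (s, q) by $\exp(-2\pi i qs / N)$.
(3) Perform m FFTs of length l.
Each of the shorter FFTs can be decomposed again in the same way.
[Figure: example with N = 16, l = m = 4. The inputs $a_j = a_{rl+s}$ with indices 0, 1, ..., 15 are arranged as a two-dimensional array; the l FFTs of length m and then the m FFTs of length l produce the outputs $c_k = c_{pm+q}$, which appear in the order 0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15.]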
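The three steps can be checked numerically for any factorization N = lm. In this sketch the short transforms are done with a direct DFT rather than an FFT, purely to keep the example small; `dft` and `dft_lm` are illustrative names, and the sign convention $\exp(-2\pi i/N)$ is assumed.

```python
import cmath

def dft(a):
    """Direct DFT, used here as a stand-in for the short FFTs."""
    n = len(a)
    return [sum(a[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def dft_lm(a, l, m):
    """N-point DFT with N = l*m, using j = r*l + s and k = p*m + q."""
    n = l * m
    assert len(a) == n
    # (1) l transforms of length m: for each s, transform a_{r*l+s} over r.
    b = [dft([a[r * l + s] for r in range(m)]) for s in range(l)]  # b[s][q]
    # (2) multiply element (s, q) by exp(-2*pi*i*q*s/N).
    for s in range(l):
        for q in range(m):
            b[s][q] *= cmath.exp(-2j * cmath.pi * q * s / n)
    # (3) m transforms of length l: for each q, transform over s.
    c = [0j] * n
    for q in range(m):
        col = dft([b[s][q] for s in range(l)])
        for p in range(l):
            c[p * m + q] = col[p]
    return c

a = [complex(j, (j * j) % 7) for j in range(12)]
err = max(abs(x - y) for x, y in zip(dft_lm(a, 3, 4), dft(a)))
print(err < 1e-9)
```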
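II. High-performance FFT techniques

2.1
Example machines: NEC SX-7, VPP5000, SR2201, SR8000, Intel Pentium 4
[Figure: SR8000.]

2.2
Example processors: Intel Pentium III, IBM Power 4, AMD Athlon, DEC Alpha
Capacities range from 8 to 128 KB up to the MB range.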
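2.3 Loop length in the Stockham FFT
The length of the innermost loop (DO 30) is $\beta_L = N/2, N/4, \ldots, 1$, so it becomes short in the later stages.
When $\alpha_L > \beta_L$, interchange the DO 20 and DO 30 loops. Because $\alpha_L \beta_L = N/2$, the innermost loop then always has length $O(N^{1/2})$ or more.

DO 10 L = 0, p-1
  alpha_L = 2**L
  beta_L = 2**(p-L-1)
  DO 20 k = 0, alpha_L-1
    DO 30 j = 0, beta_L-1
      X_{L+1}(j, k) = X_L(j, k) + X_L(j+beta_L, k) * omega_N**(k*beta_L)
      X_{L+1}(j, k+alpha_L) = X_L(j, k) - X_L(j+beta_L, k) * omega_N**(k*beta_L)
30  CONTINUE
20  CONTINUE
10 CONTINUE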
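2.4 Stockham FFT with the loop over k
[Figure: positions of $X_L(j, k)$ and $X_L(j+\beta_L, k)$ in the array $X_L$ and of $X_{L+1}(j, k)$ and $X_{L+1}(j, k+\alpha_L)$ in the array $X_{L+1}$, with $\beta_L \alpha_L = N/2$.]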
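2.5 Higher-radix FFT (1)
In the Stockham FFT, two consecutive stages L and L+1 can be merged into a single stage that computes $X_{L+2}$ directly from $X_L$.
[Figure: $X_L(j, k) \to X_{L+1}(j, k) \to X_{L+2}(j, k)$ through stages L and L+1.]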
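2.5 Higher-radix FFT (2)
Radix-4 FFT
Let $\alpha_L = 2^L$ and $\beta_L = 2^{p-L-2}$.
Stage L:
$$X_{L+1}(j, k) = X_L(j, k) + X_L(j+2\beta_L, k)\,\omega_N^{2k\beta_L}$$
$$X_{L+1}(j, k+\alpha_L) = X_L(j, k) - X_L(j+2\beta_L, k)\,\omega_N^{2k\beta_L}$$
$$X_{L+1}(j+\beta_L, k) = X_L(j+\beta_L, k) + X_L(j+3\beta_L, k)\,\omega_N^{2k\beta_L}$$
$$X_{L+1}(j+\beta_L, k+\alpha_L) = X_L(j+\beta_L, k) - X_L(j+3\beta_L, k)\,\omega_N^{2k\beta_L}$$
Stage L+1:
$$X_{L+2}(j, k) = X_{L+1}(j, k) + X_{L+1}(j+\beta_L, k)\,\omega_N^{k\beta_L}$$
$$X_{L+2}(j, k+2\alpha_L) = X_{L+1}(j, k) - X_{L+1}(j+\beta_L, k)\,\omega_N^{k\beta_L}$$
$$X_{L+2}(j, k+\alpha_L) = X_{L+1}(j, k+\alpha_L) + X_{L+1}(j+\beta_L, k+\alpha_L)\,\omega_N^{(k+\alpha_L)\beta_L}$$
$$X_{L+2}(j, k+3\alpha_L) = X_{L+1}(j, k+\alpha_L) - X_{L+1}(j+\beta_L, k+\alpha_L)\,\omega_N^{(k+\alpha_L)\beta_L}$$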
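2.5 Higher-radix FFT (3)
Radix-4 FFT: moving the multiplications of stage L+1 into stage L
Stage L:
$$X_{L+1}(j, k) = X_L(j, k) + X_L(j+2\beta_L, k)\,\omega_N^{2k\beta_L}$$
$$X_{L+1}(j, k+\alpha_L) = X_L(j, k) - X_L(j+2\beta_L, k)\,\omega_N^{2k\beta_L}$$
$$Y_{L+1}(j+\beta_L, k) = X_L(j+\beta_L, k)\,\omega_N^{k\beta_L} + X_L(j+3\beta_L, k)\,\omega_N^{3k\beta_L}$$
$$Y_{L+1}(j+\beta_L, k+\alpha_L) = X_L(j+\beta_L, k)\,\omega_N^{(k+\alpha_L)\beta_L} - X_L(j+3\beta_L, k)\,\omega_N^{(3k+\alpha_L)\beta_L} = -i\,X_L(j+\beta_L, k)\,\omega_N^{k\beta_L} + i\,X_L(j+3\beta_L, k)\,\omega_N^{3k\beta_L}$$
Stage L+1:
$$X_{L+2}(j, k) = X_{L+1}(j, k) + Y_{L+1}(j+\beta_L, k)$$
$$X_{L+2}(j, k+2\alpha_L) = X_{L+1}(j, k) - Y_{L+1}(j+\beta_L, k)$$
$$X_{L+2}(j, k+\alpha_L) = X_{L+1}(j, k+\alpha_L) + Y_{L+1}(j+\beta_L, k+\alpha_L)$$
$$X_{L+2}(j, k+3\alpha_L) = X_{L+1}(j, k+\alpha_L) - Y_{L+1}(j+\beta_L, k+\alpha_L)$$
Here $\omega_N^{\alpha_L \beta_L} = \exp(-\pi i/2) = -i$ has been used.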
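2.5 Higher-radix FFT (4)
Radix of the FFT and Byte/Flop

Radix   Adds / mults   Loads / stores   Byte/Flop   Adds / mults as radix-2 stages
2       6 / 4          4 / 4            6.4
4       22 / 12        8 / 8            3.76        24 / 16
8       66 / 32        16 / 16          2.61        72 / 48

A higher radix lowers the Byte/Flop ratio.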
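The Byte/Flop figures can be reproduced from the operation and memory-access counts. The sketch assumes that loads and stores are counted in 8-byte real words (double precision); that word size is an assumption made here, chosen because it reproduces the quoted ratios, and `byte_per_flop` is an illustrative name.

```python
def byte_per_flop(adds, mults, loads, stores, bytes_per_word=8):
    """Bytes moved per floating-point operation for one butterfly."""
    return bytes_per_word * (loads + stores) / (adds + mults)

# (adds, mults, loads, stores) per butterfly for radix 2, 4 and 8
counts = {2: (6, 4, 4, 4), 4: (22, 12, 8, 8), 8: (66, 32, 16, 16)}
for radix, c in counts.items():
    print(radix, round(byte_per_flop(*c), 2))
```

Under this assumption the loop prints 6.4, 3.76 and 2.61 for radix 2, 4 and 8.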
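2.6 Blocking for cache (1)
Every stage of the Stockham FFT accesses all elements of $X_L(j, k)$. If the data size N exceeds the cache size M (M < N), each stage moves O(N) data between memory and cache, so the whole FFT moves $O(N \log_2 N)$ data.
Suppose $M = N^{1/2}$ and apply the decomposition of 1.7 (2) to the N-point FFT:
(1) Perform M FFTs of length $M = N^{1/2}$.
(2) Multiply element (s, q) by $\exp(-2\pi i qs / N)$.
(3) Perform M FFTs of length M.
Each FFT in (1) and (3) fits in cache. Steps (1), (2) and (3) each move O(N) data, so the whole FFT moves only O(N) data.

2.6 Blocking for cache (2)
[Figure: example with N = 16 and M = 4.]
For general N, with $N = M^r$, the decomposition is applied $r-1$ times to reduce the FFT to M-point FFTs.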
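III. Parallelizing the FFT

3.1 Parallel two-dimensional FFT (1)
Distribute the data along y and perform the FFTs in the x direction locally. Redistribute the data with an all-to-all broadcast so that they are distributed along x, then perform the FFTs in the y direction locally.
With p processors, the operation count per processor is $5 N_x N_y \log_2 (N_x N_y) / p$, the amount of data communicated per processor is $N_x N_y / p$, and the number of communications is $p - 1$.

3.1 Parallel two-dimensional FFT (2)
[Figure: FFT in the x direction, redistribution, and FFT in the y direction on four processing units PU0 to PU3.]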
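3.2 Parallel one-dimensional FFT (1)
With $N = N_x N_y$, the N-point FFT is computed as follows:
(1) Perform $N_y$ FFTs of length $N_x$ (data distributed along y).
(2) Multiply element $(j_y, k_x)$ by $\exp(-2\pi i k_x j_y / N)$.
(3) Redistribute the data with an all-to-all broadcast.
(4) Perform $N_x$ FFTs of length $N_y$ (data distributed along x).
[Figure: example with N = 16. The inputs $a_j = a_{j_x N_y + j_y}$ with indices 0, 1, ..., 15 are arranged as a two-dimensional array; the $N_y$ FFTs of length $N_x$, the twiddle multiplication and redistribution, and the $N_x$ FFTs of length $N_y$ produce the outputs $c_k = c_{k_x + k_y N_x}$ in the order 0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15.]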
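3.2 Parallel one-dimensional FFT (2)
With p processors, the operation count per processor is $5 N \log_2 N / p$, the amount of data communicated per processor is $N / p$, and the number of communications is $p - 1$.
Example: SR8000, 8 GFLOPS and 0.0625 Gword/s, $N = 2^{30}$.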
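The cost model can be evaluated for the quoted figures. This is a rough sketch that assumes the 8 GFLOPS and 0.0625 Gword/s rates apply to each processor, that peak speed is reached, and that communication start-up time is negligible; the processor count p = 16 is a hypothetical value, and `fft_parallel_times` is an illustrative name.

```python
import math

def fft_parallel_times(n, p, flops_per_s, words_per_s):
    """Estimated compute and communication time per processor, in seconds."""
    compute = 5 * n * math.log2(n) / p / flops_per_s  # 5 N log2 N / p operations
    comm = (n / p) / words_per_s                      # N / p words sent
    return compute, comm

# p = 16 is a hypothetical processor count chosen for illustration.
compute, comm = fft_parallel_times(2**30, 16, 8e9, 0.0625e9)
print(f"compute {compute:.2f} s, communication {comm:.2f} s")
```

Under these assumptions the ratio of compute time to communication time does not depend on p, because both terms scale as 1/p.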