Chapter 11

DSP Algorithm Implementations


The McGraw-Hill Companies, Inc., 2007. Original PowerPoint slides prepared by S. K. Mitra


Matrix Representation of Digital Filter Structures

Consider the digital filter structure described by the set of equations

$$\begin{aligned}
w_1[n] &= x[n] - \alpha\, w_5[n]\\
w_2[n] &= w_1[n] - \delta\, w_3[n]\\
w_3[n] &= w_2[n-1]\\
w_4[n] &= w_3[n] + \varepsilon\, w_2[n]\\
w_5[n] &= w_4[n-1]\\
y[n] &= \beta\, w_1[n] + \gamma\, w_5[n]
\end{aligned}$$

This ordered set of equations is said to be noncomputable because some variables on the left side are computed before some right-side variables are made available

For example, computation of w1[n] in the 1st step requires the knowledge of w5[n], which is computed in the 5th step

Likewise, computation of w2[n] in the 2nd step requires the knowledge of w3[n], which is computed in the 3rd step

Suppose we reorder the equations as

$$\begin{aligned}
w_3[n] &= w_2[n-1]\\
w_5[n] &= w_4[n-1]\\
w_1[n] &= x[n] - \alpha\, w_5[n]\\
w_2[n] &= w_1[n] - \delta\, w_3[n]\\
w_4[n] &= w_3[n] + \varepsilon\, w_2[n]\\
y[n] &= \beta\, w_1[n] + \gamma\, w_5[n]
\end{aligned}$$

This new ordered set of equations is computable

In most practical applications, equations describing a digital filter structure can be put into a computable order by inspection

A simple way to examine the computability of equations describing a digital filter structure is by writing the equations in a matrix form

A matrix representation of the first ordered set of equations:

$$\begin{bmatrix} w_1[n]\\ w_2[n]\\ w_3[n]\\ w_4[n]\\ w_5[n]\\ y[n] \end{bmatrix}
= \begin{bmatrix} x[n]\\ 0\\ 0\\ 0\\ 0\\ 0 \end{bmatrix}
+ \begin{bmatrix}
0 & 0 & 0 & 0 & -\alpha & 0\\
1 & 0 & -\delta & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & \varepsilon & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
\beta & 0 & 0 & 0 & \gamma & 0
\end{bmatrix}
\begin{bmatrix} w_1[n]\\ w_2[n]\\ w_3[n]\\ w_4[n]\\ w_5[n]\\ y[n] \end{bmatrix}
+ \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} w_1[n-1]\\ w_2[n-1]\\ w_3[n-1]\\ w_4[n-1]\\ w_5[n-1]\\ y[n-1] \end{bmatrix}$$


In compact form

$$\mathbf{y}[n] = \mathbf{x}[n] + \mathbf{F}\,\mathbf{y}[n] + \mathbf{G}\,\mathbf{y}[n-1]$$

where

$$\mathbf{y}[n] = \begin{bmatrix} w_1[n] & w_2[n] & w_3[n] & w_4[n] & w_5[n] & y[n] \end{bmatrix}^T$$

$$\mathbf{x}[n] = \begin{bmatrix} x[n] & 0 & 0 & 0 & 0 & 0 \end{bmatrix}^T$$

If a diagonal element of F is nonzero, then computation of the present value of the corresponding variable requires the knowledge of its own present value, implying the presence of a delay-free loop

Any nonzero entry in a row above the main diagonal of F implies that the computation of the present value of the corresponding variable requires present values of other variables not yet computed, making the set of equations noncomputable

Hence, for computability all elements of the F matrix on the diagonal and above the diagonal must be zeros

In the F matrix for the first ordered set of equations, the diagonal elements are all zeros, indicating absence of delay-free loops

However, there are nonzero entries above the diagonal in the first and second rows of F, indicating that the set of equations is not in proper order for computation

The F matrix for the second ordered set of equations, with the variables ordered as w3[n], w5[n], w1[n], w2[n], w4[n], y[n], is

$$\mathbf{F} = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & -\alpha & 0 & 0 & 0 & 0\\
-\delta & 0 & 1 & 0 & 0 & 0\\
1 & 0 & 0 & \varepsilon & 0 & 0\\
0 & \gamma & \beta & 0 & 0 & 0
\end{bmatrix}$$

which is seen to satisfy the computability condition
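The computability condition on F can be checked mechanically. The following is a minimal sketch, assuming only the dependency pattern of the example equations matters (a 1 marks any nonzero entry, the actual multiplier values are not needed); the names `build_F`, `is_computable`, and `deps` are illustrative and not part of the original slides.

```python
def build_F(order, deps):
    """Build the pattern of F for a given ordering of the variables.

    deps[v] is the set of variables whose PRESENT values are needed to
    compute v (delayed values belong to G and are ignored here).
    """
    index = {name: i for i, name in enumerate(order)}
    F = [[0] * len(order) for _ in order]
    for var, needed in deps.items():
        for other in needed:
            F[index[var]][index[other]] = 1  # 1 marks a nonzero entry
    return F


def is_computable(F):
    """True if every entry on and above the main diagonal of F is zero."""
    n = len(F)
    return all(F[i][j] == 0 for i in range(n) for j in range(i, n))


# Present-value dependencies read off the example equations
deps = {
    "w1": {"w5"},
    "w2": {"w1", "w3"},
    "w3": set(),          # w3[n] = w2[n-1] uses only a delayed value
    "w4": {"w2", "w3"},
    "w5": set(),          # w5[n] = w4[n-1] uses only a delayed value
    "y": {"w1", "w5"},
}

first_order = ["w1", "w2", "w3", "w4", "w5", "y"]
second_order = ["w3", "w5", "w1", "w2", "w4", "y"]

print(is_computable(build_F(first_order, deps)))   # False
print(is_computable(build_F(second_order, deps)))  # True
```

A nonzero diagonal entry (a delay-free loop) is rejected by the same test, since the diagonal is included in the checked region.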


Precedence Graph

The precedence graph can be used to test the computability of a digital filter structure and to develop the proper ordering sequence for a set of equations describing a computable structure

It is developed from the signal-flow graph description of the digital filter structure, in which independent and dependent signal variables are represented by nodes, and the multipliers and delays are represented by directed branches

The directed branch has an attached symbol denoting the branch gain or transmittance

For a multiplier branch, the branch gain is the multiplier coefficient value

For a delay branch, the branch gain is simply $z^{-1}$

[Figure: signal-flow graph representation of the example digital filter structure]

A reduced signal-flow graph is then developed by removing the delay branches and all branches going out of the input node

[Figure: reduced signal-flow graph of the example digital filter structure]


The nodes in the reduced signal-flow graph are grouped as follows:

All nodes with only outgoing branches are grouped into one set labeled {N1}

Next, the set {N2} is formed containing nodes that have branches coming in only from one or more nodes in the set {N1} and have outgoing branches to the other nodes

Then, form the set {N3} containing nodes that have branches coming in only from one or more nodes in the sets {N1} and {N2}, and have outgoing branches to other nodes

Continue the process until there is a set of nodes {Nf} containing only incoming branches

The rearranged graph is called a precedence graph

Since signal variables belonging to {N1} do not depend on the present values of other signal variables, these variables should be computed first

Next, signal variables belonging to {N2} can be computed since they depend on the present values of signal variables contained in {N1} that have already been computed

This is followed by the computation of signal variables in {N3}, {N4}, etc.

Finally, in the last step the signal variables in {Nf} are computed

This process of sequential computation ensures the development of a valid computational algorithm

If there is no final set {Nf} containing only incoming branches, the digital filter structure is noncomputable

For the example precedence graph, pertinent groupings of node variables are:

{N1} = {w3[n], w5[n]}
{N2} = {w1[n]}
{N3} = {w2[n]}
{N4} = {w4[n], y[n]}

[Figure: precedence graph redrawn according to the above groupings]

Since the final node set {N4} has only incoming branches, the structure is computable
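The grouping procedure can be sketched in a few lines. This is an illustrative implementation under the assumption that the reduced signal-flow graph is given as a list of directed (source, destination) branches; the function name `precedence_groups` and the edge list are not from the slides, and the edges are read off the example equations after removing the delay branches and the branches leaving the input node.

```python
def precedence_groups(nodes, edges):
    """Group the nodes of a reduced signal-flow graph into {N1}, ..., {Nf}.

    Raises ValueError when no valid grouping exists (delay-free loop),
    i.e. when the structure is noncomputable.
    """
    preds = {v: set() for v in nodes}
    succs = {v: set() for v in nodes}
    for src, dst in edges:
        preds[dst].add(src)
        succs[src].add(dst)

    final = {v for v in nodes if not succs[v]}   # only incoming branches
    remaining = set(nodes) - final
    placed, groups = set(), []
    while remaining:
        # nodes whose incoming branches all start in already formed sets
        ready = {v for v in remaining if preds[v] <= placed}
        if not ready:
            raise ValueError("delay-free loop: structure is noncomputable")
        groups.append(sorted(ready))
        placed |= ready
        remaining -= ready
    groups.append(sorted(final))
    return groups


nodes = ["w1", "w2", "w3", "w4", "w5", "y"]
edges = [("w5", "w1"), ("w1", "w2"), ("w3", "w2"),
         ("w3", "w4"), ("w2", "w4"), ("w1", "y"), ("w5", "y")]

print(precedence_groups(nodes, edges))
# [['w3', 'w5'], ['w1'], ['w2'], ['w4', 'y']]
```

Computing the variables group by group, in the order returned, gives a valid computational sequence for the example.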


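Structure Verification

A simple method to verify that the structure developed is indeed characterized by the prescribed transfer function H(z)

Consider for simplicity a causal 3rd order IIR transfer function

$$H(z) = \frac{P(z)}{D(z)} = \frac{p_0 + p_1 z^{-1} + p_2 z^{-2} + p_3 z^{-3}}{1 + d_1 z^{-1} + d_2 z^{-2} + d_3 z^{-3}}$$

If {h[n]} denotes its impulse response, then

$$H(z) = \sum_{n=0}^{\infty} h[n]\, z^{-n}$$

Note P(z) = H(z)D(z), which is equivalent to the convolution sum

$$p_n = \sum_{m=0}^{3} h[n-m]\, d_m, \qquad d_0 = 1,\quad h[n] = 0 \text{ for } n < 0,\quad p_n = 0 \text{ for } n > 3$$

Evaluate the above convolution sum for 0 ≤ n ≤ 6:

$$\begin{aligned}
p_0 &= h[0]\\
p_1 &= h[1] + h[0]d_1\\
p_2 &= h[2] + h[1]d_1 + h[0]d_2\\
p_3 &= h[3] + h[2]d_1 + h[1]d_2 + h[0]d_3\\
0 &= h[4] + h[3]d_1 + h[2]d_2 + h[1]d_3\\
0 &= h[5] + h[4]d_1 + h[3]d_2 + h[2]d_3\\
0 &= h[6] + h[5]d_1 + h[4]d_2 + h[3]d_3
\end{aligned}$$

In matrix form we get

$$\begin{bmatrix} p_0\\ p_1\\ p_2\\ p_3\\ 0\\ 0\\ 0 \end{bmatrix} =
\begin{bmatrix}
h[0] & 0 & 0 & 0\\
h[1] & h[0] & 0 & 0\\
h[2] & h[1] & h[0] & 0\\
h[3] & h[2] & h[1] & h[0]\\
h[4] & h[3] & h[2] & h[1]\\
h[5] & h[4] & h[3] & h[2]\\
h[6] & h[5] & h[4] & h[3]
\end{bmatrix}
\begin{bmatrix} 1\\ d_1\\ d_2\\ d_3 \end{bmatrix}$$

In partitioned form the above matrix equation can be written as

$$\begin{bmatrix} \mathbf{p}\\ \mathbf{0} \end{bmatrix} = \begin{bmatrix} \mathbf{H}_1\\ \mathbf{h}\;\;\mathbf{H}_2 \end{bmatrix} \begin{bmatrix} 1\\ \mathbf{d} \end{bmatrix}$$

where

$$\mathbf{p} = \begin{bmatrix} p_0\\ p_1\\ p_2\\ p_3 \end{bmatrix},\quad
\mathbf{d} = \begin{bmatrix} d_1\\ d_2\\ d_3 \end{bmatrix},\quad
\mathbf{h} = \begin{bmatrix} h[4]\\ h[5]\\ h[6] \end{bmatrix},\quad
\mathbf{H}_2 = \begin{bmatrix} h[3] & h[2] & h[1]\\ h[4] & h[3] & h[2]\\ h[5] & h[4] & h[3] \end{bmatrix}$$

and $\mathbf{H}_1$ is the upper 4 × 4 lower-triangular block of the matrix above

Solving the second equation, $\mathbf{0} = \mathbf{h} + \mathbf{H}_2\mathbf{d}$, we get

$$\mathbf{d} = -\mathbf{H}_2^{-1}\,\mathbf{h}$$

Substituting the above in the first equation we get

$$\mathbf{p} = \mathbf{H}_1 \begin{bmatrix} 1\\ -\mathbf{H}_2^{-1}\,\mathbf{h} \end{bmatrix}$$

In the case of an N-th order IIR filter, the coefficients of its transfer function can be determined from the first 2N+1 impulse response samples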

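Example: Consider the causal transfer function

$$H(z) = \frac{2 + 6z^{-1} + 3z^{-2}}{1 + z^{-1} + 2z^{-2}}$$

Here h[0] = 2, h[1] = 4, h[2] = −5, h[3] = −3, h[4] = 13

Hence

$$\begin{bmatrix} p_0\\ p_1\\ p_2 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0\\ 4 & 2 & 0\\ -5 & 4 & 2 \end{bmatrix} \begin{bmatrix} 1\\ d_1\\ d_2 \end{bmatrix},
\qquad
\begin{bmatrix} d_1\\ d_2 \end{bmatrix} = -\begin{bmatrix} -5 & 4\\ -3 & -5 \end{bmatrix}^{-1} \begin{bmatrix} -3\\ 13 \end{bmatrix}$$

Solving the above we get d1 = 1, d2 = 2 and p0 = 2, p1 = 6, p2 = 3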
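The verification procedure can be tried numerically. The sketch below assumes the impulse response samples h[0..4] = 2, 4, −5, −3, 13 of a 2nd order example and uses exact rational arithmetic; the helper names `solve` and `coefficients_from_impulse_response` are illustrative only.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [a - M[r][c] * p for a, p in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def coefficients_from_impulse_response(h, N):
    """Recover numerator p and denominator d of an N-th order H(z)
    from the first 2N+1 impulse response samples h[0..2N]."""
    # Lower block: 0 = h[N+1+i] + sum_j h[N+i-j] d_{j+1},  i = 0..N-1
    H2 = [[h[N + i - j] for j in range(N)] for i in range(N)]
    d = [Fraction(1)] + solve(H2, [-h[N + 1 + i] for i in range(N)])
    # Upper block: p_n = sum_{m=0}^{n} h[n-m] d_m,  n = 0..N
    p = [sum(h[n - m] * d[m] for m in range(n + 1)) for n in range(N + 1)]
    return p, d


p, d = coefficients_from_impulse_response([2, 4, -5, -3, 13], N=2)
print([int(v) for v in p], [int(v) for v in d])  # [2, 6, 3] [1, 1, 2]
```

If the samples are measured from an implemented structure, comparing the recovered p and d with the prescribed coefficients verifies the structure.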


Four Classes of Fourier Transforms

Fourier transform (FT): continuous in time, continuous in frequency

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega, \qquad F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt$$

Discrete-time Fourier transform (DTFT), sampling period T0: discrete in time, periodic in frequency (period 2π/T0), continuous in frequency

$$f(k) = \frac{T_0}{2\pi}\int_{-\pi/T_0}^{\pi/T_0} F(e^{j\omega T_0})\, e^{j\omega k T_0}\, d\omega, \qquad F(e^{j\omega T_0}) = \sum_{k=-\infty}^{\infty} f(k)\, e^{-j\omega k T_0}$$

Fourier series (FS), fundamental frequency ω0: continuous in time, periodic in time (period 2π/ω0), discrete in frequency

$$f(t) = \sum_{n=-\infty}^{\infty} F(n)\, e^{jn\omega_0 t}, \qquad F(n) = \frac{\omega_0}{2\pi}\int_{0}^{2\pi/\omega_0} f(t)\, e^{-jn\omega_0 t}\, dt$$

Discrete Fourier transform (DFT), length N: discrete in time and periodic in frequency, discrete in frequency and periodic in time

$$f(k) = \frac{1}{N}\sum_{n=0}^{N-1} F(n)\, e^{j\frac{2\pi}{N}kn}, \qquad F(n) = \sum_{k=0}^{N-1} f(k)\, e^{-j\frac{2\pi}{N}kn}$$

Direct Computation of DFT

For a complex-valued sequence of N points, the DFT may be expressed as

$$X_R(k) = \sum_{n=0}^{N-1}\left[x_R(n)\cos\frac{2\pi kn}{N} + x_I(n)\sin\frac{2\pi kn}{N}\right]$$

$$X_I(k) = -\sum_{n=0}^{N-1}\left[x_R(n)\sin\frac{2\pi kn}{N} - x_I(n)\cos\frac{2\pi kn}{N}\right]$$

The direct computation requires:

2N² evaluations of trigonometric functions

4N² real multiplications

4N(N − 1) real additions

A number of indexing and addressing operations

C     DFT SUBROUTINE
C     ISEL = 0 : DFT
C     ISEL = 1 : INVERSE DFT
C     XR, XI   : REAL AND IMAGINARY PARTS OF THE INPUT
C     XFR, XFI : REAL AND IMAGINARY PARTS OF THE OUTPUT
C
      SUBROUTINE DFT(N, XR, XI, XFR, XFI, ISEL)
      DIMENSION XR(N), XI(N), XFR(N), XFI(N)
      WN = 6.2831853 / FLOAT(N)
      IF (ISEL.EQ.1) WN = -WN
      DO 20 K = 1, N
         XFR(K) = 0.0
         XFI(K) = 0.0
         KM1 = K - 1
         DO 10 I = 1, N
            IM1 = I - 1
            ARG = WN * KM1 * IM1
            C = COS(ARG)
            S = SIN(ARG)
            XFR(K) = XFR(K) + XR(I)*C + XI(I)*S
            XFI(K) = XFI(K) - XR(I)*S + XI(I)*C
   10    CONTINUE
C        SCALE BY 1/N FOR THE INVERSE TRANSFORM ONLY
         IF (ISEL.EQ.1) THEN
            XFR(K) = XFR(K) / FLOAT(N)
            XFI(K) = XFI(K) / FLOAT(N)
         END IF
   20 CONTINUE
      RETURN
      END

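Goertzel's Algorithm

A recursive DFT computation scheme that makes use of the identity

$$W_N^{-kN} = 1, \qquad W_N = e^{-j2\pi/N}$$

obtained using the periodicity of $W_N^{-kn}$

Using this identity we can write

$$X[k] = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{k\ell} = W_N^{-kN}\sum_{\ell=0}^{N-1} x[\ell]\, W_N^{k\ell} = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{-k(N-\ell)}$$

Define

$$y_k[n] = \sum_{\ell=0}^{n} x_e[\ell]\, W_N^{-k(n-\ell)}$$

Note: yk[n] is the direct convolution of the causal sequence

$$x_e[n] = \begin{cases} x[n], & 0 \le n \le N-1\\ 0, & n < 0,\; n \ge N \end{cases}$$

with a causal sequence

$$h_k[n] = \begin{cases} W_N^{-kn}, & n \ge 0\\ 0, & n < 0 \end{cases}$$

Observe $X[k] = y_k[n]\big|_{n=N}$

The z-transform of yk[n] yields

$$Y_k(z) = \frac{X_e(z)}{1 - W_N^{-k} z^{-1}}$$

Thus, yk[n] is the output of an initially relaxed LTI digital filter $H_k(z) = 1/(1 - W_N^{-k} z^{-1})$ with an input xe[n] and, when n = N, yk[N] = X[k]

[Figure: structural interpretation of the algorithm]

Thus a recursive DFT computation scheme is

$$y_k[n] = x_e[n] + W_N^{-k}\, y_k[n-1], \qquad 0 \le n \le N$$

with yk[−1] = 0 and xe[N] = 0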


Since a complex multiplication needs 4 real multiplications and 2 real additions, computation of each new value of yk[n] requires 4 real multiplications and 4 real additions

Thus computation of X[k] = yk[N] involves 4N real multiplications and 4N real additions

Computation of all N DFT samples requires 4N² real multiplications and 4N² real additions

Direct computation of all N samples of {X[k]} requires N² complex multiplications and N(N−1) complex additions

Equivalently, direct computation of all N samples of {X[k]} requires 4N² real multiplications and N(4N−2) real additions

Thus, Goertzel's algorithm requires 2N more real additions than the direct DFT computation

The algorithm can be made computationally more efficient by observing that Hk(z) can be rewritten as

$$H_k(z) = \frac{1 - W_N^{k} z^{-1}}{1 - 2\cos\!\left(\frac{2\pi k}{N}\right) z^{-1} + z^{-2}}$$

resulting in a second-order realization

[Figure: second-order realization of Hk(z)]

DFT computation equations are now

$$v_k[n] = x_e[n] + 2\cos\!\left(\frac{2\pi k}{N}\right) v_k[n-1] - v_k[n-2], \qquad 0 \le n \le N$$

$$X[k] = y_k[N] = v_k[N] - W_N^{k}\, v_k[N-1]$$

Computation of each sample of vk[n] involves only 2 real multiplications and 4 real additions

Complex multiplication by $W_N^{k}$ needs to be performed only once at n = N

Thus, computation of one sample of X[k] requires (2N+4) real multiplications and (4N+4) real additions

Computation of all N DFT samples requires 2N(N+2) real multiplications and 4N(N+1) real additions
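The second-order recursion is short enough to sketch directly. The following is a minimal illustration, assuming the convention $W_N = e^{-j2\pi/N}$ and a zero input sample at n = N; the function name `goertzel` is illustrative, not from the slides.

```python
import cmath
import math


def goertzel(x, k):
    """Compute the single DFT sample X[k] of the sequence x with the
    second-order Goertzel recursion."""
    N = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / N)   # feedback multiplier
    v1 = 0.0   # v_k[n-1]
    v2 = 0.0   # v_k[n-2]
    for n in range(N + 1):
        xe = x[n] if n < N else 0.0                 # x_e[N] = 0
        v0 = xe + coeff * v1 - v2
        v2, v1 = v1, v0
    # After the loop v1 = v_k[N] and v2 = v_k[N-1]
    WNk = cmath.exp(-2j * math.pi * k / N)          # W_N^k, applied once
    return v1 - WNk * v2


x = [1.0, 2.0, 3.0, 4.0]
print(goertzel(x, 1))   # approximately (-2+2j)
```

The loop uses only the real multiplier `coeff` when the input is real, and the single complex multiplication happens after the last iteration, which is where the saving over the first-order recursion comes from.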


In realizing $H_{N-k}(z)$, the multiplier in the feedback path is 2cos(2π(N − k)/N) = 2cos(2πk/N), which is the same as that in the realization of Hk(z)

$v_{N-k}[n] = v_k[n]$, i.e., the intermediate variables computed to determine X[k] can again be used to determine X[N−k]

The only difference between the two structures is the feed-forward multiplier, which is now $W_N^{N-k} = W_N^{-k}$, that is, the complex conjugate of $W_N^{k}$

Thus, computation of X[k] and X[N−k] requires 2(N+4) real multiplications and 4(N+2) real additions

Computation of all N DFT samples requires approximately N² real multiplications and approximately 2N² real additions

Decimation-in-Time FFT Algorithm

Consider a sequence x[n] of length $N = 2^{\mu}$

Using a 2-band polyphase decomposition we can express its z-transform as

$$X(z) = X_0(z^2) + z^{-1} X_1(z^2)$$

where

$$X_0(z) = \sum_{n=0}^{N/2-1} x_0[n]\, z^{-n} = \sum_{n=0}^{N/2-1} x[2n]\, z^{-n}, \qquad X_1(z) = \sum_{n=0}^{N/2-1} x_1[n]\, z^{-n} = \sum_{n=0}^{N/2-1} x[2n+1]\, z^{-n}$$

Evaluating on the unit circle at N equally spaced points $z = W_N^{-k}$, with $W_N = e^{-j2\pi/N}$, we arrive at the N-point DFT of x[n]:

$$X[k] = X_0[\langle k\rangle_{N/2}] + W_N^{k}\, X_1[\langle k\rangle_{N/2}], \qquad 0 \le k \le N-1$$

where X0[k] and X1[k] are the (N/2)-point DFTs of x0[n] and x1[n]

That is

$$X_0[k] = \sum_{r=0}^{N/2-1} x[2r]\, W_{N/2}^{rk}, \qquad X_1[k] = \sum_{r=0}^{N/2-1} x[2r+1]\, W_{N/2}^{rk}, \qquad 0 \le k \le \tfrac{N}{2}-1$$

[Figure: block diagram interpretation]

[Figure: flow-graph representation]

Direct computation of the N-point DFT requires N² complex multiplications and N² − N ≈ N² complex additions

The modified scheme requires the computation of two (N/2)-point DFTs that are then combined with N complex multiplications and N complex additions, resulting in a total of (N²/2) + N complex multiplications and approximately (N²/2) + N complex additions

For N ≥ 3, (N²/2) + N < N²


Continuing the process we can express X0[k] and X1[k] each as a weighted combination of two (N/4)-point DFTs

For example, we can write

$$X_0[k] = X_{00}[\langle k\rangle_{N/4}] + W_{N/2}^{k}\, X_{01}[\langle k\rangle_{N/4}], \qquad 0 \le k \le \tfrac{N}{2}-1$$

where X00[k] and X01[k] are the (N/4)-point DFTs of the (N/4)-length sequences x00[n] = x0[2n] and x01[n] = x0[2n+1]

Likewise, we can express

$$X_1[k] = X_{10}[\langle k\rangle_{N/4}] + W_{N/2}^{k}\, X_{11}[\langle k\rangle_{N/4}], \qquad 0 \le k \le \tfrac{N}{2}-1$$

where X10[k] and X11[k] are the (N/4)-point DFTs of the (N/4)-length sequences x10[n] = x1[2n] and x11[n] = x1[2n+1]

[Figure: block-diagram representation of the two-stage algorithm]

[Figure: flow-graph representation of the two-stage algorithm]

In the flow-graph shown N = 8

Hence, the (N/4)-point DFT here is a 2-point DFT and no further decomposition is possible

The four 2-point DFTs, Xij[k], i, j = 0, 1, can be easily computed

For example

$$X_{00}[k] = x[0] + W_2^{k}\, x[4], \qquad k = 0, 1$$

The corresponding flow-graph of the 2-point DFT is obtained using the identity $W_2^{k} = W_N^{(N/2)k}$

[Figure: flow-graph of the 2-point DFT]

[Figure: complete flow-graph of the 8-point DFT]

The flow-graph consists of 3 stages

First stage computes the four 2-point DFTs

Second stage computes the two 4-point DFTs

Last stage computes the desired 8-point DFT

The number of complex multiplications and additions at each stage is equal to 8, the size of the DFT

Total number of complex multiplications and additions to compute all 8 DFT samples is equal to 8 + 8 + 8 = 24 = 8 × 3

In the general case when $N = 2^{\mu}$, the number of stages for computing the N-point DFT in the algorithm will be μ = log2 N

Total number of complex multiplications and additions to compute all N DFT samples is N(log2 N)

Note, $W_{N/2}^{k} = W_N^{2k}$ and $W_{N/4}^{k} = W_N^{4k}$

Besides, the coefficients have the symmetry property

$$W_N^{\ell + (N/2)} = -W_N^{\ell}$$

These properties can be exploited to reduce the computational complexity further

In the following basic module two output variables are generated by a weighted combination of two input variables

[Figure: basic computational module]

The basic computational module is called a butterfly computation

The input-output relations of the basic module are:

$$\Psi_{r+1}[\alpha] = \Psi_r[\alpha] + W_N^{\ell}\, \Psi_r[\beta]$$

$$\Psi_{r+1}[\beta] = \Psi_r[\alpha] + W_N^{\ell + (N/2)}\, \Psi_r[\beta]$$

Substituting $W_N^{\ell + (N/2)} = -W_N^{\ell}$ in the second equation given above we get

$$\Psi_{r+1}[\beta] = \Psi_r[\alpha] - W_N^{\ell}\, \Psi_r[\beta]$$

The modified butterfly computation requires only one complex multiplication

[Figure: modified butterfly computation]

Use of the above modified butterfly computation module reduces the total number of complex multiplications by 50%

[Figure: new flow-graph using the modified butterfly computational module for N = 8]

Computational complexity can be reduced further by avoiding multiplications by $W_N^{0} = 1$, $W_N^{N/2} = -1$, $W_N^{N/4} = -j$, and $W_N^{3N/4} = j$

The DFT computation algorithm described here also is efficient with regard to memory requirements

Note: Each stage employs the same butterfly computation to compute $\Psi_{r+1}[\alpha]$ and $\Psi_{r+1}[\beta]$ from $\Psi_r[\alpha]$ and $\Psi_r[\beta]$

At the end of computation at any stage, output variables $\Psi_{r+1}[m]$ can be stored in the same registers previously occupied by the corresponding input variables $\Psi_r[m]$

This type of memory location sharing is called in-place computation, resulting in significant savings in overall memory requirements

In the DFT computation scheme outlined, the DFT samples X[k] appear at the output in a sequential order while the input samples x[n] appear in a different order

Thus, a sequentially ordered input x[n] must be reordered appropriately before the fast algorithm described by this structure can be implemented

To understand the input reordering scheme, represent the arguments of input samples x[n] and their sequentially ordered new representations $\Psi_1[m]$ in binary forms with the relations between the arguments m and n as follows:

m:  000  001  010  011  100  101  110  111
n:  000  100  010  110  001  101  011  111

Thus, if (b2b1b0) represents the index n of x[n], then the sample x[b2b1b0] appears at the location m = b0b1b2 as $\Psi_1[b_0b_1b_2]$ before the DFT computation is started

i.e., the location of $\Psi_1[m]$ is in bit-reversed order from that of x[n]

Alternative forms can be obtained by reordering the computations, such as input in normal order and output in bit-reversed order, and both input and output in normal order
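The bit-reversed input ordering, the single-multiplication butterfly, and in-place storage fit together as in the sketch below. It is an illustrative radix-2 decimation-in-time implementation, assuming a power-of-2 length and the convention $W_N = e^{-j2\pi/N}$; the names `bit_reverse` and `fft_dit` are not from the slides.

```python
import cmath


def bit_reverse(m, bits):
    """Reverse the lowest `bits` bits of the integer m."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (m & 1)
        m >>= 1
    return r


def fft_dit(x):
    """Radix-2 decimation-in-time FFT with in-place butterflies."""
    N = len(x)
    bits = N.bit_length() - 1
    if N < 1 or (1 << bits) != N:
        raise ValueError("length must be a power of 2")
    # Psi_1[m] = x[bit-reversed m]: input in bit-reversed order
    psi = [complex(x[bit_reverse(m, bits)]) for m in range(N)]
    size = 2                                   # DFT size built at this stage
    while size <= N:
        half = size // 2
        for start in range(0, N, size):
            for l in range(half):
                w = cmath.exp(-2j * cmath.pi * l / size)   # twiddle factor
                a = psi[start + l]
                b = w * psi[start + l + half]  # the only complex multiplication
                psi[start + l] = a + b         # results overwrite the inputs
                psi[start + l + half] = a - b
        size *= 2
    return psi                                 # X[k] in normal order


print([bit_reverse(m, 3) for m in range(8)])   # [0, 4, 2, 6, 1, 5, 3, 7]
print(fft_dit([1, 0, 0, 0]))                   # four samples equal to (1+0j)
```

Each pass of the outer loop corresponds to one stage of the flow-graph, so a length-8 input goes through three passes.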


The fast algorithm described assumes that the length of x[n] is a power of 2

If it is not, the length can be extended by zero-padding to make the length a power of 2

Even after zero-padding, the DFT computation based on the fast algorithm may be computationally more efficient than a direct DFT computation of the original shorter sequence

The fast DFT computation schemes described are called decimation-in-time (DIT) fast Fourier transform (FFT) algorithms, as the input x[n] is first decimated to form a set of subsequences before the DFT is computed


For $N = 2^{\mu}$, this decimation can be performed μ = log2 N times

The total number of

Complex multiplications: (N/2) log2 N
Complex additions: N log2 N

N        Complex multiplications      Complex multiplications
         in direct computation, N²    in FFT algorithm, (N/2) log2 N
4        16                           4
8        64                           12
16       256                          32
32       1,024                        80
64       4,096                        192
128      16,384                       448
256      65,536                       1,024
512      262,144                      2,304
1,024    1,048,576                    5,120


For example, the relation between x[n] and its even and odd parts, x0[n] and x1[n], generated by the first stage of the DIT algorithm is given by

$$x_0[n] = x[2n], \qquad x_1[n] = x[2n+1], \qquad 0 \le n \le \tfrac{N}{2}-1$$

Likewise, the relation between x[n] and the sequences x00[n], x01[n], x10[n], and x11[n], generated by the two-stage decomposition of the DIT algorithm is given by

$$x_{00}[n] = x[4n], \quad x_{01}[n] = x[4n+2], \quad x_{10}[n] = x[4n+1], \quad x_{11}[n] = x[4n+3], \qquad 0 \le n \le \tfrac{N}{4}-1$$

The subsequences x00[n], x01[n], x10[n], and x11[n] can be generated directly by a factor-of-4 decimation process, leading to a single-stage decomposition

[Figure: single-stage decomposition using factor-of-4 decimation]


Radix-R FFT Algorithm

Radix-R FFT algorithm: at each stage the decimation is by a factor of R

Depending on N, various combinations of decompositions of X[k] can be used to develop different types of DIT FFT algorithms

If the scheme uses a mixture of decimations by different factors, it is called a mixed radix FFT algorithm

For N which is a composite number expressible in the form of a product of integers:

$$N = r_1 \cdot r_2 \cdots r_{\nu}$$

the total number of complex multiplications (additions) in a DIT FFT algorithm based on a ν-stage decomposition is given by

$$\left(\sum_{i=1}^{\nu} r_i - \nu\right) N$$

Decimation-in-Frequency FFT Algorithm

Splitting the DFT sum into its first and second halves, and using $W_N^{(N/2)k} = (-1)^k$,

$$X[k] = \sum_{n=0}^{N/2-1}\left(x[n] + (-1)^{k}\, x\!\left[\tfrac{N}{2}+n\right]\right) W_N^{nk}$$

For k even

$$X[2\ell] = \sum_{n=0}^{N/2-1}\left(x[n] + x\!\left[\tfrac{N}{2}+n\right]\right) W_{N/2}^{n\ell}, \qquad 0 \le \ell \le \tfrac{N}{2}-1$$

For k odd

$$X[2\ell+1] = \sum_{n=0}^{N/2-1}\left(x[n] - x\!\left[\tfrac{N}{2}+n\right]\right) W_N^{n}\, W_{N/2}^{n\ell}, \qquad 0 \le \ell \le \tfrac{N}{2}-1$$

We can write

$$X[2\ell] = \sum_{n=0}^{N/2-1} x_0[n]\, W_{N/2}^{n\ell}, \qquad X[2\ell+1] = \sum_{n=0}^{N/2-1} x_1[n]\, W_{N/2}^{n\ell}$$

where

$$x_0[n] = x[n] + x\!\left[\tfrac{N}{2}+n\right], \qquad x_1[n] = \left(x[n] - x\!\left[\tfrac{N}{2}+n\right]\right) W_N^{n}, \qquad 0 \le n \le \tfrac{N}{2}-1$$

Thus X[2l] and X[2l+1] are the (N/2)-point DFTs of the length-(N/2) sequences x0[n] and x1[n]

[Figure: flow-graph of the first stage of the DFT]

The output DFT samples appear in a decimated form, with the even-indexed samples appearing as the output of one (N/2)-point DFT and the odd-indexed samples appearing as the output of the other (N/2)-point DFT

We next express the even- and odd-indexed samples of each one of the two (N/2)-point DFTs as a sum of two (N/4)-point DFTs

The process is continued until the smallest DFTs are 2-point DFTs

[Figure: flow-graph of the decimation-in-frequency (DIF) FFT computation scheme for N = 8]

Computational complexity of the radix-2 DIF FFT algorithm is the same as that of the DIT FFT algorithm

Various forms of the DIF FFT algorithm can similarly be developed

The DIT and DIF FFT algorithms described here are often referred to as the Cooley-Tukey FFT algorithms


Inverse DFT Computation

An FFT algorithm for computing the DFT samples can also be used to calculate efficiently the inverse DFT (IDFT)

Consider a length-N sequence x[n] with an N-point DFT X[k]

Recall

$$x[n] = \frac{1}{N}\sum_{k=0}^{N-1} X[k]\, W_N^{-nk}$$

Multiplying both sides by N and taking the complex conjugate we get

$$N\, x^{*}[n] = \sum_{k=0}^{N-1} X^{*}[k]\, W_N^{nk}$$

The right-hand side of the above is the N-point DFT of the sequence X*[k]

The desired IDFT x[n] is then obtained as

$$x[n] = \frac{1}{N}\left\{\sum_{k=0}^{N-1} X^{*}[k]\, W_N^{nk}\right\}^{*}$$

[Figure: inverse DFT computation using a forward DFT]


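Cooley-Tukey FFT Algorithms

Consider a length-N sequence x[n] with an N-point DFT X[k] where N = N1N2

Represent the indices n and k as

$$n = N_2 n_1 + n_2, \qquad k = k_1 + N_1 k_2, \qquad 0 \le n_1, k_1 \le N_1 - 1, \quad 0 \le n_2, k_2 \le N_2 - 1$$

Using these index mappings we can write

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}$$

as

$$X[k_1 + N_1 k_2] = \sum_{n_2=0}^{N_2-1}\sum_{n_1=0}^{N_1-1} x[N_2 n_1 + n_2]\; W_N^{N_2 n_1 k_1}\, W_N^{N_1 N_2 n_1 k_2}\, W_N^{n_2 k_1}\, W_N^{N_1 n_2 k_2}$$

Since $W_N^{N_2 n_1 k_1} = W_{N_1}^{n_1 k_1}$, $W_N^{N_1 n_2 k_2} = W_{N_2}^{n_2 k_2}$, and $W_N^{N_1 N_2 n_1 k_2} = 1$, we have

$$X[k_1 + N_1 k_2] = \sum_{n_2=0}^{N_2-1}\left[\left(\sum_{n_1=0}^{N_1-1} x[N_2 n_1 + n_2]\, W_{N_1}^{n_1 k_1}\right) W_N^{n_2 k_1}\right] W_{N_2}^{n_2 k_2}$$

where 0 ≤ k1 ≤ N1 − 1 and 0 ≤ k2 ≤ N2 − 1

The effect of the index mapping is to map the 1-D sequence x[n] into a 2-D sequence that can be represented as a 2-D array with n1 specifying the rows and n2 specifying the columns of the array

The inner parentheses of the last equation are seen to be the set of N1-point DFTs of the N2 columns:

$$G[k_1, n_2] = \sum_{n_1=0}^{N_1-1} x[N_2 n_1 + n_2]\, W_{N_1}^{n_1 k_1}$$

Note: The column DFTs can be done in place

Next, these column DFTs are multiplied in place by the twiddle factors $W_N^{n_2 k_1}$ yielding

$$H[k_1, n_2] = W_N^{n_2 k_1}\, G[k_1, n_2]$$

Finally, the outer sum is the set of N2-point DFTs of the rows of the array:

$$X[k_1 + N_1 k_2] = \sum_{n_2=0}^{N_2-1} H[k_1, n_2]\, W_{N_2}^{n_2 k_2}$$

The row DFTs, X[k1 + N1k2], can again be computed in place

The input x[n] is entered into an array according to the index map:

$$n = N_2 n_1 + n_2$$

Likewise, the output DFT samples X[k] need to be extracted from the array according to the index map:

$$k = k_1 + N_1 k_2$$

Example: Let N = 8. Choose N1 = 2 and N2 = 4, then

$$X[k_1 + 2k_2] = \sum_{n_2=0}^{3}\left[\left(\sum_{n_1=0}^{1} x[4n_1 + n_2]\, W_2^{n_1 k_1}\right) W_8^{n_2 k_1}\right] W_4^{n_2 k_2}$$

for 0 ≤ k1 ≤ 1 and 0 ≤ k2 ≤ 3

2-D array representation of the input is

          n2 = 0   n2 = 1   n2 = 2   n2 = 3
n1 = 0    x[0]     x[1]     x[2]     x[3]
n1 = 1    x[4]     x[5]     x[6]     x[7]

The column DFTs are 2-point DFTs given by

$$G[0, n_2] = x[n_2] + x[4 + n_2], \qquad G[1, n_2] = x[n_2] - x[4 + n_2], \qquad 0 \le n_2 \le 3$$

These DFTs require no multiplications

2-D array of column transforms is

          n2 = 0   n2 = 1   n2 = 2   n2 = 3
k1 = 0    G[0,0]   G[0,1]   G[0,2]   G[0,3]
k1 = 1    G[1,0]   G[1,1]   G[1,2]   G[1,3]

After multiplying by the twiddle factors $W_8^{n_2 k_1}$ the array becomes

          n2 = 0   n2 = 1          n2 = 2          n2 = 3
k1 = 0    G[0,0]   G[0,1]          G[0,2]          G[0,3]
k1 = 1    G[1,0]   W_8^1 G[1,1]    W_8^2 G[1,2]    W_8^3 G[1,3]

Note $W_8^{0} = 1$, so the first row and the first column are unchanged

Finally, the 4-point DFTs of the rows are computed:

$$X[k_1 + 2k_2] = \sum_{n_2=0}^{3} H[k_1, n_2]\, W_4^{n_2 k_2}$$

Output 2-D array is given by

          k2 = 0   k2 = 1   k2 = 2   k2 = 3
k1 = 0    X[0]     X[2]     X[4]     X[6]
k1 = 1    X[1]     X[3]     X[5]     X[7]

The process illustrated is precisely the first stage of the DIF FFT algorithm

By choosing N1 = 4 and N2 = 2, we get the first stage of the DIT FFT algorithm

Alternate index mappings are given by

$$n = n_1 + N_1 n_2, \qquad k = N_2 k_1 + k_2$$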
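The three steps (column DFTs, twiddle multiplication, row DFTs) can be sketched for a general factorization N = N1·N2. This is a minimal illustration that assumes the index maps n = N2·n1 + n2 and k = k1 + N1·k2 and uses a direct DFT for the short transforms; the function names are illustrative.

```python
import cmath


def dft(x):
    """Direct DFT used for the short column and row transforms."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def cooley_tukey_step(x, N1, N2):
    """One Cooley-Tukey decomposition of an N = N1*N2 point DFT."""
    N = N1 * N2
    if len(x) != N:
        raise ValueError("len(x) must equal N1 * N2")
    # Enter the input into an N1 x N2 array: arr[n1][n2] = x[N2*n1 + n2]
    arr = [[complex(x[N2 * n1 + n2]) for n2 in range(N2)] for n1 in range(N1)]
    for n2 in range(N2):
        # N1-point DFT of column n2, stored in place
        col = dft([arr[n1][n2] for n1 in range(N1)])
        for k1 in range(N1):
            # twiddle factor W_N^(n2*k1), applied in place
            arr[k1][n2] = col[k1] * cmath.exp(-2j * cmath.pi * n2 * k1 / N)
    X = [0j] * N
    for k1 in range(N1):
        row = dft(arr[k1])                 # N2-point DFT of row k1
        for k2 in range(N2):
            X[k1 + N1 * k2] = row[k2]      # extract with k = k1 + N1*k2
    return X


x = list(range(8))
X = cooley_tukey_step(x, 2, 4)
print(max(abs(a - b) for a, b in zip(X, dft(x))))  # close to zero (rounding only)
```

Calling the function with (N1, N2) = (2, 4) reproduces the N = 8 example, and applying the same step recursively to the short DFTs leads to the full FFT.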


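Prime Factor Algorithms

Twiddle factors can be eliminated by defining the index mappings as

$$n = \langle A n_1 + B n_2\rangle_N, \qquad k = \langle C k_1 + D k_2\rangle_N$$

To eliminate the twiddle factors we need to express

$$W_N^{nk} = W_{N_1}^{n_1 k_1}\, W_{N_2}^{n_2 k_2}$$

Now

$$W_N^{nk} = W_N^{(A n_1 + B n_2)(C k_1 + D k_2)} = W_N^{AC\, n_1 k_1}\, W_N^{AD\, n_1 k_2}\, W_N^{BC\, n_2 k_1}\, W_N^{BD\, n_2 k_2}$$

It follows from the above that if

$$\langle AC\rangle_N = N_2, \qquad \langle BD\rangle_N = N_1, \qquad \langle AD\rangle_N = \langle BC\rangle_N = 0$$

then

$$W_N^{nk} = W_{N_1}^{n_1 k_1}\, W_{N_2}^{n_2 k_2}$$

One set of coefficients that eliminates the twiddle factors is given by

$$A = N_2, \qquad B = N_1, \qquad C = N_2\,\langle N_2^{-1}\rangle_{N_1}, \qquad D = N_1\,\langle N_1^{-1}\rangle_{N_2}$$

Here $\langle N_1^{-1}\rangle_{N_2}$ denotes the multiplicative inverse of N1 reduced modulo N2

If $\langle N_1^{-1}\rangle_{N_2} = \alpha$, then $\langle N_1\alpha\rangle_{N_2} = 1$ or, in other words, $N_1\alpha = N_2\beta + 1$ where β is an integer

For example, if N1 = 4 and N2 = 3, then $\langle 3^{-1}\rangle_4 = 3$ and $\langle 3\cdot 3\rangle_4 = 1$

Likewise, if $\langle N_2^{-1}\rangle_{N_1} = \gamma$, then $N_2\gamma = N_1\delta + 1$ where δ is an integer

Now,

$$\langle AC\rangle_N = \langle N_2 (N_2\langle N_2^{-1}\rangle_{N_1})\rangle_N = \langle N_2 (N_1\delta + 1)\rangle_N = \langle N_2 N_1\delta + N_2\rangle_N = N_2$$

Similarly,

$$\langle BD\rangle_N = \langle N_1 (N_1\langle N_1^{-1}\rangle_{N_2})\rangle_N = \langle N_1 (N_2\beta + 1)\rangle_N = \langle N_1 N_2\beta + N_1\rangle_N = N_1$$

Next,

$$\langle AD\rangle_N = \langle N_2 (N_1\langle N_1^{-1}\rangle_{N_2})\rangle_N = \langle N\alpha\rangle_N = 0$$

Likewise,

$$\langle BC\rangle_N = \langle N_1 (N_2\langle N_2^{-1}\rangle_{N_1})\rangle_N = \langle N\gamma\rangle_N = 0$$

Hence,

$$W_N^{nk} = W_{N_1}^{n_1 k_1}\, W_{N_2}^{n_2 k_2}$$

Thus,

$$X[\langle C k_1 + D k_2\rangle_N] = \sum_{n_2=0}^{N_2-1} G[k_1, n_2]\, W_{N_2}^{n_2 k_2}$$

where

$$G[k_1, n_2] = \sum_{n_1=0}^{N_1-1} x[\langle A n_1 + B n_2\rangle_N]\, W_{N_1}^{n_1 k_1}, \qquad 0 \le n_2 \le N_2 - 1$$

and 0 ≤ k1 ≤ N1 − 1 and 0 ≤ k2 ≤ N2 − 1

Example: Let N = 12. Choose N1 = 4 and N2 = 3

Then, A = 3, B = 4, $C = 3\langle 3^{-1}\rangle_4 = 9$, and $D = 4\langle 4^{-1}\rangle_3 = 4$

The index mappings are

$$n = \langle 3n_1 + 4n_2\rangle_{12}, \qquad k = \langle 9k_1 + 4k_2\rangle_{12}$$

2-D array representation of input is

          n2 = 0   n2 = 1   n2 = 2
n1 = 0    x[0]     x[4]     x[8]
n1 = 1    x[3]     x[7]     x[11]
n1 = 2    x[6]     x[10]    x[2]
n1 = 3    x[9]     x[1]     x[5]

4-point transforms of the columns lead to

          n2 = 0   n2 = 1   n2 = 2
k1 = 0    G[0,0]   G[0,1]   G[0,2]
k1 = 1    G[1,0]   G[1,1]   G[1,2]
k1 = 2    G[2,0]   G[2,1]   G[2,2]
k1 = 3    G[3,0]   G[3,1]   G[3,2]

Final DFT array, obtained from the 3-point transforms of the rows, is

          k2 = 0   k2 = 1   k2 = 2
k1 = 0    X[0]     X[4]     X[8]
k1 = 1    X[9]     X[1]     X[5]
k1 = 2    X[6]     X[10]    X[2]
k1 = 3    X[3]     X[7]     X[11]

4-point DFTs require no multiplications, whereas the 3-point DFTs require 4 complex multiplications

Thus, the algorithm requires 16 complex multiplications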
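The index maps of the prime factor algorithm can be exercised with a short sketch. It assumes N1 and N2 are relatively prime and computes the modular inverses with Python's built-in three-argument `pow` (Python 3.8 or later); the function names are illustrative, and a direct DFT is used for the short transforms.

```python
import cmath
from math import gcd


def dft(x):
    """Direct DFT used for the short column and row transforms."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def prime_factor_dft(x, N1, N2):
    """N = N1*N2 point DFT without twiddle factors (N1, N2 coprime)."""
    N = N1 * N2
    if len(x) != N or gcd(N1, N2) != 1:
        raise ValueError("need len(x) == N1*N2 with N1, N2 relatively prime")
    A, B = N2, N1
    C = N2 * pow(N2, -1, N1)   # N2 times the inverse of N2 modulo N1
    D = N1 * pow(N1, -1, N2)   # N1 times the inverse of N1 modulo N2
    # Input map: arr[n1][n2] = x[<A*n1 + B*n2>_N]
    arr = [[complex(x[(A * n1 + B * n2) % N]) for n2 in range(N2)]
           for n1 in range(N1)]
    for n2 in range(N2):       # N1-point DFTs of the columns
        col = dft([arr[n1][n2] for n1 in range(N1)])
        for k1 in range(N1):
            arr[k1][n2] = col[k1]          # no twiddle multiplication needed
    X = [0j] * N
    for k1 in range(N1):       # N2-point DFTs of the rows
        row = dft(arr[k1])
        for k2 in range(N2):
            X[(C * k1 + D * k2) % N] = row[k2]   # output map
    return X


x = list(range(12))
X = prime_factor_dft(x, 4, 3)
print(max(abs(a - b) for a, b in zip(X, dft(x))))  # close to zero (rounding only)
```

For N1 = 4 and N2 = 3 the code uses A = 3, B = 4, C = 9, D = 4, the same values as in the N = 12 example.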

Chirp z-Transform Algorithm

Let x[n] be a length-N sequence with a Fourier transform $X(e^{j\omega})$

We consider evaluation of M samples of $X(e^{j\omega})$ that are equally spaced in angle on the unit circle at frequencies

$$\omega_k = \omega_o + k\,\Delta\omega, \qquad 0 \le k \le M-1$$

where the starting frequency $\omega_o$ and the frequency increment $\Delta\omega$ can be chosen arbitrarily

The problem is thus to evaluate

$$X(e^{j\omega_k}) = \sum_{n=0}^{N-1} x[n]\, e^{-j\omega_k n}$$

or, with W defined as $W = e^{-j\Delta\omega}$, to evaluate

$$X(e^{j\omega_k}) = \sum_{n=0}^{N-1} x[n]\, e^{-j\omega_o n}\, W^{nk}$$

Using the identity $nk = \tfrac{1}{2}[n^2 + k^2 - (k-n)^2]$ we can write

$$X(e^{j\omega_k}) = \sum_{n=0}^{N-1} x[n]\, e^{-j\omega_o n}\, W^{n^2/2}\, W^{k^2/2}\, W^{-(k-n)^2/2}$$

Letting $g[n] = x[n]\, e^{-j\omega_o n}\, W^{n^2/2}$ we arrive at

$$X(e^{j\omega_k}) = W^{k^2/2}\sum_{n=0}^{N-1} g[n]\, W^{-(k-n)^2/2}, \qquad 0 \le k \le M-1$$

Interchanging k and n we get

$$X(e^{j\omega_n}) = W^{n^2/2}\sum_{k=0}^{N-1} g[k]\, W^{-(n-k)^2/2}, \qquad 0 \le n \le M-1$$

Thus, $X(e^{j\omega_n})$ corresponds to the convolution of the sequence g[n] with the sequence $W^{-n^2/2}$ followed by multiplication by the sequence $W^{n^2/2}$

[Figure: convolution of g[n] with $W^{-n^2/2}$ followed by multiplication by $W^{n^2/2}$]

The sequence $W^{-n^2/2}$ can be thought of as a complex exponential sequence with linearly increasing frequency

Such signals, in radar systems, are called chirp signals

For the evaluation of

$$X(e^{j\omega_n}) = W^{n^2/2}\sum_{k=0}^{N-1} g[k]\, W^{-(n-k)^2/2}$$

the output of the system depicted earlier needs to be computed over a finite interval

Since g[n] is a length-N sequence, only a finite portion of the infinite length sequence $W^{-n^2/2}$ is used in obtaining the convolution sum over the interval 0 ≤ n ≤ M − 1

[Figure: typical signals]

The portion of the sequence $W^{-n^2/2}$ used in obtaining the convolution sum is from the interval −N + 1 ≤ n ≤ M − 1

Let

$$h[n] = \begin{cases} W^{-n^2/2}, & -N+1 \le n \le M-1\\ 0, & \text{otherwise} \end{cases}$$

[Figure: the finite-length sequence h[n]]

It can be seen that

$$g[n] \circledast W^{-n^2/2} = g[n] \circledast h[n], \qquad 0 \le n \le M-1$$

Hence, the computation of the frequency samples $X(e^{j\omega_n})$ can be carried out using an FIR filter

[Figure: chirp z-transform computation using an FIR filter with impulse response h[n]]

where $y[n] = X(e^{j\omega_n})$, 0 ≤ n ≤ M − 1

Advantages

N = M is not required as in FFT algorithms

Neither N nor M has to be a composite number

Parameters $\omega_o$ and $\Delta\omega$ are arbitrary

Convolution with h[n] can be implemented using FFT techniques
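The chirp formulation can be checked against a direct evaluation with a small sketch. It assumes $W = e^{-j\Delta\omega}$ and evaluates the finite convolution with h[n] by a plain double loop rather than by FFT-based fast convolution; the function names `chirp_samples` and `direct_samples` and the test values are illustrative.

```python
import cmath


def chirp_samples(x, M, w0, dw):
    """Evaluate X(e^{jw}) at w = w0 + n*dw, n = 0..M-1, via the
    modulate / convolve-with-chirp / modulate structure."""
    N = len(x)

    def W(e):
        return cmath.exp(-1j * dw * e)          # W**e with W = exp(-j*dw)

    # Pre-multiplication: g[n] = x[n] e^{-j w0 n} W^{n^2/2}
    g = [x[n] * cmath.exp(-1j * w0 * n) * W(n * n / 2) for n in range(N)]
    # FIR impulse response: h[n] = W^{-n^2/2} for -N+1 <= n <= M-1
    h = {n: W(-n * n / 2) for n in range(-N + 1, M)}
    # Convolution over 0 <= n <= M-1, then post-multiplication by W^{n^2/2}
    return [W(n * n / 2) * sum(g[k] * h[n - k] for k in range(N))
            for n in range(M)]


def direct_samples(x, M, w0, dw):
    """Direct evaluation of the same M frequency samples."""
    return [sum(x[k] * cmath.exp(-1j * (w0 + n * dw) * k) for k in range(len(x)))
            for n in range(M)]


x = [1, 2, 3, 4, 5]
a = chirp_samples(x, 7, 0.3, 0.11)
b = direct_samples(x, 7, 0.3, 0.11)
print(max(abs(u - v) for u, v in zip(a, b)))    # close to zero (rounding only)
```

In a practical implementation the inner sum would be replaced by a fast convolution of g[n] with h[n], which is where the efficiency of the method comes from.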
