VLSI Digital Signal Processing Systems Pipelined and Parallel Recursive and Adaptive Filters Lan-Da Van (范倫達), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall, 2010 [email protected] http://www.cs.nctu.tw/~ldvan/

  • VLSI Digital Signal Processing Systems

    Pipelined and Parallel Recursive and Adaptive Filters

    Lan-Da Van (范倫達), Ph. D.

    Department of Computer Science

    National Chiao Tung University

    Taiwan, R.O.C.

    Fall, 2010

    [email protected]

    http://www.cs.nctu.tw/~ldvan/

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-2

    Outline

    Introduction

    Pipeline Interleaving in Digital Filter

    Parallel Processing for IIR Filter

    Combined Pipelining and Parallel Processing for IIR Filters

    Low-Power IIR Filter Design Using Pipelining and Parallel Processing

    Pipelined Adaptive Digital Filters

    Conclusions

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-3

    Introduction

    Two types of digital filters for time-invariant systems:

    FIR

    IIR

    An IIR filter is often preferred because it can be implemented with a much lower order.

    Pipelining techniques:

    look-ahead computation and incremental block processing techniques

    relaxed look-ahead transformations for pipelining of LMS and lattice adaptive filters

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-5

    Pipeline interleaving in Digital Filters

    Inefficient single/multi-channel interleaving

    Efficient single-channel interleaving

    Efficient multi-channel interleaving

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-6

    Inefficient Single/Multi-channel Interleaving

    Consider

    y(n+1)=ay(n)+bu(n)

    Iteration period = $T_m + T_a$

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-7

    Inefficient Single/Multi-channel Interleaving(cont.)

    M-stage pipelined version: insert (M-1) additional latches

    The clock period decreases by a factor of M (e.g., M = 5)

    For a single time series, the array is useful only 20% of the time (1/M for M = 5)

    If 5 independent time series are available ==> fully utilized

    Consequences for a single time series:

    The sample rate is M times slower than the clock rate

    The processing elements are utilized inefficiently
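The multi-channel case can be illustrated with a small simulation. This is only a sketch under assumptions: the function names, the coefficient values, and the test data are made up for illustration, and the loop with M latches is modeled as a list of M state values. It checks that M independent series, time-multiplexed through one loop, each see the original recursion y(n+1) = a y(n) + b u(n).

```python
def first_order(a, b, u, y0=0.0):
    """Reference: run y(n+1) = a*y(n) + b*u(n) for one time series."""
    y = [y0]
    for sample in u:
        y.append(a * y[-1] + b * sample)
    return y


def interleaved(a, b, channels):
    """Time-multiplex M equal-length series through one loop holding M delays."""
    m = len(channels)
    length = len(channels[0])
    # Input stream: sample 0 of every channel, then sample 1 of every channel, ...
    stream = [channels[n % m][n // m] for n in range(m * length)]
    state = [0.0] * m  # the M latches in the loop, all reset to zero
    for n, sample in enumerate(stream):
        # s(n+M) = a*s(n) + b*stream(n): one multiply-add per clock
        state.append(a * state[n] + b * sample)
    # De-interleave: channel c owns positions c, c+M, c+2M, ...
    return [state[c::m] for c in range(m)]


# Hypothetical data: 5 channels of 6 samples each (M = 5)
channels = [[float(c + n) for n in range(6)] for c in range(5)]
outputs = interleaved(0.5, 2.0, channels)
matches = all(outputs[c] == first_order(0.5, 2.0, channels[c]) for c in range(5))
print("every channel matches its own recursion:", matches)
```

With only one channel available, the same loop would accept a new sample every M clocks, which is the 1/M utilization noted on the slide.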

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-8

    Inefficient Single/Multi-channel Interleaving(cont.)

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-9

    Efficient Single-Channel Interleaving

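    Using the look-ahead transformation.

    Consider:

    y(n+2)=a[ay(n)+bu(n)]+bu(n+1)

    Iteration bound = $2(T_m+T_a)/2 = T_m+T_a$ (two multiply-add operations in a loop with two delays, so no improvement yet)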

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-10

    Efficient Single-channel Interleaving(cont.)

    An equivalent recursion:

    $y(n+2) = a^2 y(n) + ab\,u(n) + b\,u(n+1)$

    Iteration period bound: $(T_m+T_a)/2$

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-11

    Efficient Single-channel Interleaving(cont.)

    (M-1) steps of look-ahead

    Iteration bound: $(T_m+T_a)/M$

    $$y(n+M) = a^M y(n) + \sum_{i=0}^{M-1} a^i b\,u(n+M-1-i)$$
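A quick numerical check of the M-step look-ahead recursion follows. It is a minimal sketch: the function names, the coefficients, and the input samples are illustrative assumptions. It compares the look-ahead expression against M applications of the original recursion y(n+1) = a y(n) + b u(n).

```python
def sequential(a, b, u, y0):
    """Run y(n+1) = a*y(n) + b*u(n) one step at a time."""
    y = [y0]
    for sample in u:
        y.append(a * y[-1] + b * sample)
    return y


def lookahead(a, b, y_n, u_window):
    """y(n+M) = a^M y(n) + sum_{i=0}^{M-1} a^i b u(n+M-1-i).

    u_window holds [u(n), ..., u(n+M-1)].
    """
    m = len(u_window)
    return a**m * y_n + sum(a**i * b * u_window[m - 1 - i] for i in range(m))


a, b, M = 0.9, 0.5, 5                      # illustrative values
u = [1.0, -2.0, 0.5, 3.0, 1.5, -1.0, 2.0, 0.25]
y = sequential(a, b, u, y0=0.3)
max_error = max(
    abs(lookahead(a, b, y[n], u[n:n + M]) - y[n + M])
    for n in range(len(u) - M + 1)
)
print("largest mismatch:", max_error)
```

The loop now contains a single multiplication by a^M and M delays; the sum over the inputs is feed-forward and can be pipelined freely.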

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-12

    Efficient Single-channel Interleaving(cont.)

    In a causal system, to produce the same output as before:

    y(1) = a y(0) + b u(0)

    If M = 5:

    $y(1) = a^5 y(-4) + b\,u(0)$

    ($u(-4), u(-3), \ldots, u(-1) = 0$ for causality)

    $y(-4) = a^{-4} y(0)$

    In general: $y(-i) = a^{-i} y(0)$, $i = 1, 2, \ldots, (M-1)$

    The iteration bound can be achieved by retiming or cutset transformation

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-13

    Efficient Multi-channel Interleaving

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-15

    Example 10.5.1 (1/3)

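    Consider the following transfer function:

    $$H(z) = \frac{z^{-1}}{1 - a z^{-1}}, \qquad y(n+1) = a\,y(n) + u(n)$$

    Consider a 4-parallel architecture (L = 4).

    From $y(n+M) = a^M y(n) + \sum_{i=0}^{M-1} a^i b\,u(n+M-1-i)$ with $M = 4$ and $b = 1$:

    $$y(n+4) = a^4 y(n) + a^3 u(n) + a^2 u(n+1) + a\,u(n+2) + u(n+3)$$

    Substituting $n = 4k$:

    $$y(4k+4) = a^4 y(4k) + a^3 u(4k) + a^2 u(4k+1) + a\,u(4k+2) + u(4k+3)$$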

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-16

    Example 10.5.1 (2/3)

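    pole: z = a ==> z = $a^4$

    since |a| < 1 for a stable filter, $|a^4| < |a|$: the pole of the block filter lies closer to the origin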

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-17

    [Figure labels: inputs u(4k), u(4k+1), u(4k+2); Next: u(4k+7), u(4k+8), u(4k+9), u(4k+10)]

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-18

    Parallel Processing for IIR Filters

    Need $L^2$ multiply-add operations: the hardware cost is high

    Use incremental block processing to simplify the hardware:

    use y(4k) to compute y(4k+1)

    use y(4k+1) to compute y(4k+2)

    use y(4k+2) to compute y(4k+3)

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-19

    Example 10.5.2 (1/2)

    incremental block processing

    since y(n+1)=ay(n)+u(n)

    y(4k+1)=ay(4k)+u(4k)

    y(4k+2)=ay(4k+1)+u(4k+1)

    y(4k+3)=ay(4k+2)+u(4k+2)
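The incremental scheme can be sketched in a few lines. This is an illustration under assumptions: the function names and test data are made up, and the block size is fixed at L = 4 as in the example. The loop update y(4k+4) uses the 4-step look-ahead form, while y(4k+1), y(4k+2), y(4k+3) are computed incrementally from y(4k).

```python
def sequential(a, u, y0=0.0):
    """Reference: y(n+1) = a*y(n) + u(n), one sample at a time."""
    y = [y0]
    for sample in u:
        y.append(a * y[-1] + sample)
    return y


def incremental_block(a, u, y0=0.0):
    """Process 4 samples per iteration (len(u) must be a multiple of 4)."""
    y = [y0]
    for k in range(len(u) // 4):
        u0, u1, u2, u3 = u[4 * k:4 * k + 4]
        y4k = y[4 * k]
        # Incremental outputs: each reuses the previous output of the block
        y1 = a * y4k + u0
        y2 = a * y1 + u1
        y3 = a * y2 + u2
        # Loop update by 4-step look-ahead: depends only on y(4k) and inputs
        y4 = a**4 * y4k + a**3 * u0 + a**2 * u1 + a * u2 + u3
        y.extend([y1, y2, y3, y4])
    return y


a = 0.8                                        # illustrative coefficient
u = [1.0, 0.5, -1.0, 2.0, 0.0, 3.0, -2.5, 1.0, 0.75, -0.5, 1.25, 2.0]
ref = sequential(a, u)
blk = incremental_block(a, u)
max_error = max(abs(r - b) for r, b in zip(ref, blk))
print("largest mismatch:", max_error)
```

Only the y(4k+4) update sits in the recursive loop; the three incremental outputs cost one multiply-add each instead of a full look-ahead expression.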

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-20

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-21

    Example 10.5.3 (1/3)

    System Transfer Function:

    L=3

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-22

    Example 10.5.3 (2/3)

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-23

    Example 10.5.3 (3/3)

    [Figure labels: outputs y(3k+3), y(3k+4)]

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-24

    Parallel Processing for IIR Filters

    For an N-th order IIR filter, the L-level incremental parallel processing architecture is obtained in two steps:

    compute the first N output samples y(Lk), y(Lk+1), …, y(Lk+N-1) independently, using loop update equations obtained by the clustered look-ahead technique

    compute the remaining L-N samples y(Lk+N), …, y(Lk+L-1) incrementally from the previous N output samples

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-26

    Example 10.6.1 (1/3)

    Consider: M = 4, L = 3

    $$H(z) = \frac{1}{1 - a z^{-1}}$$

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-27

    Example 10.6.1 (2/3)

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-28

    f1(3k+9) f2(3k+6)

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-30

    Example 10.7.1 (1/2)

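    Consider: $V_0 = 5\,\mathrm{V}$, $V_t = 1\,\mathrm{V}$

    $$H(z) = \frac{0.001836\,(1+z^{-1})^4}{(1 - 1.5548z^{-1} + 0.6493z^{-2})(1 - 1.4996z^{-1} + 0.8482z^{-2})}$$

    By the scattered look-ahead technique with power-of-2 decomposition:

    $$H(z) = \frac{N(z)}{D(z)}$$

    $$N(z) = 0.001836\,(1+z^{-1})^4 (1 + 1.5548z^{-1} + 0.6493z^{-2})(1 + 1.4996z^{-1} + 0.8482z^{-2})(1 + 1.1188z^{-2} + 0.4216z^{-4})(1 + 0.5524z^{-2} + 0.7194z^{-4})$$

    $$D(z) = (1 - 0.4085z^{-4} + 0.1777z^{-8})(1 + 1.1337z^{-4} + 0.5175z^{-8})$$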
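The power-of-2 decomposition can be checked numerically. The sketch below is an illustration, not the slide's own procedure: the helper names are assumptions, and the two second-order denominator sections are taken as 1 - 1.5548 z^-1 + 0.6493 z^-2 and 1 - 1.4996 z^-1 + 0.8482 z^-2 (signs assumed for a stable low-pass filter). Each step multiplies a section 1 + c1 x + c2 x^2 by its mirror 1 - c1 x + c2 x^2, which cancels the odd powers and doubles the delay spacing.

```python
import math


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def scattered_lookahead_pow2(section, pipeline_level):
    """Apply log2(M) mirror-multiplication steps to a section [1, c1, c2].

    Returns (extra numerator factors, final denominator section). Factor
    number s (0-based) is a polynomial in z^-(2**s); the final section is a
    polynomial in z^-M.
    """
    steps = int(round(math.log2(pipeline_level)))
    factors = []
    current = list(section)
    for _ in range(steps):
        mirror = [current[0], -current[1], current[2]]
        product = poly_mul(current, mirror)   # odd-power terms cancel
        factors.append(mirror)
        current = product[0::2]               # keep the even powers only
    return factors, current


num1, den1 = scattered_lookahead_pow2([1.0, -1.5548, 0.6493], 4)
num2, den2 = scattered_lookahead_pow2([1.0, -1.4996, 0.8482], 4)
print("extra numerator factors:", num1, num2)
print("D(z) sections in powers of z^-4:", den1, den2)
```

The resulting denominator sections depend only on z^-4 and z^-8, so every recursive loop contains four delays that can be used for 4-level pipelining.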

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-31

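    Example 10.7.1 (2/2)

    Sequential filter:

    $$T_{seq} = \frac{C_{charge} V_0}{k (V_0 - V_t)^2} = \frac{5\,C_{charge}}{k (5 - 1)^2}$$

    $$P_{seq} = C_{total,seq} V_0^2 f, \quad \text{where } C_{total,seq} = 5 C_m,\ V_0 = 5\,\mathrm{V}$$

    4-level pipeline: shorter critical path, smaller capacitance, can use a lower supply voltage $\beta V_0$, where

    $$T_{pip} = \frac{1}{4} \cdot \frac{C_{charge}\, \beta V_0}{k (\beta V_0 - V_t)^2} = \frac{1}{4} \cdot \frac{5\beta\, C_{charge}}{k (5\beta - 1)^2}$$

    $$T_{pip} = T_{seq} \;\Rightarrow\; \beta = 0.476, \quad V_{pip} = \beta V_0 = 2.38\,\mathrm{V}$$

    $$P_{pip} = C_{total,pip} V_{pip}^2 f, \quad \text{where } C_{total,pip} = 13 C_m,\ V_{pip} = 2.38\,\mathrm{V}$$

    $$\text{Ratio} = \frac{P_{pip}}{P_{seq}} = \frac{13 \times 2.38^2}{5 \times 5^2} = 0.5891$$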

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-32

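    Example 10.7.2 (1/2)

    Consider L = 3

    $$y(n) = \frac{5}{4} y(n-1) - \frac{3}{8} y(n-2) + u(n) + 2u(n-1) + u(n-2)$$

    $V_0 = 5\,\mathrm{V}$, $V_t = 0.6\,\mathrm{V}$

    Propagation delay:

    $$T_{seq} = \frac{C_M V_0}{k (V_0 - V_t)^2}, \qquad T_{par} = \frac{C_M\, \beta V_0}{k (\beta V_0 - V_t)^2}$$

    Since $T_{par} = 3 T_{seq}$: $\beta = 0.4673$, $\beta V_0 = 2.3365\,\mathrm{V}$

    $$P_{seq} = 3 C_M V_0^2 f_s$$

    $$P_{par} = 12 C_M (\beta V_0)^2 \frac{f_s}{3} = 4 \beta^2 C_M V_0^2 f_s$$

    $$\text{Ratio} = \frac{P_{par}}{P_{seq}} = \frac{4\beta^2}{3} = 29.116\%$$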
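The supply-voltage scaling in Examples 10.7.1 and 10.7.2 can be reproduced with a short calculation. This is a sketch under assumptions: it uses the delay model T proportional to C V / (k (V - V_t)^2) and the power model P = C V^2 f; the function names are hypothetical; and the capacitance ratios 13/5 and 12/3 are taken from the numbers that appear in the two examples. Here `speedup` is the factor by which pipelining or parallelism lets the critical path slow down.

```python
import math


def scaled_beta(v0, vt, speedup):
    """Solve speedup * V0/(V0-Vt)^2 = beta*V0/(beta*V0 - Vt)^2 for beta.

    Rearranged into k*V0^2*beta^2 - (2*k*V0*Vt + V0)*beta + k*Vt^2 = 0 with
    k = speedup * V0 / (V0 - Vt)^2. The larger root keeps beta*V0 above Vt.
    """
    k = speedup * v0 / (v0 - vt) ** 2
    qa = k * v0 * v0
    qb = -(2.0 * k * v0 * vt + v0)
    qc = k * vt * vt
    disc = qb * qb - 4.0 * qa * qc
    return (-qb + math.sqrt(disc)) / (2.0 * qa)


def power_ratio(cap_ratio, beta, rate_ratio=1.0):
    """P_new / P_seq = (C_new/C_seq) * beta^2 * (f_new/f_seq)."""
    return cap_ratio * beta ** 2 * rate_ratio


# Example 10.7.1: 4-level pipelining, V0 = 5 V, Vt = 1 V, capacitance 13 vs 5
beta_pip = scaled_beta(5.0, 1.0, speedup=4)
ratio_pip = power_ratio(13.0 / 5.0, beta_pip)

# Example 10.7.2: 3-parallel, V0 = 5 V, Vt = 0.6 V, capacitance 12 vs 3,
# each parallel unit clocked at one third of the sample rate
beta_par = scaled_beta(5.0, 0.6, speedup=3)
ratio_par = power_ratio(12.0 / 3.0, beta_par, rate_ratio=1.0 / 3.0)

print(f"pipelined: beta = {beta_pip:.4f}, power ratio = {ratio_pip:.4f}")
print(f"parallel:  beta = {beta_par:.4f}, power ratio = {ratio_par:.4f}")
```

The quadratic has a second, smaller root that would put the scaled supply below the threshold voltage, so the sketch keeps only the larger one.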

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-33

    Example 10.7.2 (2/2)

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-35

    Introduction to an Adaptive Algorithm

    Adaptive filter: a filter with adaptive coefficients

    general filter block

    coefficient update block

    Widely used in communication, DSP, and control systems

    Deterministic gradient / least-squares algorithms

    Steepest descent algorithm

    RLS algorithm

    Stochastic gradient algorithms

    LMS algorithm, DLMS algorithm

    Block LMS algorithm

    Gradient lattice algorithm

    Adaptive filters are difficult to pipeline due to the presence of long feedback loops

    The relaxed look-ahead transformation is used to pipeline the adaptive filter with little or no increase in hardware, at the expense of marginal degradation in the adaptive behavior

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-36

    Adaptive Applications

    Channel equalizer

    System identification

    Image enhancement

    Echo canceller

    Noise cancellation

    Predictor

    Line enhancement

    Beamformer

    MIMO-OFDM: V-BLAST

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-37

    Notation

    1. Input signal: X(n)

    2. Desired output: d(n)

    3. Weight vector: W(n)

    4. Adaptation factor: μ

    5. Error: e(n)

    6. Misadjustment: $M_{adj}$

    7. Tap number: N

    8. Autocorrelation matrix: R

    9. Eigenvalue: λ

    10. Diagonal matrix: Λ

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-38

    Steepest Descent Algorithm

    $$y(n) = W^T(n)\, X(n)$$

    where $X(n) = [\,x(n)\ \ x(n-1)\ \ \ldots\ \ x(n-N+1)\,]^T$

    $W(n) = [\,w_0(n)\ \ w_1(n)\ \ \ldots\ \ w_{N-1}(n)\,]^T$

    Note:

    SSS (strict-sense stationary): a stochastic process x(t) is called SSS if its statistical properties are invariant to a shift of the origin.

    WSS (wide-sense stationary):

    $$E\{x(t)\} = \eta \ \text{(constant)}, \qquad E\{x(t+\tau)\, x^*(t)\} = R(\tau)$$

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-39

    Steepest Descent Algorithm (1/8)

    The error at the n-th time is

    $$e(n) = d(n) - y(n) = d(n) - W^T(n)\, X(n) = d(n) - X^T(n)\, W(n)$$

    $$e^2(n) = d^2(n) - 2\, d(n)\, X^T(n)\, W(n) + W^T(n)\, X(n)\, X^T(n)\, W(n)$$

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-40

    Steepest Descent Algorithm (2/8)

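    $$\begin{aligned} J(n) &= E\{e^2(n)\} \\ &= E\{d^2(n)\} - 2\, E\{d(n)\, X^T(n)\}\, W(n) + W^T(n)\, E\{X(n)\, X^T(n)\}\, W(n) \\ &= E\{d^2(n)\} - 2 P^T W(n) + W^T(n)\, R\, W(n) \end{aligned}$$

    where

    $$P = E\{d(n)\, X(n)\} = E\left\{ \begin{bmatrix} d(n)\,x(n) \\ d(n)\,x(n-1) \\ \vdots \\ d(n)\,x(n-N+1) \end{bmatrix} \right\}$$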

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-41

    Steepest Descent Algorithm (3/8)

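    $$R = E\{X(n)\, X^T(n)\} = \begin{bmatrix} r(0) & r(1) & \cdots & r(L-1) \\ r^*(1) & r(0) & \cdots & r(L-2) \\ \vdots & \vdots & \ddots & \vdots \\ r^*(L-1) & r^*(L-2) & \cdots & r(0) \end{bmatrix}$$

    where $r(\tau) = E\{x(n)\, x^*(n-\tau)\}$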

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-42

    Steepest Descent Algorithm (4/8)

    Take the "gradient":

    $$\nabla J = -2P + 2RW$$

    Let $\nabla J = 0$:

    $$W_{opt} = R^{-1} P$$

    The Wiener-Hopf equation (1949), i.e., the Wiener filter, is obtained.

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-43

    Steepest Descent Algorithm (5/8)

    $$J_{min} = E\{d^2(n)\} - P^T W_{opt}$$

    $$J = J_{min} + E\{(W(n) - W_{opt})^T\, R\, (W(n) - W_{opt})\}$$

    $$C(n) = W(n) - W_{opt}$$

    $$J = J_{min} + E\{C^T(n)\, R\, C(n)\}$$

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-44

    Steepest Descent Algorithm (6/8)

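    Assume: R is symmetric and positive definite for the input signal

    $$R = Q \Lambda Q^{-1} = Q \Lambda Q^T$$

    where $\Lambda = \mathrm{diag}[\,\lambda_0\ \ \lambda_1\ \ \ldots\ \ \lambda_{L-1}\,]$, and $Q$ and $Q^{-1}$ are the orthogonal matrices that diagonalize R

    $$J(n) = J_{min} + E\{C^T(n)\, Q \Lambda Q^{-1}\, C(n)\} = J_{min} + E\{V^T \Lambda V\}$$

    where $V(n) = Q^{-1} C(n)$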

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-45

    Steepest Descent Algorithm (7/8)

    $$W(n+1) = W(n) + \frac{1}{2}\mu\,(-\nabla J(n)) = W(n) + \mu\,(P - R\,W(n))$$

    Each change in the weight vector is proportional to the negative of the gradient vector
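A small numerical sketch of the steepest descent update W(n+1) = W(n) + μ(P - R W(n)) follows. The 2x2 matrix R, the vector P, and the step size are made-up values chosen for illustration; R has eigenvalues 0.5 and 1.5, and the Wiener solution R^{-1} P for these numbers is [0.8, -0.2].

```python
def steepest_descent(R, P, mu, steps):
    """Iterate W(n+1) = W(n) + mu * (P - R W(n)) from W(0) = 0."""
    size = len(P)
    w = [0.0] * size
    for _ in range(steps):
        # Matrix-vector product R W(n)
        rw = [sum(R[i][j] * w[j] for j in range(size)) for i in range(size)]
        # Move along the negative gradient: -grad J / 2 = P - R W(n)
        w = [w[i] + mu * (P[i] - rw[i]) for i in range(size)]
    return w


R = [[1.0, 0.5], [0.5, 1.0]]     # hypothetical autocorrelation matrix
P = [0.7, 0.2]                   # hypothetical cross-correlation vector
w_opt = [0.8, -0.2]              # R^{-1} P for the values above
w = steepest_descent(R, P, mu=0.5, steps=200)
print("weights after 200 iterations:", w)
```

With μ = 0.5 the factors 1 - μλ are 0.75 and 0.25, so each mode of the weight error shrinks geometrically without oscillation.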

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-46

    Steepest Descent Algorithm (8/8)

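    $$W(n+1) = W(n) - \frac{1}{2}\mu \nabla J(n), \qquad W_{opt} = R^{-1} P, \qquad C(n) = W(n) - W_{opt}$$

    $$V(n+1) = V(n) - \mu \Lambda V(n)$$

    $$V(n+1) = (I - \mu \Lambda)\, V(n)$$

    $$V(n) = (I - \mu \Lambda)^n\, V(0)$$

    Convergence condition:

    $$0 < (1 - \mu \lambda_k) < 1 \ \text{for all } k \;\Rightarrow\; 0 < \mu < \frac{1}{\lambda_{max}}$$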

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-47

    LMS Algorithm

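    An efficient implementation in software of steepest descent using measured or estimated gradients

    $$W(n+1) = W(n) + \frac{1}{2}\mu\,(-\hat{\nabla} J(n))$$

    The gradient estimate is the gradient of the square of a single error sample:

    $$\hat{\nabla} J(n) = -2\, e(n)\, X(n)$$

    $$W(n+1) = W(n) + \mu\, e(n)\, X(n)$$

    [Figure labels: w0', w1, w0]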

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-48

    Summary of LMS Adaptive Algorithm (1960)

    $$y(n) = \mathbf{w}^T(n)\, \mathbf{x}(n)$$

    $$e(n) = d(n) - y(n)$$

    $$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, e(n)\, \mathbf{x}(n)$$
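The three LMS equations map directly onto a short loop. The sketch below is a system-identification demo under assumptions: the unknown system `true_w`, the noise-free uniform random input, the step size, and the function name are all made up for illustration.

```python
import random


def lms(x, d, num_taps, mu):
    """LMS: y = w^T x, e = d - y, w <- w + mu * e * x."""
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Tap-delay-line vector [x(n), x(n-1), ..., x(n-N+1)], zeros before start
        x_vec = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, x_vec))
        e = d[n] - y
        w = [wk + mu * e * xk for wk, xk in zip(w, x_vec)]
        errors.append(e)
    return w, errors


random.seed(1)
true_w = [0.5, -0.3, 0.1]                       # hypothetical unknown FIR system
x = [random.uniform(-1.0, 1.0) for _ in range(3000)]
d = [
    sum(true_w[k] * x[n - k] for k in range(len(true_w)) if n - k >= 0)
    for n in range(len(x))
]
w_final, errors = lms(x, d, num_taps=3, mu=0.1)
print("estimated weights:", w_final)
```

Because the desired signal here is noise-free and the filter length matches the unknown system, the weights settle on `true_w`; with measurement noise they would fluctuate around it.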

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-49

    Block Diagram of an Adaptive FIR Filter Driven by the LMS Algorithm

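    [Figure: tapped delay line of $z^{-1}$ elements with input $x(n)$ and delayed samples $x(n-1), x(n-2), \ldots, x(n-N+1)$; tap weights $w_0(n), w_1(n), w_2(n), \ldots, w_{N-1}(n)$; output $y(n)$, desired response $d(n)$, error $e(n)$]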

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-50

    Adaptive Digital Filter Structure

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-51

    Relaxed Look-Ahead (1/6)

    Consider:

    $$y(n+1) = a(n)\, y(n) + u(n)$$

    $$y(n+M) = \left[ \prod_{i=0}^{M-1} a(n+M-1-i) \right] y(n) + \sum_{i=1}^{M-1} \left[ \prod_{j=0}^{i-1} a(n+M-1-j) \right] u(n+M-1-i) + u(n+M-1)$$

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-52

    Relaxed Look-Ahead (2/6)

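    Product Relaxation

    If a(n) is close to unity, with $\epsilon(n) = 1 - a(n)$:

    $$\prod_{i=0}^{M-1} a(n+i) \approx a^M(n+M-1) = (1 - \epsilon(n+M-1))^M$$

    $$\approx 1 - M \epsilon(n+M-1) = 1 - M(1 - a(n+M-1))$$

    If u(n+i) is close to zero:

    $$\left[ \prod_{j=0}^{i-1} a(n+M-1-j) \right] u(n+M-1-i) \approx u(n+M-1-i)$$

    $$y(n+M) \approx (1 - M(1 - a(n+M-1)))\, y(n) + \sum_{i=0}^{M-1} u(n+M-1-i)$$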

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-53

    Relaxed Look-Ahead (3/6)

    Another approximation:

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-54

    Relaxed Look-Ahead (4/6)

    Sum relaxation

    If u(n) varies slowly over M cycles, then

    $$\sum_{i=0}^{M-1} u(n+M-1-i) \approx M u(n)$$

    If u(n) is close to zero, then M u(n) can be approximated by u(n)
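The effect of the relaxations can be gauged numerically. The sketch below is an illustration under assumptions: the function names and the sample values are made up, and it combines the product relaxation 1 - M(1 - a(n+M-1)) with the sum relaxation M u(n), then compares the result with the exact M-step value of y(n+1) = a(n) y(n) + u(n).

```python
def exact_lookahead(a_seq, u_seq, y_n):
    """Exact y(n+M): apply y <- a(k)*y + u(k) for k = n, ..., n+M-1."""
    y = y_n
    for a_k, u_k in zip(a_seq, u_seq):
        y = a_k * y + u_k
    return y


def relaxed_lookahead(a_seq, u_seq, y_n):
    """Relaxed y(n+M): product relaxation on a(.), sum relaxation on u(.).

    a_seq = [a(n), ..., a(n+M-1)], u_seq = [u(n), ..., u(n+M-1)].
    """
    m = len(a_seq)
    return (1.0 - m * (1.0 - a_seq[-1])) * y_n + m * u_seq[0]


M = 4
a_seq = [0.99] * M        # a(n) close to unity (illustrative)
u_seq = [0.001] * M       # u(n) small and slowly varying (illustrative)
exact = exact_lookahead(a_seq, u_seq, y_n=1.0)
relaxed = relaxed_lookahead(a_seq, u_seq, y_n=1.0)
print("exact:", exact, "relaxed:", relaxed, "difference:", abs(exact - relaxed))
```

The relaxed form is not equivalent to the original recursion; it trades a small error, acceptable in an adaptive loop that keeps correcting itself, for a much cheaper loop computation.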

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-55

    Relaxed Look-Ahead (5/6)

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-56

    Relaxed Look-Ahead (6/6)

    Delay relaxation

    M-level look-ahead pipelined version

    Assume that the product a(n)u(n) is more or less constant over M' samples

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-57

    Pipelined LMS Adaptive Filter (1/3)

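    $$e(n) = d(n) - \hat{d}(n) = d(n) - W^T(n-1)\, U(n)$$

    To minimize $e^2(n)$:

    $$W(n) = W(n-1) + \mu\, e(n)\, U(n)$$

    $M_2$-stage look-ahead:

    $$W(n) = W(n-M_2) + \mu \sum_{i=0}^{M_2-1} e(n-i)\, U(n-i)$$

    Delay relaxation:

    $$W(n) = W(n-M_2) + \mu \sum_{i=0}^{M_2-1} e(n-M_1-i)\, U(n-M_1-i)$$

    Assume that the gradient estimate e(n)U(n) does not change much over $M_1$ samples, and use sum relaxation with $M' \le M_2$ terms:

    $$W(n) = W(n-M_2) + \mu \sum_{i=0}^{M'-1} e(n-M_1-i)\, U(n-M_1-i)$$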

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-58

    Pipelined LMS Adaptive Filter (2/3)

    $$e(n) = d(n) - \hat{d}(n) = d(n) - W^T(n-1)\, U(n)$$

    $$= d(n) - \left[ W(n-M_2-1) + \mu \sum_{i=0}^{M'-1} e(n-M_1-i-1)\, U(n-M_1-i-1) \right]^T U(n)$$

    Assume μ is sufficiently small and replace $W(n-M_2-1)$ by $W(n-M_2)$:

    $$e(n) = d(n) - W^T(n-M_2)\, U(n)$$
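The relaxed look-ahead recursions can be tried in software to see that adaptation still converges. This is a behavioral sketch only, not a hardware model: it assumes the error is computed as e(n) = d(n) - W^T(n-M2) U(n) and the weights as W(n) = W(n-M2) + μ Σ_{i=0}^{M'-1} e(n-M1-i) U(n-M1-i); the function name, the unknown system `true_w`, the noise-free input, and the values μ = 0.05, M1 = 2, M2 = 2, M' = 1 are illustrative assumptions.

```python
import random


def pipelined_lms(x, d, num_taps, mu, m1, m2, m_prime):
    """Relaxed look-ahead LMS with M2 delays in the weight-update loop."""
    zero = [0.0] * num_taps
    weights, errors, inputs = [], [], []     # weights[n] = W(n), and so on
    for n in range(len(x)):
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        w_old = weights[n - m2] if n - m2 >= 0 else zero      # W(n - M2)
        e = d[n] - sum(wk * uk for wk, uk in zip(w_old, u))   # e(n)
        inputs.append(u)
        errors.append(e)
        w_new = list(w_old)
        for i in range(m_prime):
            j = n - m1 - i                   # delayed gradient estimate index
            if j >= 0:
                for k in range(num_taps):
                    w_new[k] += mu * errors[j] * inputs[j][k]
        weights.append(w_new)
    return weights[-1], errors


random.seed(0)
true_w = [0.5, -0.3]                             # hypothetical unknown system
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [
    sum(true_w[k] * x[n - k] for k in range(len(true_w)) if n - k >= 0)
    for n in range(len(x))
]
w_final, errors = pipelined_lms(x, d, num_taps=2, mu=0.05, m1=2, m2=2, m_prime=1)
print("estimated weights:", w_final)
```

The delays slow convergence and tighten the usable range of μ compared with the serial LMS, which is the marginal degradation in adaptive behavior that the relaxed look-ahead accepts in exchange for pipelining.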

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-59

    Pipelined LMS Adaptive Filter (3/3)

  • VLSI Digital Signal Processing Systems

    Lan-Da Van VLSI-DSP-10-60

    Conclusions

    Introduced pipelined filter design

    Introduced parallel filter design

    Explored relaxed look-ahead techniques

    Demonstrated the concept and design of adaptive filters