
ASIC SYSTEM LAB./AJOU UNIV.


Contents

● Digital Signal Processing
● Basic Architectures for DSP Algorithms
● Comparison with Microprocessors
● Fixed-Point DSP Chips : DSP56100 (Motorola)
● Multimedia DSP Chips
  ◆ MediaProcessor
  ◆ TriMedia
● Trends of Future DSPs
● VLSI Architectures for Communications
  ◆ Fast Fourier Transform
  ◆ Viterbi Decoder
  ◆ Reed-Solomon Decoder
  ◆ Equalizer


What is Digital Signal Processing?

● Analog Signal vs. Digital Signal
  ◆ Analog Signal : Continuous Time and Continuous Amplitude
  ◆ Discrete Time Signal : Discrete Time and Continuous Amplitude
  ◆ Digital Signal : Discrete Time and Discrete Amplitude

● Advantages of Digital Signal Processing
  ◆ Guaranteed Accuracy
    → Determined by the Specified Sampling Rate, Word Length and Algorithm
    → Independent of Time, Temperature, Humidity
  ◆ Low Sensitivity to Noise; Errors are Correctable
  ◆ Digital System : Smaller, Cheaper, Lower Power because of VLSI
  ◆ Flexibility of System : Reprogrammable
  ◆ Reliable & Predictable

● Disadvantages
  ◆ Finite Sampling Rate & Word Length Problem
  ◆ Wide Bandwidth for Data Transfer

Why Digital Signal Processor?

[Figure: in the Analog Domain, each function (Low-pass Filter, High-pass Filter, Amplifier, Convolver, Fourier Transform) is a separate analog system taking an analog signal in and giving an analog signal out. In the Digital Domain, a single DSP placed between data converters runs many algorithms on the digital signal.]


DSP Algorithms

● Convolution
  y[n] = \sum_{k=0}^{\infty} h[k]\, x[n-k]
  ◆ Basic Output Sequence of LTI Digital Systems

● Correlation
  y[k] = \sum_{n=0}^{N-1} x_1[n]\, x_2[n+k]
  ◆ Signal Matching

● Discrete Fourier Transform (DFT)
  X[k] = \sum_{n=0}^{N-1} x[n] \exp(-j 2\pi k n / N)
  ◆ Spectral Analysis of Signals
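Convolution, correlation and the DFT can each be evaluated directly as a sum of products, one multiply-accumulate per term. The following is a minimal Python sketch under stated assumptions, not an optimized implementation: the function names `convolve`, `correlate` and `dft` are illustrative, the sequences are finite, and samples outside a sequence are taken as zero.

```python
import cmath


def convolve(h, x):
    """y[n] = sum_k h[k] * x[n-k], with x assumed zero outside its range."""
    y = [0.0] * (len(h) + len(x) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                y[n] += h[k] * x[n - k]
    return y


def correlate(x1, x2, k):
    """Sum over n of x1[n] * x2[n+k], skipping samples outside x2."""
    return sum(x1[n] * x2[n + k]
               for n in range(len(x1)) if 0 <= n + k < len(x2))


def dft(x):
    """X[k] = sum_{n=0}^{N-1} x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]
```

Every inner loop is a multiply followed by an add into a running sum, which is why the MAC operation is central to DSP datapaths.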

DSP Algorithms (cont.)

● Z-Transform
  X(z) = \sum_{n=0}^{\infty} x[n]\, z^{-n}
  ◆ System and Signal Analysis

● Finite Impulse Response (FIR) Filtering
  y[n] = \sum_{k=0}^{N-1} h[k]\, x[n-k]
  ◆ Linear Phase and Stable Response Filtering

● Infinite Impulse Response (IIR) Filtering
  y[n] = \sum_{k=0}^{N} a_k\, x[n-k] - \sum_{k=1}^{M} b_k\, y[n-k]
  ◆ Sharper Cutoff Filtering than FIR with the Same Number of Taps
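The FIR and IIR difference equations translate almost line for line into code. This is a small Python sketch, assuming zero initial conditions and list-based signals; the names `fir` and `iir` and the convention that `b[0]` is unused are choices made here for illustration.

```python
def fir(h, x):
    """y[n] = sum_{k=0}^{N-1} h[k] * x[n-k], zero initial conditions."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


def iir(a, b, x):
    """y[n] = sum_{k=0}^{N} a[k] x[n-k] - sum_{k=1}^{M} b[k] y[n-k].

    b[0] is a placeholder and is never read.
    """
    y = []
    for n in range(len(x)):
        acc = sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
        acc -= sum(b[k] * y[n - k] for k in range(1, len(b)) if n - k >= 0)
        y.append(acc)
    return y
```

For example, `iir([1.0], [0.0, -0.5], [1, 0, 0, 0])` gives a decaying impulse response, which an FIR filter with one tap cannot produce.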


Basic Architecture for DSP Algorithms

[Figure: block diagram. An Inst. Memory, an X Data Memory and a Y Data Memory connect over the Inst. bus, the X and Y Data buses and the X and Y Address buses to the Program Control Unit, the Address Generation Unit and the datapath, where the Multiplier forms X*Y and the Adder & Acc computes A = X*Y + A.]

Microprocessor System Block Diagram

CPU Block Diagram

MCU Block Diagram

Micro-Instruction Format

Comparisons with Microprocessors

● Harvard Architecture
  ◆ X & Y Data Memories, Instruction Memory
● Multi-Bus Structure
  ◆ Minimize Bottleneck Problem
● Three Separate Parallel Units
  ◆ Data Calculation Unit
  ◆ Program Control Unit
  ◆ Address Generation Unit
  <Example> MAC x1, y1, A   X:(R0)+, y1   X:(R3)+, x1
● On-chip Peripherals
  ◆ A/D and D/A Converter, PLL, DMA, Host Interface, SIO and PIO Ports, Timer, Viterbi Accelerator, etc.


Comparisons with Microprocessors (cont.)

● Data Calculation Unit
  ◆ MAC Unit : Multiply and Accumulate in a Single Inst. Cycle
  ◆ Extended ALU and Accumulator
    → Prevent Overflow and Support Multiprecision
  ◆ Barrel Shifter
    → Variable Length Shift within One Cycle
    → Multi-precision and Scaling Operations
  ◆ Sine or Cosine ROM Table for DFT, FFT, DCT Algorithm
● Program Control Unit
  ◆ Fast Interrupt Service for Real-time Applications
  ◆ Multiple Level Hardware Stack for Nested Hardware Do Loop
● Address Generation Unit
  ◆ Many Memory Address Registers
    → Various Addressing Modes
    → Linear, Modulo (Filtering), Bit-reverse (FFT), Offset
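Modulo and bit-reverse addressing are the two address updates that an address generation unit performs for filtering and FFT code. The sketch below models them in Python; `modulo_next` and `bit_reverse` are hypothetical helper names, and the buffer is assumed to occupy the address range from `base` to `base + length - 1`.

```python
def modulo_next(addr, base, length, step=1):
    """Advance addr by step, wrapping inside the circular buffer
    [base, base + length)."""
    return base + (addr - base + step) % length


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (FFT data reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result
```

With `bits = 3`, the indices 0..7 map to 0, 4, 2, 6, 1, 5, 3, 7, which is the input order an 8-point in-place radix-2 FFT needs.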

Comparisons with Microprocessors(Multimedia DSPs)

● Superscalar, VLIW, SIMD, Multithreading Architectures
  ◆ Multiple Functional Units
  ◆ Large and Multi-port Register Files
● Handle Various Data Types
  ◆ Four Packed Data Types
    → Packed Bytes, Packed Words, Packed Double Words and Packed Quad Words

[Figure: a 64-bit register (bits 63..0) divided into eight Packed Bytes (8 bit each), four Packed Words (16 bit each), two Packed Double Words (32 bit each) or one Packed Quad Word (64 bit).]


Comparisons with Microprocessors (cont.)(Multimedia DSPs)

● Load/Store Units

◆ Block Load/Store Scheme

◆ Various Addressing Modes

◆ Big- or Little-endian Addressing Modes

● Packed Operations for Group Data

ex) Packed Addition : Add two Packed Words and Clip (Saturate) the Results to the Maximum Value if there is an Overflow

      a3       a2       a1       7FFFh
  +   b3       b2       b1       0001h
  ------------------------------------
      a3+b3    a2+b2    a1+b1    7FFFh
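The packed addition example can be modelled in software by treating a 64-bit integer as four independent 16-bit lanes. This is a sketch only: `packed_add_sat16` is a hypothetical name, and signed two's complement lanes with saturation to the range -8000h..7FFFh are assumed.

```python
def packed_add_sat16(a, b, lanes=4):
    """Add two packed values lane by lane (16-bit signed lanes),
    clipping each lane to [-0x8000, 0x7FFF] instead of wrapping."""
    result = 0
    for i in range(lanes):
        shift = 16 * i
        x = (a >> shift) & 0xFFFF
        y = (b >> shift) & 0xFFFF
        # Reinterpret the raw 16-bit fields as signed numbers.
        if x >= 0x8000:
            x -= 0x10000
        if y >= 0x8000:
            y -= 0x10000
        s = max(-0x8000, min(0x7FFF, x + y))   # saturation
        result |= (s & 0xFFFF) << shift
    return result
```

The lowest lane reproduces the slide's case: 7FFFh + 0001h stays at 7FFFh rather than wrapping to 8000h.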

Comparisons with Microprocessors (cont.)(Multimedia DSPs)

● Switching Network

◆ Deal with Mixed-Precision Data

◆ Rearrange, Expand, Pack, Merge

● Compression for MPEG-2 (Motion Estimation)

ex) SAD (Sum of Absolute Difference)

      a1        a2        a3        a4
      b1        b2        b3        b4
      |a1-b1|   |a2-b2|   |a3-b3|   |a4-b4|

      SAD = |a1-b1| + |a2-b2| + |a3-b3| + |a4-b4|
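As a software reference for the SAD operation used in motion estimation, the sum can be written in one line. The function name `sad` and the use of flat pixel lists are assumptions for illustration; a multimedia DSP would compute the absolute differences for several packed pixels in one instruction.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))
```

In block matching, the candidate block with the smallest SAD against the current block is taken as the best match.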


Comparisons with Microprocessors (cont.)(Multimedia DSPs)

● Multiple Operations in One Inst. Cycle

ex) Group-Multiply-and-Add : Multiply four Packed Bytes and Add four Packed Words

      a        b        c        d
  x   e        f        g        h
  +   i        j        k        l
  ----------------------------------
      a*e+i    b*f+j    c*g+k    d*h+l

Special DSP Instructions

● Multiply and Accumulate Instruction (MAC)
  ◆ Major Operation of DSP Algorithms
● Normalization Instruction
  ◆ Normalize Extended Value in ALU and Accumulator
● Various Arithmetic and Logical Shift Instructions
  ◆ Multi-precision Data Operations
● Hardware Do Loop Instruction
  ◆ Useful for Do Loop Type Algorithms
● Block Data Move Instruction
  ◆ Use Instruction Memory in Single Data Memory DSPs
● Stand-by Instruction
  ◆ Low Power in Mobile Communications


Special Multimedia DSP Instructions

● Computation
  ◆ Partitioned Add/Subtract
  ◆ Partitioned Multiply
  ◆ Partitioned Compare
  ◆ Group-Multiply-and-Add
● Data Format Conversions
  ◆ Pixel Expand
  ◆ Pixel Packing
  ◆ Pixel Merge
● Compression
  ◆ Pixel Distance (SAD)

[Figure: register layouts of src1, src2 and dst1 for < Partitioned Add/Subtract > (fields marked 15..0) and for < Pixel Expand > (fields marked 15..0 and 31..0).]

Commercial Fixed-Point DSP Chips

Model                  DSP56100          DSP1610           TMS320C5x          ADSP2100        OAK         D950CORE     uPD77017
Company                Motorola          AT&T              Texas Instruments  Analog Devices  DSP-Group   SGS-Thomson  NEC
Data/Microcode Width   16/16             16/16             16/16              16/24           16/16       16/16        16/32
Inst. Set              87                48                124                31              NA          NA           57
Pipeline Depth         3                 3                 4                  2               3           3            3
Int. PGM Mem Size      2Kx16             512x16            9Kx16              2Kx24           x           x            (12K+256)x32
Cache Size             x                 15x16             x                  16x24           x           x            x
BUS                    6                 5                 5                  5               6           6            3
Acc. Size              2x40              2x36 (Buf. 2x36)  32 (Buf. 32)       40              4x36        2x40         x
Shifter                1,4,16 Hardwired  36 Barrel         16 Barrel          32 Barrel       32 Barrel   40 Barrel    40 Barrel
ALU Size               32                36                32                 16x40           36          40           40

Ext. Mem Size      : 64Kx16  16Kx24, 16Kx16  64Kx16  64Kx16  64Kx16  64Kx16  64Kx16
Int. Data Mem Size : 6Kx16x2  2Kx16, 2Kx16  1Kx16  1056x16 (Dual)  8Kx16 (Dual)  4Kx16 (Dual)
Data Mem Regs
  Addr   : R0-R3 (4x16)  R0-R3 (4x16)  AR0-AR7 (8x16)  I0-I7 (8x14)  AX0-1, AY0-1 (4x16)  X pointer, Y pointer  General Purpose Point Registers (8x16)
  Offset : N0-N3 (4x16)  j, k (2x16)  INDX  M0-M7 (8x14)  DX0-3, DY0-3 (4x16)
  Modulo : BX, MX, BY, MY (4x16)  L0-L7 (8x14)  CBSR1-2, CBER1-2 (4x16)  rb, re (2x16)  M0-M3 (4x16)
  General Register Bank (8x40)


DSP56100 Features

● Performance : 66MIPS@15ns
● Instruction/Data Width : 16/16
● Multi-bus Structure (Program : 2, Data : 4)
● Pipeline Stage : Fetch, Decode, Execute
● Hardware Stack Levels : 15 x 32 bit
● Fast Interrupt Processing
● Hardware Loop Structure
  ◆ LA (16 bit), LC (16 bit)
● Accumulation Width : 2 x 40 bit
● 1, 4, 16 Hardware Shifter
● Modulo, Bit-reverse Addressing
● Division, Double-precision Multiplication Instruction

DSP 56100 Architecture

Data ALU

Data ALU (cont.)

● Two 40 bit Accumulators : 2 x 32 bit Accumulator Registers, 2 x 8 bit Accumulator Extension Registers
● MAC Unit
  ◆ 16x16 Multiplier with 32 bit Product
  ◆ Arithmetic Operation : 40 bit Result
  ◆ Logical Operation : 16 bit Result
  ◆ ZB Multiplexer
● Accumulator Shifter, Output Shifter
● Data Shifter / Limiter : Scaling, Limiting
● Data ALU Arithmetic and Rounding
  ◆ Fractional, Integer, Multiprecision Arithmetic Support
  ◆ Rounding : Convergent, Two's Complement Rounding


Program Control Unit

[Figure: Program Control Unit block diagram with Program Counter, Loop Address, Loop Count, Stack Pointer, OMR, SR and a 32 x 15 Hardware Stack; Clock, Interrupts and Control signals; 16-bit Address and Data connections to the Global Data Bus.]

● Program Address Generation
● Instruction Decoding
● Hardware Do Loop Control
● Interrupt Control
● Components
  ◆ Program Counter (PC)
  ◆ Loop Address (LA) : Address of the End of the Loop
  ◆ Loop Counter (LC) : Number of Iterations
  ◆ Status Register (SR)
  ◆ Operating Mode Register (OMR)
  ◆ Stack Pointer (SP)
  ◆ System Stack : Stores PC and SR for Subroutine Calls and Long Interrupts

Fast Interrupt

Address Generation Unit

Address Generation Unit (cont.)

● Effective Address Calculation
● Perform Linear, Modulo, Bit-reverse Addressing
● Components
  ◆ Address Register File (Rn), Offset Register File (Nn), Modifier Register File (Mn), Temporary Address Register
    → Where : n = 0 ~ 3
  ◆ AGU Status Register
  ◆ PC Relative Addressing Unit
  ◆ Secondary Offset Adder Unit
  ◆ Modulo Arithmetic Unit : Offset Adder, Modulo Adder, Reverse Carry Adder

DSP56100 Instruction Set

● Number of Instructions : 87
● Arithmetic Instructions : Within Data ALU
  ◆ Add/Sub Group : ADC, ADD, SBC, SUB, SUBL, DEC, DEC24, INC, INC24
  ◆ Mul/Div Group : IMPY, MPY, MPYR, MPY(su,uu), DIV
  ◆ MAC Group : MAC, MACR, DMAC, MAC(su,uu), IMAC
  ◆ Shift Group : ASL, ASL4, ASR, ASR4, ASR16, NORM
  ◆ Transfer Group : Tcc, TFR, TFR2, TST, TST2, SWAP
  ◆ ABS, CLR, EXT, ZERO, etc.
● Logical Instructions
  ◆ AND, EOR, NOT, OR, LSL, LSR, ROL, ROR
  ◆ ANDI, ORI : Immediate Program Controller Register
● Bit Field Manipulation Instructions
  ◆ BFCLR, BFSET, BFCHG, BFTSTL, BFTSTH

DSP56100 Instruction Set (cont.)

● Move Instructions
  ◆ LEA : Load Effective Address
  ◆ MOVE, MOVE(C), MOVE(I), MOVE(M), MOVE(P), MOVE(S)
● Program Control Instructions
  ◆ Bcc, BSR, BRA, BScc : Branch Instruction
  ◆ Jcc, JMP, JSR, JScc : Jump Instruction
  ◆ REP, REPcc : Repeat Instruction
  ◆ DO, DO FOREVER, ENDDO : Loop Instruction
  ◆ BRKcc : Conditional Exit from Hardware Loop
  ◆ DEBUG, DEBUGcc : Debug Mode Instruction
  ◆ RTI, RTS : Return Instruction (Interrupt, Subroutine)
  ◆ NOP, STOP, WAIT, SWI

FIR Filter Implementation Example

FIR Filter Segment

    MOVE  #XADDR, R0                           ; R0 points to the sample buffer
    MOVE  #K-1, M0                             ; modulo-K addressing for R0
    MOVE  X:INPUT, X:(R0)                      ; store the new input sample
    MOVE  #CADDR, R3                           ; R3 points to the coefficients
    MOVE  #K-1, M3                             ; modulo-K addressing for R3
    CLR   A           X:(R0)+, y1              ; clear accumulator, fetch first sample
    MOVE  X:(R3)+, x1                          ; fetch first coefficient
    REP   #K                                   ; repeat the next instruction K times
    MAC   x1, y1, A   X:(R0)+, y1  X:(R3)+, x1 ; multiply-accumulate with two parallel fetches
    RND   A                                    ; round the accumulator
    MOVE  A, X:OUTPUT                          ; write the filter output
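The idea behind the FIR segment, a MAC loop over a sample buffer addressed modulo K, can be modelled in Python. This is a behavioural sketch, not a cycle-accurate model of the listing: the class name `CircularFir` is hypothetical, and the pointer convention (the write pointer moves back by one per sample so that reads in ascending order see x[n], x[n-1], ...) is an assumption chosen here to give the standard h[k] x[n-k] ordering.

```python
class CircularFir:
    """FIR filter using a circular sample buffer, one MAC per tap."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.K = len(self.coeffs)
        self.samples = [0.0] * self.K
        self.r0 = 0                      # sample pointer, kept modulo K

    def step(self, x_in):
        self.samples[self.r0] = x_in     # store the new input sample
        acc = 0.0                        # clear the accumulator
        r0 = self.r0
        for k in range(self.K):          # K multiply-accumulates
            acc += self.coeffs[k] * self.samples[r0]
            r0 = (r0 + 1) % self.K       # post-increment with modulo wrap
        # Assumed convention: the next sample overwrites the oldest one.
        self.r0 = (self.r0 - 1) % self.K
        return acc
```

Feeding a unit impulse through `CircularFir([1, 2, 3])` returns the coefficients in order, which is a quick check that the buffer ordering is right.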


SDSP 56116

● ¸dl �}Y�M 16 è\ �¹ 0¼´ DSP ● �� XM

◆ MotorolaPü DSP56116 ü )ÝX u   Üü◆ ؤM ¨øÔü ðÕ �Ü<P ³ I/O ÔIè ¼-ü8 �� $,◆ �,l ðÕ�Ü <P ³ ØÄ �¤�M ¨�D� <P◆ PM4 �Ô<�H ³ �ÔI ]Õ À�◆ VHDLè �M� Top-down IC Èh◆ COMPASSTM ³ SYNOPSYSTM �M� P Õ◆ ( ) Àô�e  �} ÀÀ


SDSP 56116 Features

● Performance : 20MIPS@40MHz
● Instruction/Data Width : 16/16
● Multi-bus Structure (Program : 2, Data : 4)
● Pipeline Stage : Fetch, Decode, Execute
● Hardware Stack Levels : 15 x 32 bit
● Fast Interrupt Processing
● Hardware Loop Structure : LA (16 bit), LC (16 bit)
● Accumulation Width : 2 x 40 bit
● 1, 4, 16 Hardware Shifter
● Modulo, Bit-reverse Addressing
● Division, Double-precision Multiplication Instruction
● Limiting and Scaling Mode
● Power Down Mode : STOP, WAIT

Chip Characteristics

● 0.8 micron VTI Standard Cell Library (v8r4)
● Total Gate Count : 70,000
● Package : 100 Pin MQFP
● Components
  ◆ Data ALU (DALU)
  ◆ Program Control Unit (PCU)
  ◆ Address Generation Unit (AGU)
  ◆ Dual-port RAM

Multimedia DSPs

● Architecture Features
  ◆ VLIW : Multiple Functional Units
  ◆ SIMD : Partitioned Operations for Multiple Data
  ◆ Multithread : Multiple Threads Executed in Parallel
  ◆ Vector Processor : Vectorized Operations
● Multimedia DSP
  ◆ MediaProcessor (MicroUnity) : Multithreading
  ◆ TriMedia (Philips) : VLIW
  ◆ Mpact (Chromatic) : VLIW, SIMD, Vector Processor
  ◆ TMS320C6x (Texas Instruments) : VLIW

MicroUnity MediaProcessor

[Figure: MicroUnity MediaProcessor system. A MediaCodec (audio, video, radio, net, tape, disc) and a MediaBridge (DRAM, PCI) attach through the MediaChannel to the Media Processor, whose blocks are I-BUF/Cache, ETLB & Tags, D-BUF/Cache, Regs, Br & Gate, L,S & Sync, +/x and E Math.]

MicroUnity MediaProcessor (cont.)

● Five-cylinder Multithreaded Microprocessor
  ◆ 200 MHz for Each Task
  ◆ Five Register Files : 64 x 64 bit Registers / Register File
● Performance : 300 MHz (0.6 µm CMOS) or 1 GHz (0.5 µm BiCMOS)
● Configuration
  ◆ Branch and Gate Execution Unit
  ◆ Load/Store and Synchronization
  ◆ Adder/Multiplier Unit
  ◆ 128-bit Data Crossbar Switch
  ◆ Extended Math Unit
● Interface
  ◆ Media Codec : Audio, Video, Radio, Network, Tape, Disk, etc.
  ◆ Media Bridge : PCI, DRAM
  ◆ Media Processor

Philips TriMedia

[Figure: TriMedia VLIW core. An Instruction Cache feeds Issue slot 1 to Issue slot 5; a Cross Bar routes the src1, src2 and guard operands to Functional Unit 1, Functional Unit 2, ..., Functional Unit 27; Register file (128 x 32 bit registers) with 15 read and 5 write ports.]

● VLIW Architecture
  ◆ Five RISC Operations per Clock at 100 MHz
  ◆ Register File : 15 Reads / 5 Writes
  ◆ Crossbar Network
  ◆ Instruction Coding
    → Uncompressed RISC Instruction Encoding : 42 bit
● Performance : 2 to 4 BOPS@100MHz
● Interface
  ◆ PCI Master/Slave Bridge (400 Mbps)
  ◆ Digital Camera, Video Encoder, Stereo Audio ADC/DAC
  ◆ V.34 Modem Analog Front End or ISDN Terminal

Philips TriMedia (cont.)

● 27 Functional Units
  ◆ 5 Constants, 5 Integer ALUs, 2 Load/Store Units, 2 Shifters, 3 Branch Units, 2 Integer/FP Multipliers, 2 FP ALUs, 1 FP Compare, 1 FP Sqrt/Div, 2 DSP ALUs, 2 DSP Multipliers
  ◆ Number of Decoders : 5
    → The 27 Functional Units are Classified into 5 Groups to Reduce Decoder Size
● VLIW Instruction Size
  ◆ Uncompressed : 42 bit x 5 = 210 bit
  ◆ Compressed : 32 bit (Huffman Coding)
● DMA Mastering Video & Audio I/O Units (Data Prefetch) Configuration
  ◆ Video/Audio DMA In, Out
  ◆ VLD (Variable Length Decoder) Coprocessor
  ◆ Image Coprocessor (MPEG-1, MPEG-2)

ؤM Multimedia ¨� �¹ 0¼´ DSP (MDSP)

Fixed-point DSP(Multimedia)

Multimedia DSP(Portability)

MultimediaMultimedia PortabilityPortability

++

MDSPMDSP(Portable(Portable

Multimedia)Multimedia)

● Ô� DSP ü ÜÀ´◆ ä��¸X DSP

à �¤, �¨É

à ؤM õM¥�t ¥ ü îð

◆ �¹ 0¼´ DSP à èLÌ¥� ä��¸X ��d<P

● ÀÀ� MDSPü ]y

◆◆ ää����¸̧XXää����¸̧XX + + Øؤ¤Øؤ¤ YY��YY�� õõMM¥�¥�õõMM¥�¥�

◆ ä��¸X ��d <PÔÄ +�¹ 0¼´ DSP ÔÄ


MDSP Features

● ¤¤, ¤¨É �¹ 0¼´ DSP øX

●● SIMD + Vector Processing + DSPSIMD + Vector Processing + DSP● ðÕ, ðI ³ ¥e �Ü<P ¨� ¤d¥��d Ü� : 8-, 16-, 32-, 40-bit

●● HH¨̈��HH¨̈�� ÈÈ\\pp��ÈÈ\\pp�� + + Packing Packing ÈÈ\\pp��ÈÈ\\pp��

◆ LÌ¥� ��d dü

◆ EÔ �Tü ¹ø

● 4�h °�¨ � : Fetch ➞ Decode ➞ Ex1➞ Ex2

● 8Àü Nested ü��X FOR �¨ dt

● 2Àü Parallel Move

● 24 è\ l��� ø�

● �¨Ô : Barrel �¨Ô, Prescaler

°�¨ � ¬dHÔ

IDB

H¨� È\p�

VALUBashifter

VMPY

prescaler

Vadder Vadder

Pack. net

@É ¬dHÔ °

)É ¬dHÔ °

VMPY

prescaler

XDB, YDB

Execution 1

Execution 2

<DPU èHñt>

ASIC SYSTEM LAB./AJOU UNIV.

Chip Characteristics

● 5V 0.6µµm SOG (`Õ KG75)

● 0�\ ¼ : 68,831● �¤ }5  °¼ : 30 MHz● Package : 160 pin QFP● Performance

◆ DCT : MPEG2 MP@ML(720 x 480 )

: 5 frames / sec◆ Motion Estimation : MPEG1 è¸È CD

(352 x 240) 3 frames / sec◆ �Ü<P ð�P<

(P�� ¼p èô) ➞

Test Board

Døs

k33×

FFT IIR FIR AdaptiveFilter

10N

7N

6N

5N

N1/2N

3N

2N

DSP56100

MDSP

Chip


What Should We Do for the Next Century?

● Much Circuit-Level Work
  ◆ High Speed Clock
  ◆ Low Power, Low Cost
● Parallel Programmable DSP Architectures
  ◆ Employ VLIW / RISC Superscalar (RISC-SS) Architecture
    → High Speed Coupled with Parallel Execution
    → Good Compiler Efficiency
    → Poor Code Density (VLIW) vs. Good Code Density (RISC-SS)
    → High Power (VLIW) vs. Low Power (RISC-SS)
    → Difficult (VLIW) vs. Easy (RISC-SS) to Program by Hand
  ◆ High Level Languages Suitable for Parallel Architectures
● Architecture Driven Algorithms for Multimedia Functions
● Hardware / Software Co-design Approach should be used for Optimized Systems

Multimedia Terminal Should Have

● 2 MPEG-2 Codecs : 8 GOPS
● 2 CG Generators : 4 GOPS
● Stereo Echo Canceler : 4 GOPS
● Background Removal : 4 GOPS
  → Total : 20 GOPS

● Future DSP Chips should be
  ◆ Low Price
  ◆ Programmable
  ◆ 20 GOPS DSP Chip in the Year 2000

MPU History

[Figure: MPU performance in MIPS (log scale, 0.1 to 1000) versus year, 1980 to 1995. Labels: RISC, DSP, Cache, 32 bit, Superscalar, 64 bit, out-of order, Ultrasparc; growth of 10 / 4.5 yr; "Break-through would be VLIW/Multiprocessors"; 10 GIPS @ 2000.]

Programmable DSP Chips

[Figure: programmable DSP performance in MOPS (log scale, 10 to 1000) versus year, 1980 to 2000, with points at 1984, 1988, 1992 and 1996. Labels: NEC, TI, ATT, Clock, DSP for Audio & Speech, DSP for Video, DSP for Multimedia, SIMD, MPEG-2 Dec., SIMD+, VLIW, RISC-SS.]

Fast Fourier Transform

● Fast Fourier Transform (FFT) Algorithm
  ◆ A Fast Method of Computing the Discrete Fourier Transform (DFT)
  ◆ Reduces Computation
  ◆ FFT Method : Radix-2, Radix-4
● Example : Orthogonal Frequency Division Multiplexing (OFDM)

[Figure: OFDM system block diagram with the blocks Tx data, Serial to Parallel, IFFT, Parallel to Serial, Transmit Filter, Channel, Serial to Parallel, FFT, Parallel to Serial and Rx data, and the signals x(t), n(t) and a term in j(2πΔf(t+φ)).]

Reference : JCCI'98 pp 879~883

FFT Algorithm

● Radix-2 Butterfly Algorithm
  OUT0 = IN0 + IN1
  OUT1 = (IN0 - IN1) W_N^k
● Radix-4 Butterfly Algorithm
  OUT0 = (IN0 + IN2) + (IN1 + IN3)
  OUT1 = [(IN0 - IN2) - j(IN1 - IN3)] W_N^k
  OUT2 = [(IN0 + IN2) - (IN1 + IN3)] W_N^{2k}
  OUT3 = [(IN0 - IN2) + j(IN1 - IN3)] W_N^{3k}

  where W_N^k = e^{-j 2\pi k / N}

[Figure: signal-flow graphs of the radix-2 butterfly (IN0, IN1 to OUT0, OUT1 with twiddle factor W_N^k) and the radix-4 butterfly (IN0..IN3 to OUT0..OUT3 with branch gains 1, -1, j, -j and twiddle factors W_N^k, W_N^{2k}, W_N^{3k}).]
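The radix-2 butterfly OUT0 = IN0 + IN1, OUT1 = (IN0 - IN1) W_N^k is the decimation-in-frequency form, and applying it recursively yields a complete FFT. The following Python sketch assumes the input length is a power of two; the function name `fft_radix2_dif` is illustrative, and recursion is used for clarity rather than the in-place, stage-by-stage schedule a hardware design would use.

```python
import cmath


def fft_radix2_dif(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    half = N // 2
    # Butterfly outputs: OUT0 = IN0 + IN1, OUT1 = (IN0 - IN1) * W_N^k
    top = [x[k] + x[k + half] for k in range(half)]
    bot = [(x[k] - x[k + half]) * cmath.exp(-2j * cmath.pi * k / N)
           for k in range(half)]
    out = [0j] * N
    out[0::2] = fft_radix2_dif(top)   # even-indexed bins X[0], X[2], ...
    out[1::2] = fft_radix2_dif(bot)   # odd-indexed bins X[1], X[3], ...
    return out
```

Each recursion level corresponds to one butterfly stage, so an N-point transform passes through log2(N) levels of N/2 butterflies.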


Radix-2 FFT Architecture

● Butterfly Architecture

[Figure: radix-2 butterfly datapath with inputs IN0 and IN1, a SUB and an ADD, latches, multipliers MUL 1 and MUL 2 fed from a coefficient PROM (W), a final ADD, the signals Re (P-Q) and Im (P-Q), and outputs Out0 and Out1.]

● Number of butterflies (N-point) : N/2 (log2N - 1)
● Number of complex adders : N (log2N - 1)
● Number of complex multipliers : N/2 (log2N - 1)

Radix-4 FFT Architecture

● Butterfly Architecture

[Figure: radix-4 butterfly datapath built from CSA and CLA adders and four Multipliers, with inputs Input0 to Input3, Real Coeff and Imaginary Coeff, and output OUT1 as Real Part and Imaginary Part.]

● Number of butterflies (N-point) : N/4 (log4N - 1)
● Number of complex adders : N (log4N - 1)
● Number of complex multipliers : 3N/4 (log4N - 1)

Comparison between Radix-4 and Radix-2

● Algorithm comparisons

             Number of stages   Number of Additions   Number of Multiplications
  Radix-2    log2N - 1          N (log2N - 1)         N/2 (log2N - 1)
  Radix-4    log4N - 1          N (log4N - 1)         3N/4 (log4N - 1)

  * data : Complex number
  - Radix-4 reduces the number of additions and multiplications compared with radix-2

● Architecture comparisons
  - Butterfly architecture of radix-4 is more complex than that of radix-2
  - However, as N increases, the gate count of radix-2 increases more sharply than that of radix-4

Convolutional Encoder for VITERBI Algorithm

● Convolutional Encoder consists of Two Components
  ◆ Shift Register : Holds K-1 Bits (Number of Shift Register Stages)
  ◆ v Modulo-2 Adders : v Bits are Output
  ◆ Example : K = 3, r = 1/2 Convolutional Encoder
      G_c0 = 101_2 = 5_8
      G_c1 = 111_2 = 7_8

[Figure: the Input Bit enters a shift register m1, m2; two Modulo 2 Adders (XOR) with taps 1 0 1 (c0) and 1 1 1 (c1), MSB to LSB, produce the Coded Bit Streams c0 and c1, which pass to the Punctured Logic.]
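The K = 3, r = 1/2 example with generators 5 and 7 (octal) can be written as a few XOR operations on a two-bit shift register. This is a minimal Python sketch; the function name `conv_encode`, the all-zero starting state and the (c0, c1) output ordering are assumptions made for illustration.

```python
def conv_encode(bits):
    """Rate-1/2, K=3 convolutional encoder with G_c0 = 101, G_c1 = 111.

    Returns one (c0, c1) pair per input bit; the register starts at zero.
    """
    m1 = m2 = 0                      # shift register contents (K-1 = 2 bits)
    out = []
    for b in bits:
        c0 = b ^ m2                  # taps 1 0 1: input and m2
        c1 = b ^ m1 ^ m2             # taps 1 1 1: input, m1 and m2
        out.append((c0, c1))
        m1, m2 = b, m1               # shift the register
    return out
```

For the input 1, 0, 1, 1 this yields the pairs (1,1), (0,1), (0,0), (1,0).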


VITERBI Decoding

● Viterbi Decoding Procedure
  ◆ Branch Metric Calculation (BMC)
    → Calculate Hamming Distance or Euclidean Distance
  ◆ Path Metric Calculation (PMC)
    → Accumulate the BM onto the Previous Survivor Path (the one with the smaller PM of the two paths)
  ◆ Add - Compare - Select (ACS)
    → Add : PM + BM
    → Compare : Compare the Two Candidate PMs
    → Select : Select the Smaller PM
  ◆ Trace-Back (TB)
    → The TB Depth defines the Length of the Trace-Back
    → Usually, TB Depth = K x 5 or K x 6
    → After the TB Memory is filled to the TB Depth, Trace Back through it and Decode the Received Code
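The BMC, ACS and trace-back steps can be seen together in a small hard-decision decoder for the K = 3, r = 1/2 code with generators 5 and 7. This Python sketch makes several simplifying assumptions: it traces back once over the whole block rather than over a fixed TB depth, the encoder starts in state 0, and the names `branch` and `viterbi_decode` are hypothetical.

```python
def branch(state, b):
    """Next state and (c0, c1) output for input bit b; state = (m1 << 1) | m2."""
    m1, m2 = state >> 1, state & 1
    return (b << 1) | m1, (b ^ m2, b ^ m1 ^ m2)


def viterbi_decode(received):
    """Hard-decision Viterbi decoding of a list of (c0, c1) pairs."""
    INF = float("inf")
    pm = [0, INF, INF, INF]                 # path metrics; start in state 0
    history = []                            # survivor (prev_state, bit) per step
    for r in received:
        new_pm = [INF] * 4
        choice = [None] * 4
        for s in range(4):
            if pm[s] == INF:
                continue
            for b in (0, 1):
                ns, c = branch(s, b)
                bm = (c[0] != r[0]) + (c[1] != r[1])   # BMC: Hamming distance
                cand = pm[s] + bm                      # Add
                if cand < new_pm[ns]:                  # Compare and Select
                    new_pm[ns] = cand
                    choice[ns] = (s, b)
        pm = new_pm
        history.append(choice)
    state = min(range(4), key=lambda s: pm[s])         # best final state
    bits = []
    for choice in reversed(history):                   # Trace-Back
        state, b = choice[state]
        bits.append(b)
    return bits[::-1]
```

Encoding a short message with `branch`, flipping one received bit and decoding again should return the original message, since a single error is within this code's correcting ability.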

Punctured Code

● Punctured Code : A Modified Coding Scheme
  ◆ Increases Code Rate
  ◆ Decreases Coding Gain (c.f. Coding Gain is 10 log10(P_without FEC / P_with FEC))
  ◆ Example : r = 3/4 Punctured Convolutional Code

[Figure: a Rate 1/2 Convolutional Encoder followed by a Puncture block gives Rate 3/4. From the coded symbols C0(1)..C0(6) and C1(1)..C1(6), the symbols kept are C0(1), C1(1), C0(3), C1(2), C0(4), C1(4), C0(6), C1(5); the others are marked Symbol Deleted (Punctured). Deleting Matrix rows : 011, 101.]
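Puncturing a rate-1/2 stream to rate 3/4 means keeping 4 of every 6 coded symbols, and the decoder side reinserts placeholders where symbols were deleted. The Python sketch below uses keep-masks that are an assumption, chosen so that C0 is kept in positions 1 and 3 and C1 in positions 1 and 2 of each group of three; `puncture` and `depuncture` are hypothetical names, and `None` marks an erased symbol.

```python
def puncture(pairs, keep_c0=(1, 0, 1), keep_c1=(1, 1, 0)):
    """Drop coded symbols according to the periodic keep-masks."""
    out = []
    for i, (c0, c1) in enumerate(pairs):
        j = i % len(keep_c0)
        if keep_c0[j]:
            out.append(c0)
        if keep_c1[j]:
            out.append(c1)
    return out


def depuncture(symbols, n_pairs, keep_c0=(1, 0, 1), keep_c1=(1, 1, 0),
               erasure=None):
    """Rebuild (c0, c1) pairs, inserting `erasure` where symbols were deleted."""
    it = iter(symbols)
    pairs = []
    for i in range(n_pairs):
        j = i % len(keep_c0)
        c0 = next(it) if keep_c0[j] else erasure
        c1 = next(it) if keep_c1[j] else erasure
        pairs.append((c0, c1))
    return pairs
```

A decoder using this would give erased positions a branch-metric contribution of zero, so they neither favour nor penalise any path.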


Trellis Diagram

● Trellis Diagram for PMC (Sum of Previous BMs)
  ◆ Example : K = 3, r = 1/2 Convolutional Code
  ◆ Branch Metric is Hamming Distance (Hard decision, # of different bits) or Euclidean Distance (Soft decision, difference of decimal code) between Received Code and Branch Word

[Figure: trellis over the States 00, 10, 01, 11 showing the Branch Metric on each branch, the accumulated Path Metric at each node and the Trace-Back path, together with the Input signal, Coded signal, Received signal and Corrected signal sequences.]

VLSI Architectures for VITERBI Algorithm

● Viterbi Decoder Architecture
  ◆ Depunctured Logic : Used if the Received Code is a Punctured Code
  ◆ BMC : Hard/Soft Decision
  ◆ ACS : After ACS, the PM is Stored in the Path Metric Memory
  ◆ TB : Trace-Back

[Figure: decoder block diagram with the blocks Depunctured Logic, Branch Metric Calculate, Add Compare Select, Path Metric Memory and Trace Back Memory; input Received Code of width x, branch metrics BM00, BM01, BM10, BM11, output Decoding Bit.]

  If Hard decision, x is 1-bit
  If Soft decision, x is 3-bit

  If upper path is smaller, TB stores 0
  If lower path is smaller, TB stores 1

VLSI Architectures for VITERBI Algorithm

● Serial ACS Viterbi Decoder Architecture
  ◆ Minimum Gate
  ◆ Maximum Delay

[Figure: serial ACS datapath. Four Adders combine path metrics PM00, PM01, PM10, PM11 from the PM memory with BM00, BM01, BM10, BM11; two Compare & Select units produce the new path metrics for the Next Clock Event and a 1 or 0 decision To trace-back Memory. The trellis section for States 00, 10, 01, 11 is processed at Time 0 and Time 1.]

Reference : US patent 4,536,878

VLSI Architectures for VITERBI Algorithm

● Parallel ACS Viterbi Decoder Architecture
  ◆ Minimum Delay
  ◆ Maximum Gate
  ◆ High Routing Complexity
  ◆ No PM Memory Used

[Figure: four ACS units in parallel take BM00, BM01, BM10, BM11 From BMC, hold PM00, PM01, PM10, PM11, and each send a 1 or 0 decision To Trace-Back Memory; trellis section for States 00, 10, 01, 11.]

Reference : US patent 4,614,933

RS Encoder

[Figure: LFSR encoder for K symbols, with delay elements D and feedback multipliers g0, g1, ..., g(n-k-1).]

● Built from an LFSR (Linear Feedback Shift Register)
● (n, k) RS Encoder
  ◆ n : # of code symbols
  ◆ k : # of message symbols
  ◆ gi : coefficients of the generator polynomial
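An LFSR-based (n, k) RS encoder computes the remainder of the shifted message polynomial divided by the generator polynomial, one symbol per clock. The Python sketch below is an illustration under assumptions that are not taken from the slides: it works over GF(16) with field polynomial x^4 + x + 1, uses generator roots alpha^1 .. alpha^(n-k), and the names `gf_mul`, `generator_poly`, `rs_encode` and `syndromes` are hypothetical.

```python
PRIM = 0x13                      # x^4 + x + 1, assumed field polynomial

EXP = [0] * 30                   # antilog table, doubled to avoid a modulo
LOG = [0] * 16
_v = 1
for _i in range(15):
    EXP[_i] = EXP[_i + 15] = _v
    LOG[_v] = _i
    _v <<= 1
    if _v & 0x10:
        _v ^= PRIM


def gf_mul(a, b):
    """Multiply two GF(16) elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_parity):
    """g(x) = prod_{i=1}^{n_parity} (x + alpha^i), lowest degree first."""
    g = [1]
    for i in range(1, n_parity + 1):
        nxt = [0] * (len(g) + 1)
        for d, c in enumerate(g):
            nxt[d] ^= gf_mul(c, EXP[i])   # c * alpha^i
            nxt[d + 1] ^= c               # c * x
        g = nxt
    return g


def rs_encode(message, n_parity):
    """Systematic LFSR encoding; message[0] is the highest-order symbol."""
    g = generator_poly(n_parity)
    reg = [0] * n_parity                  # reg[i] holds the x^i coefficient
    for sym in message:
        feedback = sym ^ reg[-1]
        for i in range(n_parity - 1, 0, -1):
            reg[i] = reg[i - 1] ^ gf_mul(feedback, g[i])
        reg[0] = gf_mul(feedback, g[0])
    return list(message) + reg[::-1]


def syndromes(codeword, n_parity):
    """S_i = c(alpha^i) for i = 1..n_parity, evaluated by Horner's rule."""
    out = []
    for i in range(1, n_parity + 1):
        s = 0
        for sym in codeword:
            s = gf_mul(s, EXP[i]) ^ sym
        out.append(s)
    return out
```

A valid codeword gives all-zero syndromes, and any single corrupted symbol makes them non-zero, which is the first step of the decoding procedure.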

RS Decoding

● Decoding Procedure
  ◆ Syndrome (error pattern) Calculation
    → S1, S2, ..., S2t
  ◆ Error Locator Polynomial Calculation
  ◆ Error Location Calculation
    → Roots of the Error Locator Polynomial
  ◆ Error Value Calculation
    → Uses the Syndromes and the Coefficients and Roots of the Error Locator Polynomial
  ◆ Error Correction
    → Received Data xor Error Value => Corrected Data

RS Decoder

[Figure: RS decoder block diagram with Input, Syndrome block, FIFO (Delay Buffers), Reg., Decoding Output and Error Indication.]

- Syndrome hT 0�ä t� ¨ü hT 0�t PM- uР eÐ �T Lô ¤I

XOR, Finite Field Multiplier,Shift Register � �Õ

Error Locator Polynomial Calculation

● Berlekamp-Massey Algorithm
  ◆ Shift Registers, XORs, Control Logic, Finite Field Multiplier, etc.

[Figure: General Berlekamp-Massey Architecture with four Shift Registers (2t), a Control Unit, MUXs, finite-field multipliers (x) and adders (+), a divider (÷) and a Constant input.]

< Hardware Complexity >
  - Shift Register : 8t (t is the error-correction capability)
  - Finite Field Multiplier : 2
  - Finite Field Divider : 1
  - Control Logic, MUXs
  - Small Gate Count, Low Speed

Error Locator Polynomial Computation

● Euclid Algorithm
◆ Registers, Finite Field Multiplier, etc.

[Figure: Euclid algorithm architecture, an array of REG and MUX blocks]

< Hardware Complexity >

- Register : 4 x t (t is the error-correcting capability)
- Finite Field Multiplier : 2 x t
- MUXs, XORs
- Large Gate Count, High Speed

Reference : “Reed-Solomon Euclid Algorithm Decoder Having a Process Configurable Euclid Stack,” U. S. Patent 5,170,399, Dec. 8, 1992.


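Specifications of the Developed FEC

● FEC Features
◆ Viterbi / RS / concatenated mode selection
◆ Parallel and pipelined structures for high-speed operation
● Viterbi Decoder
◆ K=7, r=1/2 convolutional encoder
◆ 3-bit Soft Decision / Hard Decision selection
◆ BER Monitoring function
◆ Serial / Parallel input selection
● Reed-Solomon Decoder
◆ (200,188), 6-symbol error-correcting capability
◆ Supports 4 variable code rates : (200, 188), (120, 108), (60, 48), (40, 28)
◆ Hardware size reduced by approx. 16.7% relative to a conventional RS architecture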
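As a quick check of the listed RS configurations, the standard relation t = (n - k) / 2 gives the number of correctable symbol errors, and k / n gives the code rate. The short sketch below (the function name is hypothetical) shows that all four codes carry 12 parity symbols and therefore share t = 6, while the rate varies.

```python
# Each (n, k) RS code corrects t = (n - k) // 2 symbol errors.
CODES = [(200, 188), (120, 108), (60, 48), (40, 28)]


def rs_parameters(n, k):
    """Return (t, code rate) for an (n, k) Reed-Solomon code."""
    return (n - k) // 2, k / n


for n, k in CODES:
    t, rate = rs_parameters(n, k)
    print(f"({n}, {k}): t = {t}, rate = {rate:.3f}")
```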


P ÕÀ FEC ü $ ¸�

Features of the Developed FEC

● FEC Features
◆ Gate Count : approx. 75,000 (Viterbi : approx. 45,000, RS : approx. 30,000)
◆ Operating frequency : 33 MHz (worst case)


Channel Model

● ISI(Intersymbol Interference)

◆ Band-limited Channel Distortion (Wired Channel)

◆ Multipath Fading (Wireless Channel)

● Equalizer - Discrete Time Filter that compensates for the amplitude and phase distortion caused by the channel

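[Figure: block diagram with PSF, Channel and Equalizer blocks and symbol input Ik; pulse response plotted over t with ticks at -3T, -2T, -T, 0, T, 2T, 3T]

PSF : Pulse Shaping Filter

y(n) : Equalizer Output

w(n) : Tap Coefficient

$$y(n) = \sum_{k=0}^{M-1} w_k(n)\, x(n-k)$$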


Criteria of Equalizer

● Frequency Bandwidth
◆ Baseband, Passband
● Sampling Time
◆ Symbol - One Sample/symbol
◆ Fractional Symbol - Two or More Samples/symbol
● Coefficient Characteristics
◆ Fixed, Adaptive
● Architecture
◆ Transversal, DFE, Lattice


Transversal Structure

● Simplest Type

● Small Gate Count, Low Speed

● Low Power Consumption

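[Figure: tapped delay line. Input Xn passes through delay elements T; the tap outputs weighted by C0, C1, C2, ..., Cn-2, Cn-1 are summed into Output Yn]

$$y(n) = \sum_{k=0}^{M-1} C_k(n)\, x(n-k)$$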

T : Sample Time

Ck : Tap Coefficient

Register : N

Multiplier : N

Adder : N-1

N : Number of Taps
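The transversal structure maps directly onto a tapped delay line: N registers hold the most recent samples, N multipliers weight them, and N-1 adders form the sum. A minimal sketch with fixed taps follows; the function name and the example values are assumptions for illustration.

```python
def transversal_filter(samples, taps):
    """y(n) = sum_k C_k * x(n - k), computed with a tapped delay line."""
    delay_line = [0.0] * len(taps)          # N registers, initially cleared
    output = []
    for x in samples:
        # Shift the new sample in; the oldest sample falls off the end.
        delay_line = [x] + delay_line[:-1]
        # N multipliers followed by N-1 adders.
        output.append(sum(c * d for c, d in zip(taps, delay_line)))
    return output


# Feeding a unit impulse returns the tap coefficients themselves.
print(transversal_filter([1, 0, 0, 0], [1, 2, 3]))
```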


Decision-Feedback Structure

● Good Performance in the Presence of Severe ISI
● Moderate Gate Count, Power Consumption
● Low Speed

$$\hat{I}_k = \sum_{j=-K_1}^{0} C_j X_{k-j} + \sum_{j=1}^{K_2} C_j \tilde{I}_{k-j}$$

[Figure: decision-feedback equalizer with labels Input Data {Xk}, Feedforward Filter (delays T, taps C-K1, C-K1+1, ..., C0), Feedback Filter (delays T, taps C1, ..., CK2), Decision Device, {I^k}, {I~k}, Decision Data, Training Data, Error]

Register : N

Multiplier : N

Adder : N-1

N : Number of Taps
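The decision-feedback structure can be sketched as a feedforward sum over the received samples plus a feedback sum over past decisions, i.e. the estimate is the sum of C_j X_(k-j) for j = -K1..0 plus the sum of C_j times the past decisions for j = 1..K2. The sketch assumes BPSK symbols (+1 / -1) and a simple sign decision; both are assumptions for illustration, as are the function and variable names.

```python
def dfe(received, ff_taps, fb_taps):
    """Decision-feedback equalizer with fixed taps.

    ff_taps[i] holds C_j for j = -K1 + i (i = 0..K1).
    fb_taps[i] holds C_j for j = i + 1   (i = 0..K2-1).
    Returns (estimates before the decision device, decisions).
    """
    k1 = len(ff_taps) - 1
    estimates, decisions = [], []
    for k in range(len(received)):
        est = 0.0
        # Feedforward filter over current and future received samples.
        for i, c in enumerate(ff_taps):
            idx = k - (-k1 + i)
            if 0 <= idx < len(received):
                est += c * received[idx]
        # Feedback filter over previously detected symbols.
        for i, c in enumerate(fb_taps):
            if k - (i + 1) >= 0:
                est += c * decisions[k - (i + 1)]
        estimates.append(est)
        decisions.append(1.0 if est >= 0 else -1.0)   # BPSK decision device
    return estimates, decisions


# Hypothetical channel with one postcursor: x(k) = s(k) + 0.5 s(k-1).
symbols = [1, -1, -1, 1, 1, -1, 1, -1]
channel_out = [s + 0.5 * p for s, p in zip(symbols, [0] + symbols[:-1])]
estimates, decisions = dfe(channel_out, ff_taps=[1.0], fb_taps=[-0.5])
```

In this noise-free example the feedback tap cancels the postcursor exactly, so the decisions reproduce the transmitted symbols.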


Lattice Structure

● High Power Consumption

● Large Gate Count, High Speed

$$f_m(n) = f_{m-1}(n) + k_m^{*}\, b_{m-1}(n-1)$$

$$b_m(n) = b_{m-1}(n-1) + k_m\, f_{m-1}(n)$$

y(n) : same expression as for the Transversal structure

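[Figure: lattice filter with stages Stage1, Stage2, ..., StageM-1. Input Xn drives f0(n) and b0(n); each stage has a delay T and coefficients Km, Km*; forward outputs f1(n), f2(n), ..., fM-1(n) and backward outputs b1(n), b2(n), ..., bM-1(n)]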

Register : N

Multiplier : 2N

Adder : 2N + 1

N : Number of Taps
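The two lattice recursions can be evaluated stage by stage, starting from f0(n) = b0(n) = x(n). The sketch below keeps one delayed backward value per stage (the T elements in the figure); the function name and the single-stage example are assumptions for illustration.

```python
def lattice_filter(samples, k):
    """Run the forward/backward lattice recursions for coefficients k[0..M-2].

    f_m(n) = f_{m-1}(n) + conj(k_m) * b_{m-1}(n-1)
    b_m(n) = b_{m-1}(n-1) + k_m * f_{m-1}(n)
    Returns the last-stage forward and backward outputs.
    """
    delayed_b = [0j] * len(k)       # b_{m-1}(n-1) for each stage
    f_out, b_out = [], []
    for x in samples:
        f = b = complex(x)          # f_0(n) = b_0(n) = x(n)
        for m, km in enumerate(k):
            prev_b = delayed_b[m]   # b_{m-1}(n-1)
            delayed_b[m] = b        # store b_{m-1}(n) for the next sample
            f, b = f + km.conjugate() * prev_b, prev_b + km * f
        f_out.append(f)
        b_out.append(b)
    return f_out, b_out


# Single stage with k1 = 0.5 driven by a unit impulse.
forward, backward = lattice_filter([1, 0, 0], [0.5])
```

Each stage costs two multiplications and two additions per sample, which is consistent with the higher multiplier and adder counts listed for this structure.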


Comparisons of Tap Update Algorithms

                             ZF       LMS      RLS
Hardware Complexity          Low      Low      High
Speed                        Medium   Low      High
Power Consumption            Low      Low      High
Error Correction Capability  Low      Medium   High
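To make the LMS column concrete, the sketch below adapts the taps of a transversal equalizer with the standard LMS rule C_k <- C_k + mu * e(n) * x(n - k), where e(n) is the difference between a known training symbol and the equalizer output. The two-path channel, the step size mu = 0.02, the tap count and all names are hypothetical choices for illustration.

```python
import random


def lms_equalizer(received, training, num_taps, mu):
    """Adapt transversal taps so the output tracks the training symbols."""
    taps = [0.0] * num_taps
    delay_line = [0.0] * num_taps
    errors = []
    for x, d in zip(received, training):
        delay_line = [x] + delay_line[:-1]
        y = sum(c * v for c, v in zip(taps, delay_line))
        e = d - y                                   # error signal
        # LMS update: one multiply-add per tap.
        taps = [c + mu * e * v for c, v in zip(taps, delay_line)]
        errors.append(e)
    return taps, errors


rng = random.Random(1)
symbols = [rng.choice((-1.0, 1.0)) for _ in range(5000)]
# Hypothetical two-path channel: x(n) = s(n) + 0.4 s(n-1)
received = [s + 0.4 * p for s, p in zip(symbols, [0.0] + symbols[:-1])]
taps, errors = lms_equalizer(received, symbols, num_taps=8, mu=0.02)
```

The per-tap update needs only one extra multiplier and adder, which fits the low hardware complexity in the table; the price is slow convergence compared with RLS.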


WLAN Modem Chip

[Figure: BER vs. Eb/No, Eb/No from 8 to 14 dB and BER from 10^-8 to 10^-2, with curves for Th. DQPSK, HSP3824 and Proposed]

● Wireless LAN Modem (IEEE 802.11)
◆ Data Rate : 4 Mbps (DQPSK), 2 Mbps (DBPSK)
◆ BER vs. Eb/No : 6.5e-6 @ 14 dB (AWGN)
◆ Differential Encoder/Decoder, Spreader/Despreader, Matched Filter, CRC Encoder/Decoder, Scrambler/Descrambler, Symbol Timing Recovery, Carrier Frequency Offset Recovery