Upload
alaina-dean
View
215
Download
0
Embed Size (px)
Citation preview
Digital Signal Processors (DSPs)
DSP• Advanced signal processor circuits• MAC (Multiply and Accumulate) unit (s) - provides
fast multiplication of two operands and accumulating results at a single address.
• MAC unit computes fast an expression such as the following summation. yn = Σ (ai × xn−i) where the sum is made for i = 0, 1, 2, …, N-1. Here i, n and N are the integers, ai is a coefficient, xj is independent variable or an input element and yk is the dependent variable or an output element.
DSP• DSP processors invariably have Harward
architecture.• Caches are organized in Harward architecture
separate I-cache and D-Cache• Basic Units: Memory, Internal Bus, Data bus,
Address bus, control bus, Bus Interface Unit, Instruction fetch register, Instruction decoder, Control unit, Instruction Cache, Data Cache, multistage pipeline processing, multi-line superscalar processing for obtaining processing speed higher than one instruction per clock cycle, Program counter
DSP special structural units• Packed Data processing• Parallel Execution MAC units• Special Instructions• Instruction Packing unit
ADVANCED PROCESSOR ARCHITECTURESAND MEMORY ORGANISATION
SHARC and Tiger SHARC
SHARC• Processor from Analog Devices.• SHARC stands for Super Harvard Single - Chip
Computer Super Harvard architecture means more than one set of address and data buses for data
• For example, set J of address and data buses for data-memory space, set K of address and data buses for data-memory space, and set I of address and data buses for program memory space
SHARC functions• Program memory configurable as program and
data memory parts (Princeton architecture)• SHARC functions as VLIW (very large instruction
word) processor.• used in large number of DSP applications.• Controlled power dissipation in floating point
ALU.• Different SHARCs can link by serial
communication between them
SHARC features• Instruction word 48-bits• 32-bit data word for integer and floating point
operations • 40-bit extended floating point.• Smaller 16 or 8 bit must also store as full 32-bit data.
Therefore, the big endian or little endian data alignment is not considered during processing with carry also extended for rotating (RRX).
• ON chip memory 1 MB Program memory and data memory (Harvard architecture)
• External OFF chip memory• OFF chip as well as ON-chip Memory can be
configured for 32-bit or 48 bit words.
Parallel Processing features• Supports processing instruction level parallelism
as well as memory access parallelism.• Multiple data accesses in a single Instruction
Addressing Features• SHARC 32-bit address space for accesses • Accesses 16 GB or 20 GB or 24 GB as per the word
size ( 32-bit, 40-bit or 48 bit) configured in the memory for each address.
• When word size = 32-bit then external memory configuration addressable space is 16 GB (232 × 4) bytes
Word size features• Provides for two word sizes 32-bit and 48-bit.• SHARC two full sets of 16 general-purpose
registers, therefore fast context switching• Allows multitasking OS and multithreading.• Registers are called R0 to R15 or F0 to F15
depending upon integer operation configuration or floating point configuration.
• Registers are of 32-bit. A few registers are special, of 48 bits that may also be accesses as pair of 16-bit and 32-bit registers.
TigerSHARC• Highest performance density family of processors
from Analog Devices• Precision high-performance integrated circuits
used in analog and digital signal processing applications
• Designed for multiprocessing applications and for peak performance greater than BFLOPS (billion floating-point operations per second)
Example
• ADSP-TS203SABP-050 processor processes using 250 MHz clock and on chip memory of 6 M bits and operates at 1.2V/3.3 V
• Low voltage design helps in processing with little power dissipation.
• Analog Devices claims the highest performance per Watt.
TigerSHARC• Multiple TigerSHARCs can connect by serial
communication at 1 GBps.• A TigerSHARC version has 24M bits ON-chip
memory.• Two ALUs and twos set of address and data buses
for data memory• On set of address and data buses for program
memory• TigerSHARC is available as IP core also so that new
applications and enhancements can be developed.
ArithmaticFn = Fx + FyFn = Fx-FyFn = ABS(Fx + Fy)Fn = ABS(Fx-Fy)Fn=(Fx + Fy)/2COMP(Fx,Fy)Fn = -FxRn = Rx + Ry + CI
AddSubtractAbsolute value of sumAbsolute value of differenceAverageCompareNegateAdd with Carry
ArithmaticFn = ABSFxFn = PASS FxFn = RND FxFn = SCALE Fx by RyRn = MANX FxRn = LOGB FxRn = FIX Fx,
Fn = FLOAT Rx by Ry
Absolute valueCopyFx to FnRoundScale exponent of Fx by RyExtract mantissa of FxConvert exponent of Fx to integerConvert floating-point to integerConvert integer to floating-point
ArithmaticFn = RECIPS FxFn = RSQRTS Fx
Fn = Fx COPYSIGN FyFn = MIN(Fx.Fy)Fn = MAX(Fx,Fy)Fn = CLIP Fx by Fy
Create seed for reciprocalCreate seed for reciprocal square rootCopy sign of Fy to FxMinimum of Fx, FyMaximum of Fx, FyClip Fx within range [-Fy,Fy]
LogicalRn = LSHIFT Rx by RyRn = Rn OR LSHIFT Rx by RyRn=ASHIFT Rx by RyRn = Rn OR ASHIFT Rx by RyRn = ROT Rx by RyRn = BCLR Rx by RyRn = BSET Rx by RyRn = BTGL Rx by Ry
Logical shift distance RyLogical shift and logical ORArithmetic shiftArithmetic shift and logical ORRotate distance RyClear one bit in RxSet one bit in RxToggle one bit in Rx
LogicalBTST Rx by RyRn = FDEP Rx by RyRn = Rn OR FDEP Rx by RyRn = FDEP Rx by RyRn = Rn OR FDEP Rx by RyRn = FEXT Rx by RyRn = FEXT Rx by RyRn = EXP Rx
Test one bit in RxDeposit field from Rx into RnDeposit field from Rx using ORDeposit and sign extend field from RxDeposit and sign extend using ORExtract field from RxExtract and sign extend field from RxExtract exponent field
Operations
Rn = EXP Rx (EX)Rn = LEFTZ RxRn = LEFTO RxRn = FPACK Fx
Fx = FUNPACK Rn
Extract exponent field from ALUExtract number of leading OsExtract number of leading IsConvert 32-bit floating-point to 16-bit floating-pointConvert 16-bit floating-point to 32-bit floating-point
Load Value• R0 = DM (0x2000000); • R0 = DM(_a);
loads R0 the contents of the variable a• DM(_a) = R0;
stores R0 into memory location
Little Endian and Big Endian in a Memory OrganizationLittle Endian: The LSB Byte (8-bit) of a word of 32 bit is
written into the lowest address byte, and the other bytes are written in increasing order of significance.
Word (32 bit): 0x90ABCDEF Word Address: 0x1000Address0x1000 0x1001 0x1002 0x1003Little EF CD AB 90
Big Endian: The byte order is revered. The MSB Byte (8-bit) of a word of 32 bit is written into the lowest address byte, and the other bytes are written in decreasing order of significance.
Word (32 bit): 0x90ABCDEF Word Address: 0x1000Address0x1000 0x1001 0x1002 0x1003Big 90 AB CD EF
Princeton Architecture• 80x86 processors and ARM7 have Princeton
architecture for main memory. 8051-family microcontrollers have Harvard architecture.). Vectors and pointers, variables, program segments and memory blocks for data and stacks have different addresses in the program in Princeton memory architecture.
Harvard architecture• When the address spaces for the data and for
program are distinct• Handling streams of data that are required to be
accessed in cases of single instruction multiple data type instructions and DSP instructions.
• Separate data buses ensure simultaneous accesses for instructions and data.
• Program segments and memory blocks for data and stacks have separate set of addresses in Harvard architecture.
• Control signals and read-write instructions are also read-write instructions are also separate for accessing the program memory and data memory.