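Lab3-1 Ming-Feng Chang (張明峰), Department of Computer Science and Information Engineering, National Chiao Tung University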
Lab 3: FPGA Implementation
Specification
RTL Design and Simulation
Logic Synthesis
Gate-Level Simulation
ASIC Layout / FPGA Implementation
Why Top-Down?
• Design of complex systems
• Reduce time-to-market
  – shorten the design verification loop
  – focus on functionality
• Easier and cheaper to explore different design options
RTL Design
• Characteristics
  – fully clock-driven RTL code with some behavioral constructs
  – contains a complete functional description
  – cycle accurate
• Coding style
  – structural description (component connections/netlist)
  – data flow description (continuous assignment)
  – RTL description (always block)
    • combinational RTL
    • sequential RTL
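The coding styles listed above can be compared on one small function. The following is a minimal sketch that describes a 2:1 multiplexer in each style; the module and port names are illustrative assumptions, not taken from the slides.

```verilog
// Structural description: gate primitives connected by nets.
module mux2_structural (input a, input b, input sel, output y);
  wire sel_n, t0, t1;
  not u0 (sel_n, sel);
  and u1 (t0, a, sel_n);
  and u2 (t1, b, sel);
  or  u3 (y, t0, t1);
endmodule

// Data flow description: a single continuous assignment.
module mux2_dataflow (input a, input b, input sel, output y);
  assign y = sel ? b : a;
endmodule

// RTL description, combinational: always block sensitive to all inputs.
module mux2_rtl (input a, input b, input sel, output reg y);
  always @(*) begin
    if (sel) y = b;
    else     y = a;
  end
endmodule

// RTL description, sequential: the same selection, registered on clk.
module mux2_reg (input clk, input a, input b, input sel, output reg q);
  always @(posedge clk)
    q <= sel ? b : a;
endmodule
```

The first three modules describe the same combinational function; only the last one holds state.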
Logic Synthesis
• Translate synthesizable RTL code to gate-level design
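// Module header added so the fragment compiles; the name and 1-bit ports are assumed.
module sel_reg (input clk, input sel1, sel2, sel3, sel4,
                input in1, in2, in3, in4, output reg out);
  always @(posedge clk) begin
    if (sel1) begin
      if (sel2) out <= in1;
      else      out <= in2;
    end
    else if (sel3) begin
      if (sel4) out <= in3;
      else      out <= in4;
    end
    // neither sel1 nor sel3 set: out keeps its previous value
  end
endmodule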
Gate-level circuits
Structural Mapping
Resource Sharing
• Example
    if (op_code == 0)
      r = a + c ;
    else
      r = a + b ;
• Sharing
  – a single ALU for the two additions
  – a MUX for the second input of the ALU
• No sharing
  – two adders for the two additions
  – an output MUX to select the output
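The two implementations can be written out explicitly. This is a sketch under assumptions: the 8-bit default width and the module names are hypothetical, and a real synthesis tool makes its own sharing decision from its constraints regardless of how the source is written.

```verilog
// Sharing: one adder, with a MUX on its second input.
module add_shared #(parameter W = 8) (
  input  [W-1:0] a, b, c,
  input          op_code,
  output [W-1:0] r
);
  wire [W-1:0] operand = (op_code == 1'b0) ? c : b;  // input MUX
  assign r = a + operand;                            // single adder
endmodule

// No sharing: two adders, with a MUX on the output.
module add_unshared #(parameter W = 8) (
  input  [W-1:0] a, b, c,
  input          op_code,
  output [W-1:0] r
);
  wire [W-1:0] sum_c = a + c;                        // first adder
  wire [W-1:0] sum_b = a + b;                        // second adder
  assign r = (op_code == 1'b0) ? sum_c : sum_b;      // output MUX
endmodule
```

Both modules compute the same result. The shared version trades one adder for a MUX in front of the adder, which typically saves area at the cost of a longer path from op_code to r.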
Register Inferencing
• Determines which signals must be preserved across cycle boundaries
  – incomplete logic specification (missing branches)
  – explicit register instantiation
• always @(posedge clk)
  – signal used before assigned
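The cases above can be illustrated with a few small modules. This is a sketch with hypothetical module names; the blocking assignments in the last module are used only to show the "used before assigned" effect and are not a recommended style for clocked blocks.

```verilog
// Missing else branch in a combinational block: q must hold its value
// while en is 0, so a level-sensitive latch is inferred.
module latch_inferred (input en, input d, output reg q);
  always @(*) begin
    if (en) q = d;
  end
endmodule

// Every branch assigns q: purely combinational logic, no storage.
module comb_complete (input en, input d, output reg q);
  always @(*) begin
    if (en) q = d;
    else    q = 1'b0;
  end
endmodule

// Edge-triggered block: q is inferred as a flip-flop.
module dff_inferred (input clk, input d, output reg q);
  always @(posedge clk)
    q <= d;
endmodule

// t is read before it is assigned, so the value read comes from the
// previous clock edge: t needs its own flip-flop, and q lags d by two cycles.
// Swapping the two statements would make t a plain wire and leave one flip-flop.
module used_before_assigned (input clk, input d, output reg q);
  reg t;
  always @(posedge clk) begin
    q = t;
    t = d;
  end
endmodule
```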
Two-level Logic Optimization
• AND-OR representations
  – easy implementation as PLAs and PLDs
  – a key optimization technique
  – efficient algorithms and heuristics exist
  – in commercial use for several years
  – minimize the number of product terms
• Example
  – F = XYZ + XY’Z’ + XY’Z + X’YZ + XYZ
  – F = XY’ + YZ
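One way to confirm that the two forms of F in the example are the same function is an exhaustive simulation over all eight input combinations. The testbench below is a minimal sketch; the module and signal names are illustrative.

```verilog
module two_level_check;
  reg x, y, z;
  integer i, errors;

  // Original sum of products (the term XYZ appears twice, as on the slide).
  wire f_sop = (x & y & z) | (x & ~y & ~z) | (x & ~y & z) | (~x & y & z) | (x & y & z);
  // Minimized form with two product terms.
  wire f_min = (x & ~y) | (y & z);

  initial begin
    errors = 0;
    for (i = 0; i < 8; i = i + 1) begin
      {x, y, z} = i;          // the low three bits of i drive the inputs
      #1;                     // let the continuous assignments settle
      if (f_sop !== f_min) begin
        errors = errors + 1;
        $display("mismatch at xyz = %b", {x, y, z});
      end
    end
    if (errors == 0)
      $display("F and its minimized form agree on all 8 input combinations");
    $finish;
  end
endmodule
```

XY’ covers the minterms XY’Z’ and XY’Z, and YZ covers XYZ and X’YZ, so the simulation reports no mismatches.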
Multi-Level Logic Optimization
• Meet performance or area constraints through restructuring and simplification
  – two-level minimization
  – common factor extraction
  – common expression re-substitution
• Trade-off between area and delay
• In commercial use for several years
• Example, before restructuring
  – f1 = abcd+abce+ab’cd’+ab’c’d+a’c+cdf+abc’d’e’+ab’c’df’
  – f2 = bdg + b’dfg + b’d’g+bd’eg
• Example, after extracting the common expression x
  – f1 = c(a’+x)+ac’x’
  – f2 = gx
  – x = d(b+f) + d’(b’+e)
Transformation Examples
• Algebraic Factoring
    F = A’C’D’ + A’BC’ + ABC + ACD’        G = 16
  – Factoring:
    F = A’(C’D’ + BC’) + A(BC + CD’)       G = 16
  – Factoring again:
    F = A’C’(B + D’) + AC(B + D’)          G = 12
  – Factoring again:
    F = (A’C’ + AC)(B + D’)                G = 10
Transformation Examples
• Decomposition
  – The terms B + D’ and A’C’ + AC can be defined as new functions E and H respectively, decomposing F:
    F = E H,  E = B + D’,  and H = A’C’ + AC        G = 10
• This series of transformations has reduced G from 16 to 10, a substantial savings. The resulting circuit has three levels plus input inverters.
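The effect of decomposition on circuit structure can be sketched as two modules. This assumes the complemented literals lost in extraction are as follows: the two-level starting point is F = A’C’D’ + A’BC’ + ABC + ACD’, and the decomposed functions are E = B + D’ and H = A’C’ + AC. Module names are hypothetical.

```verilog
// Two-level form: four 3-input ANDs feeding one 4-input OR (G = 16).
module f_two_level (input a, b, c, d, output f);
  assign f = (~a & ~c & ~d) | (~a & b & ~c) | (a & b & c) | (a & c & ~d);
endmodule

// Decomposed form: F = E H (G = 10, input inverters not counted).
module f_decomposed (input a, b, c, d, output f);
  wire e = b | ~d;                // E = B + D'     : one 2-input OR
  wire h = (~a & ~c) | (a & c);   // H = A'C' + AC  : two 2-input ANDs, one 2-input OR
  assign f = e & h;               // F = E H        : one 2-input AND
endmodule
```

The decomposed version has three gate levels between the inputs and f, compared with two in the original, which is the area versus delay trade-off the slides describe.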
Transformation Examples
• Substitution of E into F
  – Returning to F just before the final factoring step:
    F = A’C’(B + D’) + AC(B + D’)        G = 12
  – Defining E = B + D’, and substituting in F:
    F = A’C’E + ACE        G = 10
  – This substitution has resulted in the same cost as the decomposition
Transformation Examples
• Elimination
  – Beginning with a new set of functions:
    X = B + C
    Y = A + B
    Z = A’X + C Y        G = 10
  – Eliminating X and Y from Z:
    Z = A’(B + C) + C (A + B)        G = 10
  – “Flattening” (converting to SOP expression):
    Z = A’B + A’C + AC + BC        G = 12
  – This has increased the cost, but has provided a new SOP expression for two-level optimization.
Transformation Examples
• Two-level Optimization
  – The result of 2-level optimization is:
    Z = A’B + C        G = 4
• This example illustrates that:
  – Optimization can begin with any set of equations, not just with minterms or a truth table
  – Increasing gate input count G temporarily during a series of transformations can result in a final solution with a smaller G
Transformation Examples
• Extraction
  – Beginning with two functions:
    E = A’B’D’ + A’BD
    H = B’CD’ + BCD        G = 16
  – Finding a common factor and defining it as a function:
    F = B’D’ + BD
  – We perform extraction by expressing E and H as the three functions:
    F = B’D’ + BD,  E = A’F,  H = CF        G = 10
  – The reduced cost G results from the sharing of logic between the two output functions
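Extraction maps naturally onto a module with two outputs that share one internal net. The sketch below assumes the complemented literals lost in extraction are E = A’B’D’ + A’BD, H = B’CD’ + BCD, and F = B’D’ + BD; the module names are hypothetical.

```verilog
// Before extraction: each output has its own two-level logic (G = 16).
module eh_separate (input a, b, c, d, output e, output h);
  assign e = (~a & ~b & ~d) | (~a & b & d);
  assign h = (~b & c & ~d) | (b & c & d);
endmodule

// After extraction: the common factor F is built once and used twice (G = 10).
module eh_extracted (input a, b, c, d, output e, output h);
  wire f = (~b & ~d) | (b & d);   // F = B'D' + BD, i.e. B XNOR D
  assign e = ~a & f;              // E = A'F
  assign h = c & f;               // H = CF
endmodule
```

The saving comes entirely from the shared net f: its six gate inputs are paid for once instead of once per output.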
Technology Mapping
• Translation of a technology-independent representation of a circuit into a circuit in a given technology with optimal cost
• Optimization criteria
  – minimum area
  – minimum delay
  – meeting specified timing constraints
  – meeting specified timing constraints with minimum area
• Usage
  – Technology mapping after technology-independent logic optimization
Sample covers
State Machine Synthesis
• Translate state table or graph
  – state minimization
  – state assignment to minimize the cost function
• Challenges
  – state machine decomposition
  – state assignment for performance
  – state assignment for testability
  – extract state graph from implementation
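State assignment is easiest to see in code where the encoding is isolated in named constants. The following is a minimal sketch of a three-state Moore machine; the states, ports, and behavior are invented for illustration and do not come from the slides.

```verilog
module fsm_sketch (
  input  clk, rst, start, finish,
  output busy
);
  // State assignment: binary encoding, two flip-flops for three states.
  localparam [1:0] IDLE = 2'b00, RUN = 2'b01, DONE = 2'b10;
  // A one-hot assignment would instead use three flip-flops, for example
  // IDLE = 3'b001, RUN = 3'b010, DONE = 3'b100, with a 3-bit state register.

  reg [1:0] state, next_state;

  // State register with synchronous reset.
  always @(posedge clk) begin
    if (rst) state <= IDLE;
    else     state <= next_state;
  end

  // Next-state logic taken directly from the state graph.
  always @(*) begin
    next_state = state;                     // default: stay in place
    case (state)
      IDLE:    if (start)  next_state = RUN;
      RUN:     if (finish) next_state = DONE;
      DONE:    next_state = IDLE;
      default: next_state = IDLE;           // recover from the unused code 2'b11
    endcase
  end

  // Moore output: depends on the current state only.
  assign busy = (state == RUN);
endmodule
```

Because the rest of the module refers to states only by name, changing the encoding changes the cost of the next-state and output logic without touching the state graph, which is the degree of freedom a state assignment step explores.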
Spartan-II Features
• Plentiful logic and memory resources
  – 15K to 200K system gates (up to 5,292 logic cells)
  – Up to 57 Kb block RAM storage
• Flexible I/O interfaces
  – From 86 to 284 I/Os
  – 16 signal standards
• Advanced 0.25/0.22 µm 6-layer metal process
• High performance
  – System frequency as high as 200 MHz
• Advanced clock control with 4 dedicated DLLs
• Unlimited re-programmability
• Fully PCI compliant
Spartan-II Top-level Architecture
• Configurable logic blocks
  – Implement logic here!
• I/O blocks
  – Communicate with other chips
  – Choose from 16 signal standards
• Block RAM
  – On-chip memory for higher performance
Spartan-II Top-level Architecture
• Clocks and delay-locked loops
  – Synchronize to clock on and off chip
• Rich interconnect resources
  – Three-state internal buses
• Power-down mode
  – Lower quiescent power
CLB Slice (Simplified)
• 1 CLB holds 2 slices
• Each slice contains two sets of the following:
  – Four-input LUT
    • Any 4-input logic function
    • Or 16-bit x 1 RAM
    • Or 16-bit shift register
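A four-input LUT can realize any 4-input function because it is a 16-entry truth table addressed by its inputs. The behavioral model below is a sketch of that idea only; the module name and the INIT parameter are illustrative and this is not the vendor's LUT primitive.

```verilog
// Behavioral model of a 4-input LUT: the four inputs select one of 16 stored bits.
module lut4_model #(parameter [15:0] INIT = 16'h8000) (
  input  [3:0] addr,
  output       y
);
  wire [15:0] truth = INIT;   // bit i holds the output for input value i
  assign y = truth[addr];
endmodule

// The same structure with two different truth tables.
module lut4_demo (input [3:0] in, output y_and, output y_xor);
  // 4-input AND: only entry 15 (all inputs high) is 1.
  lut4_model #(.INIT(16'h8000)) u_and (.addr(in), .y(y_and));
  // 4-input XOR (odd parity): entries with an odd number of 1s in the index.
  lut4_model #(.INIT(16'h6996)) u_xor (.addr(in), .y(y_xor));
endmodule
```

Since only the stored bits differ between the two instances, every 4-input function costs the same single LUT, and the same 16 storage bits are what allow the LUT to serve as a 16 x 1 RAM.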
CLB Slice (cont’d)
• Each slice contains two sets of the following:
  – Carry & control
    • Fast arithmetic logic
    • Multiplier logic
    • Multiplexer logic
  – Storage element
    • Latch or flip-flop
    • Set and reset
    • True or inverted inputs
    • Sync. or async. control
Dedicated Expansion Multiplexers
[Figure: a CLB containing two slices; each slice has two LUTs feeding a MUXF5, and a MUXF6 combines the two slices]
• MUXF5 combines 2 LUTs to form
  – 4x1 multiplexer
  – Or any 5-input function
• MUXF6 combines 2 slices to form
  – 8x1 multiplexer
  – Or any 6-input function
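The 4x1 multiplexer case can be written so that its structure mirrors the two LUTs plus MUXF5 arrangement. This is a sketch with hypothetical names; whether a given synthesis tool actually maps it onto MUXF5 depends on the tool and its settings.

```verilog
module mux4 (input [3:0] d, input [1:0] sel, output y);
  // Each 2:1 half has three inputs (two data, one select),
  // so each fits in one four-input LUT.
  wire lo = sel[0] ? d[1] : d[0];
  wire hi = sel[0] ? d[3] : d[2];

  // The final 2:1 selection between the two LUT outputs is the
  // role the slide assigns to MUXF5.
  assign y = sel[1] ? hi : lo;
endmodule
```

The complete multiplexer has six inputs (four data, two select), more than a single four-input LUT can take, which is why the dedicated multiplexer is needed to finish it inside one slice.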
I/O Block (Simplified)
• Registered input, output, 3-state control
• Programmable slew rate, pull-up, pull-down, keeper and input delay
I/O Interface Standards
• I/O can be programmed for 16 different signal standards
  – VCCO controls maximum output swing
  – VREF sets input, output, three-state control
• Different banks can support different standards at the same time
  – Logic level translation
  – Boards with mixed standards
IOBs Organized As Independent Banks
• As many as eight banks on a device
  – Package dependent
• Each bank can be assigned any of the 16 signal standards
• XC2S50
  – GCK 0: pin 80
  – GCK 1: pin 77
  – GCK 2: pin 182
  – GCK 3: pin 185
High Performance Routing
[Figure: CLB array with 2 ns routing delays marked]
• Hierarchical routing
  – Singles, hexes, longs
• Sparse connections on longer interconnects for high speed
• Routing delay depends primarily on distance
  – Direction independent
  – Device-size independent
• Predictable for early design analysis
Power-down Mode
• Controlled by single power-down pin
• All inputs blocked, appear low internally
• All outputs disabled
• All register states preserved
• Power-down status pin
• Synchronous wake up
• 100 µA typical
Configuration Modes
There are four ways to program a Spartan-II FPGA:
• Slave Serial
  – Config. data format: serial
  – Direction of synchronizing clock: FPGA receives CCLK
  – Use: a processor, CPLD, or another FPGA (in Master mode) controls configuration of the slave FPGA. Also for configuring multiple slave FPGAs in a daisy chain (2nd, 3rd FPGA, etc.).
• Master Serial
  – Config. data format: serial
  – Direction of synchronizing clock: FPGA generates CCLK
  – Use: the FPGA in Master mode configures itself from a serial PROM. Also, the 1st FPGA (master) in a daisy chain controls configuration of the slave FPGA(s) in the chain.
• Slave Parallel
  – Config. data format: byte
  – Direction of synchronizing clock: FPGA receives CCLK
  – Use: a processor or CPLD controls the fast configuration of the slave FPGA.
• JTAG
  – Config. data format: serial
  – Direction of synchronizing clock: FPGA receives TCK
  – Use: makes use of the existing boundary scan port.
Spartan-II Family Overview
Device           XC2S15   XC2S30   XC2S50   XC2S100  XC2S150  XC2S200
Logic Cells      432      972      1728     2700     3888     5292
Block RAM Bits   16,384   24,576   32,768   40,960   49,152   57,344
Block RAM Qty.   4        6        8        10       12       14
Max. User I/Os   86       132      176      196      260      284
Packages         VQ100    VQ100    -        -        -        -
                 CS144    CS144    -        -        -        -
                 TQ144    TQ144    TQ144    TQ144    -        -
                 -        PQ208    PQ208    PQ208    PQ208    PQ208
                 -        -        FG256    FG256    FG256    FG256
                 -        -        -        FG456    FG456    FG456
Spartan-II Architecture Summary
• Delivers all the key requirements for ASIC replacement
– 200,000 gates
– 200 MHz
– Flexible I/O interfaces
– On-chip distributed and block RAM
– Clock management
– Low power
– Complete development system support
Xilinx ISE 8
• Integrated Software Environment
Foundation Project Manager
• Integrates all tools into one environment
Schematic Entry
State Machine Graphical Editor
Graphical editor synthesizes into ABEL or VHDL code
Simulation - Easy to Use and Learn
• Generate stimulus easily and quickly
  – Keyboard toggling
  – Simple clock stimulus
  – Custom formulas
• Easy debugging
  – Waveform viewer
  – Signals easily added and removed
  – Simulator access from schematic
  – Color-coded values on schematic
• Script Editor
What is Implementation?
• More than just “Place & Route”
• Implementation includes many phases
  – Translate: Merge multiple design files into a single netlist
  – Map: Group logical symbols from the netlist (gates) into physical components (CLBs and IOBs)
  – Place & Route: Place components onto the chip, connect them, and extract timing data into reports
  – Timing (Sim): Generate a back-annotated netlist for timing simulation tools
  – Configure: Generate a bitstream for device configuration
Terminology
• Project
  – Source file; has a defined working directory and family
• Version
  – A Xilinx netlist translation of the schematic
  – Multiple versions result from iterative schematic changes
• Revision
  – An implementation of a Xilinx netlist
  – Multiple revisions typically result from different options
• Part type
  – Specified at translation; can be changed in a new revision
Starting the Flow Engine
Foundation Project Manager
LP-2900-XC2S50PQ208
FPGA – XC2S50
Data Switches
7-segment & LED
Keyboard
8x8 LED
8051
Lab 3: 7-Segment Display & LED
• Input: two 4-bit numbers num1, num2 (push buttons sw1~sw8)
• Show the numbers on the 7-segment display (active high)
• Compare num1 and num2, and use LEDs to show the result
  – LED4 = 1 when num1 > num2
  – LED6 = 1 when num1 == num2
  – LED8 = 1 when num1 < num2
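[Block diagram: SW[1:4] and SW[5:8] each feed a 7-segment decoder and one input of a 7485 4-bit comparator; the comparator cascade inputs are tied to {agb,alb,aeb} = 3'b001, and its outputs are agbo, albo, aebo]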
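The block diagram can be sketched in Verilog as a comparator plus two decoders. Everything board-specific here is an assumption: the port names, the segment order {a,b,c,d,e,f,g}, the mapping of switches to num1 and num2, and the use of one display per number (the actual board may multiplex its displays). The comparator is a simplified model of the 7485 that passes the cascade inputs through when the operands are equal, which matches the real part for the 3'b001 tie-off used here but not for every cascade input combination.

```verilog
// Simplified 7485-style 4-bit magnitude comparator with cascade inputs.
module comp4 (
  input  [3:0] a, b,
  input        agb, alb, aeb,       // cascade inputs
  output       agbo, albo, aebo     // cascade outputs
);
  assign agbo = (a > b) | ((a == b) & agb);
  assign albo = (a < b) | ((a == b) & alb);
  assign aebo = (a == b) & aeb;
endmodule

// Hexadecimal to 7-segment decoder, active high, seg = {a,b,c,d,e,f,g}.
module seg7_dec (input [3:0] num, output reg [6:0] seg);
  always @(*) begin
    case (num)
      4'h0: seg = 7'b1111110;
      4'h1: seg = 7'b0110000;
      4'h2: seg = 7'b1101101;
      4'h3: seg = 7'b1111001;
      4'h4: seg = 7'b0110011;
      4'h5: seg = 7'b1011011;
      4'h6: seg = 7'b1011111;
      4'h7: seg = 7'b1110000;
      4'h8: seg = 7'b1111111;
      4'h9: seg = 7'b1111011;
      4'hA: seg = 7'b1110111;
      4'hB: seg = 7'b0011111;
      4'hC: seg = 7'b1001110;
      4'hD: seg = 7'b0111101;
      4'hE: seg = 7'b1001111;
      4'hF: seg = 7'b1000111;
    endcase
  end
endmodule

// Top level: two decoders and one comparator, as in the block diagram.
module lab3_top (
  input  [3:0] num1, num2,          // assumed to come from sw1~sw4 and sw5~sw8
  output [6:0] seg1, seg2,
  output       led4, led6, led8
);
  seg7_dec u_dec1 (.num(num1), .seg(seg1));
  seg7_dec u_dec2 (.num(num2), .seg(seg2));

  // Cascade inputs tied to {agb,alb,aeb} = 3'b001 so equality reports on aebo.
  comp4 u_cmp (
    .a(num1), .b(num2),
    .agb(1'b0), .alb(1'b0), .aeb(1'b1),
    .agbo(led4),                    // LED4: num1 > num2
    .albo(led8),                    // LED8: num1 < num2
    .aebo(led6)                     // LED6: num1 == num2
  );
endmodule
```

The case statement covers all 16 input values, so the decoder stays purely combinational and no latch is inferred. Pin assignments for the switches, segments, and LEDs would come from the board documentation and a constraints file, neither of which is reproduced here.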
Example