
Rapidly Developing Complex Embedded Applications with SDSoC

May 2017


© Copyright 2017 Xilinx.

What Is SDSoC

• SoC application-like programming
• Tools and IP for system-level profiling
• Rapid system performance estimation by fast SW/HW partitioning and implementation
• Full system optimizing compiler

[Figure: SDSoC flow. C/C++ development, system-level profiling, specifying C/C++ functions for acceleration, then the full system optimizing compiler, which produces ARM code (main(), via GCC), connectivity, and an accelerator (func(), via Vivado).]


“With SDSoC, I was able to complete a full Zynq design in 4 days. Then, I did the same design without SDSoC and took me 3 weeks with the same QoR.”

Daniele B, Xilinx DSP Specialist for EMEA

[Figure: 21 days without SDSoC versus 4 days with SDSoC.]


Typical SoC Development Flow and Challenges

APP(){
  funcA();
  funcB();
  funcC();
}

[Figure: two open questions when mapping APP() onto the device. HW-SW partition: funcA in the Processing System (PS); funcB and funcC in the Programmable Logic (PL). HW-SW connectivity: datamover, PS-PL interfaces, and SW drivers between them.]

System Team
• Hard to know the final system bottleneck during early system design

Hardware Team
• Manually translating algorithms to HDL takes a long time
• Manually implementing the data mover network takes time

Software Team
• Write drivers for the register control interface
• Write and debug the DMA device driver


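Before SDSoC: HW-SW Partition Exploration

Exploring the architecture involves multiple disciplines.

[Figure: starting from a HW-SW partition spec, each layer across PS and PL is built with its own tool and language, and the result is checked against requirements ("Met req?").]

Application: SDK (C/C++)
Driver: SDK, OS tools (C)
Datamover, PS-PL interface: IP Integrator (IPI project)
IP: Vivado, HLS (Verilog, VHDL)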


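After SDSoC: Automatic System Generation

C/C++ to system in hours or days

func1(); <- SW
func2(); <- HW
func3(); <- HW

[Figure: write the application in C/C++ and select functions for the PL. The application, driver, datamover, PS-PL interface, and IP across PS and PL are generated, and the result is checked against requirements ("Met req?").]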
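The func1/func2/func3 split above can be pictured with a small sketch. This is an illustration only: the function bodies, the buffer length N, and run_app() are hypothetical and not from the slides. The point it tries to show is that the calling code looks the same whether a function runs in software or is selected for the PL; the selection itself happens in the tool, not in this source.

```cpp
#include <cstddef>

constexpr std::size_t N = 8;  // hypothetical buffer length

// func1 stays in software on the PS: it fills the input buffer.
void func1(int in[N]) {
    for (std::size_t i = 0; i < N; ++i) in[i] = static_cast<int>(i);
}

// func2 is a candidate for the PL: a simple element-wise scale.
void func2(int in[N], int out[N]) {
    for (std::size_t i = 0; i < N; ++i) out[i] = in[i] * 2;
}

// func3 is a candidate for the PL: a simple element-wise offset.
void func3(int in[N], int out[N]) {
    for (std::size_t i = 0; i < N; ++i) out[i] = in[i] + 1;
}

// The application calls all three functions the same way, regardless
// of where each one ends up running.
int run_app() {
    int a[N], b[N], c[N];
    func1(a);
    func2(a, b);
    func3(b, c);
    int sum = 0;
    for (std::size_t i = 0; i < N; ++i) sum += c[i];
    return sum;
}
```

Compiled with an ordinary C++ compiler, everything runs in software, which is also the natural starting point before any function is moved to hardware.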


SDSoC: Makes Everyone More Productive

System Team
• Explore HW-SW partitions and architecture rapidly in C/C++

Hardware Team
• No need to translate algorithms to HDL
• No need to implement data mover network
• Build IO system into a re-usable platform

Software Team
• Now I can accelerate my code in HW
• No need to write device driver


Find Optimal Solution with Fast Iterations

System Team

• One click to toggle between SW and HW implementation
• One-line #pragma to change the data port and data mover type

All functions in SW:

func1(); <- SW
func2(); <- SW
func3(); <- SW

func2 and func3 moved to HW (ACP, DMA):

func1(); <- SW
func2(); <- HW
func3(); <- HW

With a #pragma SDS line added (HP, SGDMA):

func1(); <- SW
#pragma SDS
func2(); <- HW
func3(); <- HW
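As a rough sketch of the "one-line #pragma" idea, the snippet below annotates a hardware candidate with SDS data pragmas that request a high-performance port and a scatter-gather DMA. The pragma spellings (sys_port with AFI, data_mover with AXIDMA_SG, access_pattern) are given as recalled from the SDSoC user guide and should be checked against the tool version in use. The function name func2_scaled, the length LEN, and checksum() are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t LEN = 16;  // hypothetical transfer length

// Assumed pragma spellings; an ordinary C++ compiler ignores them.
// Changing AFI to ACP, or AXIDMA_SG to AXIDMA_SIMPLE, would be the
// kind of one-line change the slide refers to.
#pragma SDS data sys_port(in:AFI, out:AFI)
#pragma SDS data data_mover(in:AXIDMA_SG, out:AXIDMA_SG)
#pragma SDS data access_pattern(in:SEQUENTIAL, out:SEQUENTIAL)
void func2_scaled(int in[LEN], int out[LEN]) {
    for (std::size_t i = 0; i < LEN; ++i) out[i] = in[i] * 3;
}

// Software-side caller; it does not change when the pragmas change.
int checksum() {
    int in[LEN], out[LEN];
    for (std::size_t i = 0; i < LEN; ++i) in[i] = static_cast<int>(i);
    func2_scaled(in, out);
    int sum = 0;
    for (std::size_t i = 0; i < LEN; ++i) sum += out[i];
    return sum;
}
```

Because the pragmas only steer how the data is moved, the functional result is the same for every port and data mover choice, which is what makes quick iteration between variants practical.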


Look into Details by Event Trace Tool

System Team

• Enabled with one click at compile time

• Differentiates among SW, data mover, and accelerator IP activity


Automatic HW & SW Connection

Software Team, Hardware Team

• Auto-generated register access device driver
• Auto-generated data motion DMA driver
• Easy to integrate RTL-based IP into SDSoC using C-callable IP

[Figure: PS and PL block diagram with the application, driver, datamover, PS-PL interface, and HDL IP.]


Architecture-aware Algorithm Implementation

Hardware Team

Traditional Flow: Algorithm spec in C/C++ → Functional specification → Translate to RTL → Optimize RTL

SDSoC Flow: Algorithm spec in C/C++ → Optimize C/C++

• Manual RTL translation is not required
• Algorithm and HW teams work together to optimize IP
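To give a feel for what "Optimize C/C++" can mean in practice, here is a minimal sketch that uses a Vivado HLS directive inside ordinary C++ instead of rewriting the loop in RTL. The kernel (a 4-tap dot product), the names dot_product and dot_product_demo, and the choice of PIPELINE with II=1 are illustrative assumptions, not taken from the slides.

```cpp
constexpr int TAPS = 4;  // hypothetical kernel size

// A small multiply-accumulate kernel. The HLS pragma asks the
// synthesis tool to pipeline the loop with an initiation interval
// of 1; an ordinary C++ compiler ignores it.
int dot_product(const int x[TAPS], const int c[TAPS]) {
    int acc = 0;
    for (int i = 0; i < TAPS; ++i) {
#pragma HLS PIPELINE II=1
        acc += x[i] * c[i];
    }
    return acc;
}

// Software check of the kernel with fixed inputs.
int dot_product_demo() {
    const int x[TAPS] = {1, 2, 3, 4};
    const int c[TAPS] = {4, 3, 2, 1};
    return dot_product(x, c);
}
```

The algorithm stays readable C++ that the algorithm team can still run and test in software, while the hardware team tunes the directives.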


Easy to Create Base Platform for Custom Boards

Software Team, Hardware Team

Platform = Vivado project + Bootable software images

– HW: define AXI interfaces and interrupts in the Vivado project using Tcl

– SW: define boot images (FSBL, kernel, rootfs, etc.) in the GUI tool

[Figure: platform spanning the Processing System (PS), with application and driver, and the Programmable Logic (PL), with interface IPs, connected by the AXI bus.]


Easy to Create Base Platform

Platform = Vivado project + Bootable software images

SDSoC-generated IP can hook to the AXI bus in the base platform

Software Team, Hardware Team

[Figure: the base platform (application, driver, interface IPs, AXI bus) next to the generated part (application, driver, connectivity, and IPs) that attaches to it.]


Examples and Use Cases


Computer Vision Design Example: 4K60 Dense Optical Flow

main(){
  imread(A);
  imread(B);
  denseOpticalFlowPyrltr(A,B,out);
  imshow(out);
}

Xilinx ZU9
Frames/s        60
Power (W)       4.8
Latency (ms)    16.7
Utilization     15%

• Benchmarks do not include the camera inputs and HDMI/DP outputs
• LK dense optical flow, non-pyramidal, non-iterative, window size 53x53

[Figure: ZCU102 EV platform with MIPI input and HDMI output, connected over AXI-S and DMA to the SDSoC-generated denseOpticalFlowPyrltr accelerator in the PL. The PS runs Linux with libraries, application, and drivers, connected over AXI.]


Computer Vision Design Example: Stereo Disparity Map

main(){
  imread(A);
  imread(B);
  stereoRectify(A,B,C,D);
  stereoLBM(C,D,out);
  imshow(out);
}

Xilinx ZU9
Frames/s        140
Power (W)       4.8
Latency (ms)    7.1
Utilization     14%

• SAD-based stereo localBM
• Benchmarks do not include the camera inputs and HDMI/DP outputs

[Figure: ZCU102 EV platform with USB3 input and HDMI output, connected over AXI-S and DMA to the SDSoC-generated Stereo Rectify and Stereo LBM accelerators in the PL. The PS runs Linux with libraries, application, and drivers, connected over AXI.]
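The latency figures in the two computer vision benchmarks are numerically consistent with one frame period at the stated frame rate (1000 ms divided by frames per second). The slides do not say how latency was measured, so the sketch below only reproduces that arithmetic, along with the raw pixel rate of a 3840x2160 stream at 60 frames per second. The function names are hypothetical.

```cpp
// Frame period in milliseconds for a given frame rate.
double frame_period_ms(double frames_per_second) {
    return 1000.0 / frames_per_second;
}

// Raw pixel throughput for a given resolution and frame rate.
long long pixels_per_second(int width, int height, int fps) {
    return static_cast<long long>(width) * height * fps;
}
```

For example, frame_period_ms(60) is about 16.67 ms and frame_period_ms(140) is about 7.14 ms, close to the 16.7 ms and 7.1 ms quoted, and pixels_per_second(3840, 2160, 60) gives 497,664,000 pixels per second.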


Use Case: Surveillance

Zynq UltraScale

• Target device: ZU4EV or ZU5EV
• < 6W total power
• Linux-based OS
• H.264 HP, H.265 encoder

[Figure: block diagram with a 4K60 CMOS sensor, LVDS, ISP, CNN (face detect), overlay, VCU (H.265), PS, DDR3/4, ENET, and DP.]


Use Case: Machine Vision

Zynq UltraScale

• Target device: ZU2 or ZU3
• Linux-based OS

[Figure: block diagram with a 4K60 CMOS sensor, LVDS, ISP, feature extract, color inspect, PS, DDR3/4, USB3, 1G Eth (GigE), and GTH (Coax).]


Use Case: Drone/UAV

Zynq UltraScale

• Target device: ZU4EV, ZU5EV or ZU7EV
• Linux-based OS

[Figure: block diagram with two 2K60 and two 4K60 CMOS sensors over MIPI, ISP, CNN, stereo vision, VCU (H.265), PS, DDR3/4, radio, DP, optical, and ADC.]


Use Case: AR

Zynq UltraScale

• Target device: ZU4EV, ZU5EV or ZU7EV
• Linux-based OS

[Figure: block diagram with one 2K60 and two 4K60 CMOS sensors over MIPI, ISP, gyro/accelerometer, position tracking, CNN (eye tracking), stereo vision, gesture, VCU (H.265), PS, DDR3/4, DP, and WiFi.]


Embedded Vision Development Kits

Base Zynq Board | ZCU102 | ZCU106 | ZC702 | ZC706
Device | ZU9 (16nm) | ZU7 (16nm) | Z7020 (28nm) | Z7045 (28nm)
CPU | Quad Cortex A53 up to 1.5GHz | Quad Cortex A53 up to 1.5GHz | Dual Cortex A9 up to 1.0GHz | Dual Cortex A9 up to 1.0GHz
Peak GOPS @ INT8 | 7857 | 5386 | 571 | 2331
On-chip RAM (Mbits) | 32.1 | 38.0 | 4.9 | 19.1
Inputs | USB3, MIPI, HDMI | USB3, MIPI, HDMI | HDMI* | HDMI*
Outputs | HDMI, DisplayPort | HDMI, DisplayPort | HDMI | HDMI
Video Codec Units | No | 4K60 Encode/Decode | No | No
reVISION Support | xFopencv, xFdnn | xFopencv, xFdnn | xFopencv, xFdnn | xFopencv, xFdnn

Sensor Inputs | Sony IMX274 | Quad OnSemi AR0231 | StereoLab Zed Stereo | eCon camera
Spec | 3840x2160 @ 60 FPS | 1920x1080 @ 30 FPS | 3840x1080 @ 30 FPS | 1920x1080 @ 60 FPS
Interface | MIPI via FMC | MIPI via FMC | USB3 | USB3

* Requires an HDMI IO FMC card


SDSoC 2016.4 Enhancements

Zynq UltraScale+ MPSoC enhancements

– Full support for the ZCU102 platform, with additional clocks

– Increased high-performance (HP) port data width to 128 bits

– Support for custom Zynq UltraScale+ MPSoC platforms

System compiler enhancements

– Support for packed structs and scalar widths up to 1024 bits (was 32 bits)

– Support for HLS dataflow and multi-buffered BRAM-mapped array arguments at the function top level

– Support for SG-DMA on MIG-accessible DDR

HW/SW event trace support for async hardware functions
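For readers unfamiliar with "async hardware functions", the sketch below shows the general shape: a call marked with an async pragma, followed later by a matching wait. The pragma spellings (#pragma SDS async(1) and #pragma SDS wait(1)) are given as recalled from the SDSoC user guide and should be verified against the tool version in use. The function offset_accel, the length COUNT, and run_async_demo() are hypothetical. With an ordinary C++ compiler the pragmas are ignored and the call simply runs synchronously.

```cpp
constexpr int COUNT = 8;  // hypothetical buffer length

// Hypothetical hardware candidate: adds a fixed offset to each element.
void offset_accel(int in[COUNT], int out[COUNT]) {
    for (int i = 0; i < COUNT; ++i) out[i] = in[i] + 10;
}

int run_async_demo() {
    int a[COUNT], b[COUNT];
    for (int i = 0; i < COUNT; ++i) a[i] = i;

    // Assumed pragma: with the function in hardware, control would
    // return to the CPU before the accelerator finishes.
#pragma SDS async(1)
    offset_accel(a, b);

    // CPU work that could overlap with the accelerator.
    int cpu_sum = 0;
    for (int i = 0; i < COUNT; ++i) cpu_sum += a[i];

    // Assumed pragma: block until the call tagged with ID 1 completes,
    // after which b is safe to read.
#pragma SDS wait(1)
    int hw_sum = 0;
    for (int i = 0; i < COUNT; ++i) hw_sum += b[i];

    return cpu_sum + hw_sum;
}
```

An event trace of such a program is where the overlap between CPU work and the accelerator would become visible.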


Application Development

Algorithm Development

Platform Development

CNN: GoogLeNet, SSD, FCN …

DNN


Removing the Barrier to Broad Adoption

[Figure: ease-of-use chart. Axis labels: Time on Apps, Time on Platform, Ease of Use. Elements: Traditional RTL flow (Algorithm to RTL, System Integration, Bitstream Generation); OpenCV and Machine Learning; OpenCV Apps and ML Apps.]


SDSoC Resources

Access SDSoC Portal for learning materials

– http://www.xilinx.com/sdsoc


Documents

SDSoC Basic Usage

– UG1028: SDSoC Tutorial

– UG1127: SDSoC Environment User Guide

– Video: SDSoC Development Environment Demo

Custom Platform Creation

– UG1146: Platform Development Guide

– Video: SDSoC Custom Platform Generation

C-Callable IP

– UG1127 Chapter 5

SDSoC Optimization Guide

– UG1235


Conclusion: Faster Time-to-Market using SDSoC

System Team
• Explore HW-SW partitions and architecture rapidly in C/C++

Hardware Team
• Build IO system into a re-usable platform for the software team
• Help the software team optimize C code rather than manually translating C into RTL

Software Team
• Focus on algorithm function and performance
