© 2011 ANSYS, Inc. May 27, 2016

Introduction to High-Performance Computing with ANSYS CFD

李龍育 (Dragon), CFD Technical Manager, CADMEN (虎門科技)



Taiwan Auto-Design Co.

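CADMEN (虎門科技股份有限公司), founded in 1980, provides customers with the world's highest-quality engineering analysis software, ANSYS, together with technical services.

• Structural strength analysis: ANSYS Mechanical

• Drop-test analysis: ANSYS LS-DYNA

• Cooling and thermal-flow analysis: ANSYS FLUENT, ICEPAK, CFX

• Electromagnetic field analysis: ANSYS Emag, Maxwell

• Coupled multiphysics analysis

Provider of Engineering Solutions and Methodology

• Headquarters: Banqiao District, New Taipei City

• Taichung branch office

• Tainan branch office

CADMEN (虎門科技)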


Wave Front Co., Ltd.

Applications

• Rarefied gas flow analysis

• CVD, Plasma

• Magnetron sputtering

• Evaporation deposition

• Dry etching

Software

• DSMC-Neutrals

• Particle-Plus


TADC: Success Stories

• Semiconductor

• Electronics

• Equipment

• Green Energy

• Chemical Engineering


About ANSYS Advanced Physics Solvers

• Fluid Dynamics: ANSYS FLUENT, ANSYS CFX, ANSYS Icepak

• Electromagnetics: ANSYS HFSS, ANSYS Maxwell, ANSYS Q3D

• Structural Mechanics: ANSYS Mechanical, ANSYS LS-DYNA, ANSYS nCode, ANSYS Acoustics

• Systems and Multiphysics: ANSYS Simplorer

• Also shown: ANSYS Workbench, ANSYS HPC, ANSYS DesignXplorer, ANSYS Engineering Knowledge Manager


Comprehensive Multiphysics

ANSYS Vision for the Future from the Beginning

• Predict complex product performance under real world conditions

• Simulate complete virtual prototypes

Structural Mechanics

Fluid Dynamics

Electromagnetics

Systems and Embedded Software

Aero-Vibro-Acoustics-Coupling


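Requirements for Motor Analysis

• Today's motors are expected to deliver high efficiency, while heat dissipation, structural strength, vibration, and noise receive growing attention. These physical quantities influence one another, so solving the different physics domains together gives a more complete assessment of a motor design.

• ANSYS provides strong capabilities for electric machine design and analysis:

– Electromagnetic performance

– Electric drive performance

– Structural analysis

– Thermal-flow analysis

– Acoustic analysis

• ANSYS coupling technology can transfer electromagnetic forces to multiphysics analyses

[Figure labels: coil temperature distribution; vibration and noise; core loss]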


Why Is HPC Important?

Faster

• Reduce turnaround time

• Consider more design variants

Larger

• Assess larger, more detailed models

• Consider more complex physics

• From single component to system simulations

Easier

• Put powerful computation resources at users’ fingertips

• Efficient decisions earlier in the product development cycle

Time steps done in one day with 16.0.0, oil_rig_7m, Intel Haswell

Cores    Time steps per day
28       2504
56       4944
112      10053
224      19073
448      31476
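
As a rough illustration, the chart's throughput numbers can be converted into speed-up and parallel efficiency relative to the 28-core run. The pairing of core counts with time-step counts is assumed from the order of the values on the slide, and `scaling_table` is a hypothetical helper, not part of any ANSYS tool.

```python
# Time steps completed in one day, keyed by core count (values from the slide;
# the core-to-throughput pairing is an assumption based on their order).
steps_per_day = {28: 2504, 56: 4944, 112: 10053, 224: 19073, 448: 31476}


def scaling_table(data):
    """Return (cores, speedup, efficiency) rows relative to the smallest core count."""
    base_cores = min(data)
    base_rate = data[base_cores]
    rows = []
    for cores in sorted(data):
        speedup = data[cores] / base_rate   # measured throughput gain
        ideal = cores / base_cores          # gain under perfect linear scaling
        rows.append((cores, speedup, speedup / ideal))
    return rows


table = scaling_table(steps_per_day)
for cores, speedup, efficiency in table:
    print(f"{cores:4d} cores: speed-up {speedup:5.2f}x, efficiency {efficiency:6.1%}")
```

Under these assumptions, efficiency stays close to 100% up to 112 cores and falls to roughly 79% at 448 cores.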


Improved Solver Performance & Scaling

Improved parallel scalability:

• Also observed for more typical model sizes at more regular core counts!

• At extreme core counts:

o 86% efficiency for 830M cell case at 36K cores, with species transport

o 80% efficiency for 91M cell case at 16K cores, with complex physics


Improved Parallel Performance & Scaling

Case Details:

• Wave loading on Oil Rig

• Number of cells: 7 Million

• Cell Type: Mixed

• Models used: SST k-omega turbulence

• Solver: Pressure based segregated, VOF, Green-Gauss cell based, unsteady

[Figure: Rating versus Number of Cores (0 to 768), comparing releases 15.0.7 and 16.0.0]


Improved Parallel Performance & Scaling

Case Details:

• External flow over aircraft landing gear

• Number of cells: 15 Million

• Cell Type: Mixed

• Models used: LES + Acoustics

• Solver: Pressure based coupled, Least Square cell based, Unsteady

[Figure: Rating versus Number of Cores (0 to 2048), comparing releases 15.0.7 and 16.0.0]


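Application Example

HPC performance example: transient VOF nozzle flow simulation

• Nozzle flow analysis

• Number of cells: 13.6M

• Model used: VOF

• After the transient solution stabilized, 10 iterations were computed and the average time per iteration was recorded

Computation time on the CADMEN server

Item                              1 core      30 cores
Number of cells                   13.6M       13.6M
Compute cores                     1           30
CPU clock                         2.6 GHz     2.6 GHz
Memory capacity                   256 GB      256 GB
Memory clock                      1600 MHz    1600 MHz
Time per time step (min)          192.1       4.6
Time for 500 time steps (days)    66.7        1.6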
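
The arithmetic behind the two timing rows can be sketched as follows. The per-step times are taken from the slide; `days_for_steps` is a hypothetical helper introduced only for this illustration.

```python
MINUTES_PER_DAY = 24 * 60


def days_for_steps(minutes_per_step, n_steps):
    """Wall-clock days needed for n_steps at a fixed cost per time step."""
    return minutes_per_step * n_steps / MINUTES_PER_DAY


# Minutes per time step reported on the slide for 1 core and 30 cores.
t_one_core = 192.1
t_thirty_cores = 4.6
cores = 30

speedup = t_one_core / t_thirty_cores   # ratio of per-step times
efficiency = speedup / cores            # speed-up divided by core count

print(f"500 steps on 1 core:   {days_for_steps(t_one_core, 500):.1f} days")
print(f"500 steps on 30 cores: {days_for_steps(t_thirty_cores, 500):.1f} days")
print(f"speed-up: {speedup:.1f}x, efficiency: {efficiency:.0%}")
```

The resulting ratio is larger than the core count. The slide does not say why, so it is best read as a reported measurement for this case rather than a general expectation.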


4 Main Products

HPC (per-process)

HPC Pack

• HPC product rewarding volume parallel processing for high-fidelity simulations

• Each simulation consumes one or more Packs

• Parallel enabled increases quickly with added Packs

HPC Workgroup

• HPC product rewarding volume parallel processing for increased simulation throughput, shared among engineers throughout a single location or the world

• 16 to 32768 parallel shared across any number of simulations on a single server

HPC Parametric Pack

• Enables simultaneous execution of multiple design points while consuming just one set of licenses

HPC Packs per Simulation    Parallel Enabled (Cores)
1                           8
2                           32
3                           128
4                           512
5                           2048
6                           8192
7                           32768
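
The pack-to-core values on this slide fit a simple pattern: 8 cores for the first pack and four times more for each added pack. The sketch below encodes that pattern. It is a fit to the chart values only, not an official licensing formula, the pairing of packs with cores is assumed from sorting the chart values, and `cores_enabled` and `packs_needed` are hypothetical helpers.

```python
# Chart values from the slide: HPC Packs per simulation -> parallel enabled (cores).
chart = {1: 8, 2: 32, 3: 128, 4: 512, 5: 2048, 6: 8192, 7: 32768}


def cores_enabled(packs):
    """Pattern fitted to the chart: 8 cores for one pack, 4x per additional pack."""
    if packs < 1:
        raise ValueError("packs must be >= 1")
    return 8 * 4 ** (packs - 1)


def packs_needed(cores):
    """Smallest number of packs whose enabled core count covers the request."""
    packs = 1
    while cores_enabled(packs) < cores:
        packs += 1
    return packs


for packs, cores in chart.items():
    print(f"{packs} pack(s): chart {cores:6d} cores, pattern {cores_enabled(packs):6d} cores")

print("packs for a 384-core run:", packs_needed(384))
```

Licensing terms change between releases, so the current product documentation should be checked before relying on this pattern.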


Deliver Outstanding Parallel Performance & Scaling at an Ever Increasing Scale of Parallelism

ANSYS Features & Capabilities

● A continuous software development focus on HPC enables parallel improvements release by release, including at R17.0.

● ANSYS solvers are highly optimized to run fast and deliver outstanding parallel scaling at an increasing scale of parallelism!

● We are committed to taking simulations to new levels of software scalability, and we maintain strong technology partnerships with hardware vendors and supercomputer centres (e.g. NCSA, HLRS).

Customer Benefits

● As HPC evolves, ANSYS is the right choice to sustain the software investment required to stay ahead, and to ensure that your ROI in HPC resources is maximized, now and into the future!

● Reduced time to solution of your current models by leveraging more cores.

● Be less constrained by hardware limitations because ‘bigger’ models can be sped up at your existing compute capacity.

Supercomputing Milestone


Improved Parallel Performance & Scaling

ANSYS Fluent 17.0

ANSYS Application Example

Big speed-ups for moving dynamic mesh due to:

• Neighborhood optimization

• Sliding interface optimization

• Parallel solver optimization

• Combustion code refactoring

In-Cylinder Combustion Model:

• 55% faster at 384 cores

• 7 cell zones, MDM, Spray, Partially premixed, 1.6 million cells


GPU-accelerated ANSYS products

Fluent®, Mechanical™, Nexxim™, HFSS™


GPU Acceleration of Water Jacket Analysis

ANSYS Fluent 15.0 performance on pressure-based coupled solver

Water jacket model

• Unsteady RANS model

• Fluid: water

• Internal flow

• CPU: Intel Xeon E5-2680; 8 cores

• GPU: 2 X Tesla K40

ANSYS Fluent time (sec) for 20 time steps; lower is better

                    CPU only    CPU + GPU    Speed-up
AMG solver time     4557        775          5.9x
Solution time       6391        2520         2.5x
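
To see where the GPU helps in the water jacket case, the reported times can be split into the AMG solver part and everything else. This is a minimal sketch using the slide's numbers; the assignment of each value to "CPU only" or "CPU + GPU" is assumed from the 5.9x and 2.5x labels, and the variable names are illustrative.

```python
# Seconds for 20 time steps, as reported on the slide (bar assignment assumed).
times = {
    "cpu_only": {"amg": 4557, "total": 6391},
    "cpu_gpu": {"amg": 775, "total": 2520},
}

# Speed-up of the AMG solver alone and of the whole solution.
amg_speedup = times["cpu_only"]["amg"] / times["cpu_gpu"]["amg"]
total_speedup = times["cpu_only"]["total"] / times["cpu_gpu"]["total"]

# Time spent outside the AMG solver in each configuration.
other = {name: t["total"] - t["amg"] for name, t in times.items()}

# Share of the CPU-only run that is spent in the AMG solver.
amg_fraction = times["cpu_only"]["amg"] / times["cpu_only"]["total"]

print(f"AMG speed-up:   {amg_speedup:.1f}x")
print(f"Total speed-up: {total_speedup:.1f}x")
print(f"Non-AMG time:   {other['cpu_only']} s (CPU only) vs {other['cpu_gpu']} s (CPU + GPU)")
print(f"AMG share of CPU-only time: {amg_fraction:.0%}")
```

The non-AMG time barely changes between the two runs, which is why the overall gain is smaller than the AMG solver gain.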


GPU Scaling on 111M Aerodynamic Problem

Truck Body Model

• 111M mixed cells

• External aerodynamics

• Steady, k-e turbulence

• Double-precision solver

• CPU: Intel Xeon E5-2667; 12 cores per node

• GPU: Tesla K40, 4 per node

Time per iteration (secs); lower is better

                              144 CPU cores (AMG)    144 CPU cores + 48 GPUs (AmgX)    Speed-up
AMG solver time               29                     11                                2.7 X
Fluent solution time          36                     18                                2 X

Note: AmgX is a GPU solver developed by NVIDIA and is implemented by ANSYS in Fluent for accelerating CFD

Better performance on problems with relatively high %AMG solver time (this case: 80% AMG solver time)
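
The remark about %AMG solver time can be illustrated with Amdahl's law: if only the AMG portion of each iteration is accelerated, the overall gain is limited by the share of time that portion takes. The sketch below applies the textbook formula to the truck-body timings reported above. `amdahl_speedup` is a hypothetical helper, and the 29/11 ratio computed from the rounded timings is about 2.6, slightly below the 2.7 X label on the slide.

```python
def amdahl_speedup(fraction, part_speedup):
    """Overall speed-up when `fraction` of the runtime is accelerated by `part_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / part_speedup)


# Seconds per iteration from the slide (144 CPU cores vs 144 CPU cores + 48 GPUs).
amg_cpu, amg_gpu = 29.0, 11.0
total_cpu, total_gpu = 36.0, 18.0

fraction = amg_cpu / total_cpu      # share of time in the AMG solver (about 80%)
part_speedup = amg_cpu / amg_gpu    # AMG solver speed-up from the rounded timings

predicted = amdahl_speedup(fraction, part_speedup)
measured = total_cpu / total_gpu

# Upper bound if the AMG solver became almost free.
limit = amdahl_speedup(fraction, 1e12)

print(f"AMG share: {fraction:.0%}, AMG speed-up: {part_speedup:.2f}x")
print(f"predicted overall: {predicted:.2f}x, measured overall: {measured:.2f}x")
print(f"ceiling for this AMG share: {limit:.2f}x")
```

With these inputs the prediction matches the measured 2x, and the ceiling shows why cases with a higher AMG share have more to gain from the GPU.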


GPU Performance: Problem Size

Truck Body Model

• External aerodynamics

• Steady, k-e turbulence

• Double-precision solver

• CPU: Intel Xeon E5-2667; 12 cores per node

• GPU: Tesla K40, 4 per node

ANSYS Fluent time (sec) per iteration; lower is better

Problem size         CPU only               CPU + GPU                         Speed-up
14 million cells     13 (36 CPU cores)      9.5 (36 CPU cores + 12 GPUs)      1.4 X
111 million cells    36 (144 CPU cores)     18 (144 CPU cores + 48 GPUs)      2 X

Better speed-ups on larger and harder-to-solve problems


Parametric Solutions, Inc. Proprietary

ANSYS FLUENT Accelerated with NVIDIA GPU

ANSYS 15.0

1.8M cells

8 Cores (Xeon E5-2687)

64 GB RAM

NVIDIA K40 GPU

Solution Time:

• GPU: ~1 hr

• No GPU: ~1.75 hrs

GPU Produces a 43% Reduction in Solution Time


Particle-PLUS HPC Performance

• Parallel performance of a PIC computation with Particle-PLUS: execution time and speed-up for 1, 2, 4, 8, 16, and 24 processes, where "speed-up" means computation speed relative to a single-process run.

• The graph is plotted using the following data:

number_of_processor    execution_time[hour]    speed_up
1                      76.1500                 1.00
2                      39.4872                 1.93
4                      21.6081                 3.52
8                      11.7622                 6.47
16                     6.40496                 11.89
24                     6.03144                 12.63

• Execution times were measured under the following conditions:

- number of super particles (effective): about 5E7

- electron density: about 5E16

- sampling time steps: 5000

- number of grids: 7200

- no output of results

- execution on a Unix PC cluster

Note that parallel performance depends greatly on the number of super particles that are effectively used. If that number is small, parallel performance degrades. For example, with about 5E5 super particles, the speed-up with 24 parallel processes is only 3.1 times the single-process speed.
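
The speed_up column follows directly from the execution times, and dividing it by the process count gives the parallel efficiency. A minimal sketch, with `speedups` as a hypothetical helper name:

```python
# (number of processes, execution time in hours) from the Particle-PLUS data.
runs = [
    (1, 76.1500),
    (2, 39.4872),
    (4, 21.6081),
    (8, 11.7622),
    (16, 6.40496),
    (24, 6.03144),
]


def speedups(measurements):
    """Return (processes, speed-up, efficiency) relative to the first measurement."""
    base_time = measurements[0][1]
    rows = []
    for processes, hours in measurements:
        speedup = base_time / hours
        rows.append((processes, speedup, speedup / processes))
    return rows


result = speedups(runs)
for processes, speedup, efficiency in result:
    print(f"{processes:2d} processes: speed-up {speedup:5.2f}x, efficiency {efficiency:5.1%}")
```

Efficiency stays above 70% up to 16 processes and drops to about 53% at 24, so most of the benefit in this data set is reached by 16 processes.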


Fluids

Thermal

Emag CAD Import

Structural

Post-process

Meshing Workflow

Design Points

Thank you for your attention!

EnSight