37
Urychlení návrhu a optimalizace teplotních senzorů s využitím HPC technologií Tomáš Kozubek IT4Innovatins, VŠB-TUO Konference e-infrastruktury CESNET, 29-30. ledna 2019

Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Urychlení návrhu a optimalizace teplotních senzorů s využitím HPC

technologiíTomáš Kozubek

IT4Innovatins, VŠB-TUO

Konference e-infrastruktury CESNET, 29-30. ledna 2019

Page 2: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Contract research 2018

New R&D centre in Ostrava

https://www.it4i.cz/infoservis/publikace

Page 3: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

• Electric fields• Electromagnetism• Heat transfer

• heat generated by magnetism• cooling system

• Structural Mechanics• structural integrity• vibration from motion• high speed motors• influenced by electromagnetism

• Active cooling system• fluid flow

• Acoustic• generated by fluid flow• generated by electromagnetism• generated by vibrations

Complex nonlinear multiphysical and multiscale problem – electric motor

Motivation for Exascale - Digital Twin Technology

Page 4: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Full transient CFD simulation of the active cooling systemSimulation results• Complex velocity and pressure fields

• Ventilation looses• Cooling efficiency

One simulation• 35 hours / 1200 Cores on Salomon• “Standard” powerful workstation • 256GB RAM 30 Cores 5 ~2 months!

What we need for optimization?• Geometry morphing• different number of ribs• different fan shape• boundary conditions variation• …

Lead’s to 1000’s simulations One cluster isn't enough

Multiphysics simulations

Page 5: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

ULTRASONIC OIL SENSOR

• Creating complex model, computational mesh, boundary condition definition, material models – pre-processing

• Solution by numerical methods• Results analysis – post-processing

1. Open source: OpenFOAM, Code Saturne, SU2, Elmer, Dune, …https://www.cfd-online.com/Wiki/Codes#Free_codes

2. Commercial codes: ANSYS CFD, FLUENT, CFX, StarCCM+, COMSOL, …https://www.cfd-online.com/Wiki/Codes#Commercial_codes

3. Or try to write it in your own wayMatlab, R, Octave, Python, C, C++, Fortran, …

Software for engineering simulations

Page 6: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Scalable algorithms development

#9

Page 7: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

ESPRESO parallel solver

Mesh Processing

Matrix AssemblerFEM/BEM (BEM4I)

TFETI/Hybrid TFETI Solvers

ANSYS / OpenFOAM

ESPRESO Generator

ESPRESO API

Visualization

ParaviewCatalyst

CPU

Preprocessing PostprocessingESPRESO C++ library

MIC GPU

EnSight / Visit

MPI, cilk++(or openMP), MKL, METIS, PARDISO(in MKL) or PARDISO SC for MIC/GPU versionOptional: SCOTCH, MUMPS, CuSolver, VTK, HYPRE interface

Page 8: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Load steps definition for combination of multiple steady-state and time dependent analyses

Transient solvers• Generalized trapezoidal rule • Automatic time stepping based

on response frequency approach

Nonlinear solvers • Newton Raphson – full and symmetric • Newton Raphson with constant tangent matrices• Line search damping• Sub-steps definition• Adaptive precision control for iterative solvers

Linear and quadratic finite element discretization

Gluing nonmatching grids by mortar discretization techniques

Full-fledged material models• nonlinear materials• isotropic, orthotropic and anisotropic material models• materials for phase change

Element coordinate system definition – cartesian, polar and spherical

Temperature and time dependent boundary conditions• linear convection• nonlinear convection • heat flow• heat flux• diffuse radiation• heat source• translation motion

Consistent SUPG and CAU stabilization for Translation Motion (advection), Inconsistent stabilization

Phase Change based on apparent heat capacity method

Boundary element discretization for selected physical applications

Highly parallel multilevel FETI domain decomposition based solver for billions of unknowns for symmetric and non-symmetric systems with accelerators support and combination of MPI and OpenMP techniques

Asynchronous parallel I/O

Input mesh format from popular open source and commercial packages like OpenFOAM, ELMER or ANSYS

Output to commonly used post-processing formats, VTK and EnSight

Monitoring results on selected regions for statistic and optimization toolchain

Simple text Espreso Configuration File (ecf) for setting all ESPRESO FEM solver parameters without GUI. Control each parameter in ecf file from command line

Heat Transfer Module Capability List:

ESPRESO FEM Highly parallel FEM package for engineering simulations

Page 9: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Response time optimization of the USL sensor - nonlinear transient simulation

170

340

680

1360

96 192 384 768 1536

TOTA

L SI

MU

LATI

ON

TIM

E [S

EC]

NUMBER OF CORES

FETI for transient problem without projectorFETI for transient problem with optimized Conjugate projector

Heat transfer

Page 10: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Elasticity BVP Terms on contact boundary

Unilateral contact (non-penetration)

Friction (Tresca or Coulomb)

Structural mechanics

Page 11: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Parallelization by DDM/FETI

Page 12: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Fproblem

FETI

FETI-DP

TFETI

1.

2.

3.

subdomains are fixed or free but with different defects

FETI-DP (partial splitting, nonsingular)

all subdomains are free with the same defect

F

F

F

F

F

F

FETI approaches

Page 13: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Fproblem

BETI approach

F

FTFETI/TBETI

Page 14: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

1 2 3 4

Hybrid Total FETI Method

1 2 3 4cluster 1 cluster 2

FEM discretization

FETI method

Hybrid FETI method

Page 15: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Hybrid Total FETI Method

Page 16: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

1 2 3 4

In Hybrid Total FETI method the dimension of GGT is smaller (dim=6) compared to Total FETI (dim=12) due to additional constraints which remove local rigid body modes.

1 2 3 4

Hybrid Total FETI Method

Page 17: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

augmented KKT system by the matrix 𝐁𝐁c and λc

reordering according to clusters

additional constraints:duplication of ‘corners’ Lagrange multiplicators

Implementation

Hybrid Total FETI Method

Page 18: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

additional constraints (red arrows)

The matrix 𝐁𝐁c is a copy of specific rows from the matrix 𝐁𝐁 corresponding to components of 𝛌𝛌 acting on the corners between subdomains 1,2, and 3,4, respectively (red arrows).

1 2 3 4cluster 1 cluster 2

Implementation

Hybrid Total FETI Method

Page 19: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

relabeling

dualization

reordering according to clustersModified KKT system is solvable by the same family of algorithms like FETI (Preconditioned Conjugate Projected Gradient method, …).

Size of �G�GT (or defect of global stiffness matrix) can be significantlysmaller than in original FETI approach ...

Hybrid Total FETI Method

Page 20: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Pipelining

Page 21: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Bi-Conjugate Gradients Stabilized (BiCGStab)

Page 22: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Pipelined BiCGStab (p-BiCGStab)

Page 23: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Pipelined BiCGStab vs. Improved BiCGStab

Page 24: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Robustness and attainable accuracy: p-BiCGStab-rr

Page 25: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

0.02

0.015

0.01

0.005

00 200 400 600 800 1000 1200 1400 1600 1800

sing

le it

erat

ion

time

[s]

Number of subdomains

MPI parallelization

Anselm with Intel MPI

Optimization of global communication for FETI Solvers

ESPRESO FETI

FETI – CG FETI – pipelinedCG

FETI – pipelinedCG with GGTINV

Page 26: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

250

200

150

100

50

0#1 #8 #27 #64 #125

# - number of nodes used for solution

Prep

roce

ssin

g ru

ntim

e [s

]

FETI processing

Hybrid FETI processing+ FETI processing

Tested on Salomon Supercomputer

Hybrid FETI vs. FETI preprocessing

ESPRESO FETI

Page 27: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Accelerators

Page 28: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •
Page 29: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

ESPRESO CG in FETIPre-processing – K factorization

1.) 𝑥𝑥 = 𝐵𝐵1𝑇𝑇 � 𝜆𝜆 - SpMV2.) 𝑦𝑦 = 𝐾𝐾−1 � 𝑥𝑥 - solve 3.) 𝜆𝜆 = 𝐵𝐵1 � 𝑦𝑦 - SpMV4.) stencil data exchange in 𝜆𝜆

- MPI – Send and Recv- OpenMP – shared mem. Vec

Pre-processing - 𝑆𝑆𝑐𝑐 = 𝐵𝐵1𝐾𝐾−1𝐵𝐵1𝑇𝑇1.) - nop2.) 𝜆𝜆 = 𝑆𝑆𝑐𝑐 � 𝜆𝜆 - DGEMV, DSYMV3.) - nop4.) stencil data exchange in 𝜆𝜆

- MPI – Send and Recv- OpenMP – shared mem. Vec

Pre-processing - 𝑆𝑆𝑐𝑐 = 𝐵𝐵1𝐾𝐾−1𝐵𝐵1𝑇𝑇 → GPU/MIC1.) 𝜆𝜆 → GPU/MIC - PCIe transfer from CPU2.) 𝜆𝜆 = 𝑆𝑆𝑐𝑐 � 𝜆𝜆 - DGEMV, DSYMV on

GPU/MIC3.) 𝜆𝜆 ← GPU/MIC - PCIe transfer to CPU4.) stencil data exchange in 𝜆𝜆

- MPI – Send and Recv- OpenMP – shared mem. vec

90 – 95% of runtime spent in Api

Projected Conjugate Gradient in FETI

Page 30: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

0.3 - 300 million DOF Hybrid FETI CG Solver Runtime Linear elasticity

GPU acceleration of the ESPRESO Solver

ORNL Titan 5rd in TOP500 LIST

02468

10121416

0 200 400 600 800 1000

Solv

erru

ntim

e [s

]

Number of compute nodes / GPUs

CPU GPU - general storage format

speedUp3.4

•Current work: • support for cuDense and

cuSparse solvers from CUDA toolkit

• memory optimization for symmetric local schurcomplements

• implementation of memory efficient methods

•Future work: • Support for Power8/9

platforms with Pascal/Volta GPUs with NVLink

one K20x vs. one AMD Opteron 6274 16-core CPU

Page 31: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Parallel pre-processing

Page 32: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

ESPRESO FEM – pre-processing

Page 33: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Efficient parallel loading of seq. stored unstructured mesh databasesDirect connection with commercial pre-processing toolsHighly parallel I/O of external sequentially stored datawith domain decomposition techniques

Jet Enginefile size 142 GBNodes 822 MElements 484 Mseq. processing 7200spar. Processing 13.69 s

Page 34: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

GUI and Solver as a Service

Page 35: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

POLIMI, IMEC, ICCS,UCY, IT4I, THALES, HENESYS

Floods Traffic

Pollution Mobility

Online Solver as a Service (HPC as a Service )

Page 36: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

HPC as a Service (HEAppE) Overview

• Providing HPC capabilities as a service to client applications and their users• Unified interface for different operating systems and schedulers• Authentication and authorization to provided functions• Monitoring and reporting of executed jobs and their progress• Current information about the state of the cluster

Page 37: Urychlení návrhu a optimalizace teplotních senzorů s ... · Temperature and time dependent boundary conditions • linear convection • nonlinear convection • heat flow •

Thank you for attention