Evaluation of Container Virtualized MEGADOCK System
in Distributed Computing Environment
March 23rd, 2017, SIG BIO 49 @ Japan Advanced Institute of Science and Technology
Kento Aoyama1,2, Yuki Yamamoto1,2, Masahito Ohue1,3, Yutaka Akiyama1,2,3
1) Department of Computer Science, School of Computing, Tokyo Institute of Technology
2) Education Academy of Computational Life Sciences (ACLS), Tokyo Institute of Technology
3) Advanced Computational Drug Discovery Unit, Institute of Innovative Research, Tokyo Institute of Technology
“Docker” 2
https://www.docker.com/what-container
[Chart: number of containers pulled from Docker Hub]
Docker and Bioinformatics 3
P. Di Tommaso, A. B. Ramirez, E. Palumbo, C. Notredame, and D. Gruber,
“Benchmark Report: Univa Grid Engine, Nextflow, and Docker for running
Genomic Analysis Workflows.”
Docker Integration Benchmark Report
@Centre for Genomic Regulation
(Barcelona, Spain)
• Univa Grid Engine (Job Scheduler)
• Nextflow (Workflow manager)
• Docker (Linux Container)
• Reproducibility
• Portability
To develop a Container-Native HPC Bioinformatics Application
using Linux Containers, which has:
• Low dependency on the environment
• High performance
  • Parallel execution performance
  • Low virtualization overhead
• Dynamic scaling
Research Purpose 4
• To evaluate the performance of Docker container virtualization in a bioinformatics application
Target Application
• MEGADOCK[1]
• FFT-grid-based Protein-Protein Docking software
• Multi-threading, Multi-node, Multi-GPU (OpenMP, MPI, GPU)
• Extremely compute intensive workloads
Today’s Report 5
[1] Masahito Ohue, et al. “MEGADOCK 4.0: an ultra-high-performance protein-protein docking
software for heterogeneous supercomputers”, Bioinformatics, 30(22): 3281-3283, 2014.
Background 6
• Linux Container
• Docker
• Container & Bioinformatics
Kernel-Shared Virtualization
• Lightweight : small size, fast deployment, easy sharing
• Performance : little virtualization overhead, faster than VMs
Linux Container 7
[Figure: Virtual Machines vs. Containers — each virtual machine runs an App with its Bins/Libs on a Guest OS above a Hypervisor on the hardware; each container runs an App with its Bins/Libs directly on the shared Linux kernel on the hardware]
Linux Container
• virtualizes host resources as containers
  • Filesystem, hostname, IPC, PID, Network, User, etc.
• can be used like virtual machines
Linux Kernel Features
• Containers share the same host kernel
• namespaces[1], chroot, cgroups, SELinux, etc.
Container-based Virtualization 8
[1] E. W. Biederman. “Multiple instances of the global Linux namespaces.”,
In Proceedings of the 2006 Ottawa Linux Symposium, 2006.
[Figure: a machine's Linux kernel space hosting multiple containers, each grouping its own set of processes]
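The kernel features named above can be tried directly on a Linux host. As a minimal sketch (not from the slides; it assumes a kernel with unprivileged user namespaces enabled, or root access), `unshare(1)` creates a new UTS namespace so a hostname change stays invisible to the host:

```shell
# Create new user + UTS namespaces and change the hostname inside them.
# The change is confined to the namespace; the host hostname is untouched.
unshare -r -u /bin/sh -c '
  hostname container-demo
  hostname    # prints the namespaced hostname
'
hostname      # still prints the original host hostname
```

The same mechanism, combined with PID/mount namespaces, cgroups, and a chroot-like filesystem, is what Docker automates.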
Linux Container – Performance [1] 9
[1] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual
machines and Linux containers,” IEEE International Symposium on Performance Analysis of Systems and
Software, pp.171-172, 2015. (IBM Research Report, RC25482 (AUS1407-001), 2014.)
[Chart: performance ratio relative to native for PXZ [MB/s], Linpack [GFLOPS], and Random Access [GUPS], comparing Native, Docker, KVM, and KVM-tuned; Docker stays close to native (ratios of roughly 0.96-1.00) while KVM drops as low as about 0.78 on some workloads]
Docker [1]
• Most popular Linux Container management platform
• Many useful components and services
Linux Container Management Tools 10
[1] Solomon Hykes and others. “What is Docker?” - https://www.docker.com/what-docker
[2] W. Bhimji, S. Canon, D. Jacobsen, L. Gerhardt, M. Mustafa, and J. Porter, “Shifter : Containers for
HPC,” Cray User Group, pp. 1–12, 2016.
[3] “Singularity” - http://singularity.lbl.gov/
Easy container sharing – Docker Hub 11
Portability & Reproducibility
• Easy to share the application environment via Docker Hub
• Containers can be executed on other host machines
[Figure: a Dockerfile (apt-get install …, wget …, make) generates an image (App + Bins/Libs) on an Ubuntu host running Docker Engine; the image is pushed to Docker Hub, pulled on a CentOS host, and runs there as an identical container]
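The Dockerfile sketched in the figure could look like the following. This is an illustrative example only: the package list and the source URL are assumptions, not taken from the slides.

```dockerfile
# Illustrative Dockerfile sketch; packages and the source URL are placeholders.
FROM ubuntu:14.04

# Build dependencies for an FFTW-based MPI application
RUN apt-get update && apt-get install -y \
    build-essential wget libfftw3-dev openmpi-bin libopenmpi-dev

# Fetch and build the application source (URL is a placeholder)
WORKDIR /opt
RUN wget http://example.com/app.tar.gz && tar xzf app.tar.gz
WORKDIR /opt/app
RUN make
```

Once built with `docker build -t user/app .`, the image can be shared with `docker push user/app` and reproduced anywhere with `docker pull user/app`.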
AUFS (Advanced multi-layered unification filesystem) [1]
• AUFS is Docker's default storage backend
• Layers can be reused by other container images
• AUFS helps software reproducibility
Docker - Filesystem 12
[1] Advanced multi layered unification filesystem. http://aufs.sourceforge.net, 2014.
[Figure: a layered Docker container image — base image ubuntu:16.04 (f49eec89601e, 129.5 MB) with additional layers 366a03547595 (39.85 MB), ef122501292c (133.6 MB), e50c89716342 (660.4 KB), 5aec9aa5462c (24.17 MB), and 0d3cccd04bdb (6.07 MB); tags beta, version-1.0, version-1.0.2, version-1.2, and latest point to different layer stacks that share the common lower layers]
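A layer structure like the one above can be inspected on any Docker host. A brief usage sketch (the image name is only an example):

```shell
# Show the layers (and their sizes) that make up an image
docker history ubuntu:16.04

# Shared layers are downloaded only once; pulling a second image
# built on the same base reuses them:
docker pull ubuntu:16.04
docker images    # lists local images with their tags and sizes
```

Because tags reference layer stacks rather than full copies, several versions of an application can coexist at little extra storage cost.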
Why in the field of Bioinformatics?
• Types of applications
  • Data analysis, machine learning
  • MD simulation, docking calculation, etc.
• Data-centric workloads
  • Compute : large
  • Data I/O : case by case
  • Communication : small
• Containers perform well on compute-intensive workloads[1]
For Bioinformatics Apps : 1 13
[1] W. Felter, et al. “An updated performance comparison of virtual
machines and Linux containers,” IEEE International Symposium on
Performance Analysis of Systems and Software, pp.171-172, 2015.
Reproducibility
• A different version of a library can produce a different result
  • e.g., genomic analysis pipelines [Di Tommaso, 2016]
For Bioinformatics Apps : 2 14
[Figure: dependency conflict vs. isolation — Application A (requires Library A version >= 1.2) and Application B (requires version < 1.1) conflict on a single host; running Application A against Library version 1.3 yields Result A', while version 1.2 yields Result A (a different result); packaging Application A and Application B in separate containers isolates their dependencies and keeps each application's result reproducible]
Dependency conflict
• Different applications can require different versions of the same library
Performance
• Little performance overhead
Reproducibility
• Dependency isolation from other applications/libraries
Portability, Generality
• Sharing/porting to other environments
Features for Bioinformatics Apps 15
Features                    Native   VM      Container
Performance / Scalability   Great    Bad     Good
Reproducibility             Bad      Good    Great
Portability / Generality    Bad      Great   Great
Proposed Method 16
MEGADOCK 17
Masahito Ohue, et al. “MEGADOCK 4.0: an ultra-high-performance protein-protein docking software for heterogeneous supercomputers”, Bioinformatics, 30(22): 3281-3283, 2014.
High-performance protein-protein interaction predictions
• FFT-grid based docking software
• Extremely compute-intensive
• OpenMP/MPI/GPU support
• Great HPC Performance
Container-based Application Distribution 18
• All application dependencies exist in the container
  • Easy-to-test application
  • Easy-to-scale size of resources
[Figure: container-based application distribution — MEGADOCK containers form an application layer that can be added/removed on top of a compute-resource layer that can itself grow or shrink; the same containers run in both the test environment and the production environment]
Experiments 19
Experiment I: Evaluate container-virtualization overhead on a physical machine
• Physical machine (single-node) + Docker
• Physical machine (single-node, GPU) + NVIDIA Docker
Experiment II: Evaluate container-virtualization overhead in a cloud environment
• Virtual machines (multi-node) + Docker
• Virtual machines (multi-node, GPU) + NVIDIA Docker
Experiments 20
Measurement
• megadock-gpu execution time
• time command (6 runs, median)
Dataset
• 100 PDB pairs (KEGG pathway)
Options (OpenMP, OpenMPI)
• MPI : 12 threads / 4 MPI processes / 1 node
• GPU : 1 GPU / 1 process / 1 node
Overview of Experiment I 21
[Figure: four test configurations — (a) MPI processes running directly on the physical machine, (b) MPI processes inside Docker on the physical machine, (c) MEGADOCK with GPU directly on the physical machine, (d) MEGADOCK with GPU inside NVIDIA Docker]

Test Case    Native   Docker
CPU (MPI)    (a)      (b)
GPU          (c)      (d)
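The MPI/OpenMP options above would correspond to an invocation along these lines. This is an illustrative sketch only: the binary name, arguments, and image name are assumptions, not the exact commands from the experiment.

```shell
# 4 MPI processes x 12 OpenMP threads on one node
# (binary name and arguments are placeholders)
export OMP_NUM_THREADS=12
mpirun -n 4 megadock-gpu -tb pairlist.txt

# The same run wrapped in a container might look like:
docker run --rm megadock-image \
    sh -c 'export OMP_NUM_THREADS=12 && mpirun -n 4 megadock-gpu -tb pairlist.txt'
```

Running `mpirun` inside a single container keeps the whole MPI environment (OpenMPI version, FFTW, etc.) fixed by the image rather than by the host.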
Hardware/Software Specification 22
Software Env. Physical Machine Docker NVIDIA Docker (GPU)
OS (image) CentOS 7.2.1511 ubuntu:14.04 nvidia/cuda:8.0-devel
Linux Kernel 3.10.0 3.10.0 3.10.0
GCC 4.8.5 4.8.4 4.8.4
FFTW 3.3.5 3.3.5 3.3.5
OpenMPI 1.10.0 1.6.5 N/A
Docker Engine 1.12.3 N/A N/A
NVCC 8.0.44 N/A 8.0.44
NVIDIA Docker 1.0.0 rc.3 N/A N/A
NVIDIA Driver 367.48 N/A 367.48
CPU Intel Xeon E5-1630, 3.7 [GHz] ×8 [core]
Memory 32 [GB]
Local SSD 128 [GB]
GPU NVIDIA Tesla K40
Execution time 23
[Chart: execution time in seconds —
             Native     Docker
CPU (MPI)    7353.80    7850.57   (+6.32 % slower with Docker)
GPU          1646.09    1638.05]
Profile Result (CPU time) 24
Process native [sec] docker [sec] diff Ratio (all)
FFT3D 7.40E+04 7.63E+04 +3.01% 76.84%
MPIDP-Master 8010.98 8325.9 +3.78% 8.38%
Create Voxel 3743.7 3993.29 +6.25% 4.02%
FFT Convolution 3551.08 3576.43 +0.71% 3.60%
Score Sort 2462.61 2459.7 -0.12% 2.48%
Output Detail 2139.94 2225.96 +3.86% 2.24%
Ligand Preparation 1035.51 1849.11 +44.00% 1.86%
MPI_Barrier 236.95 231.05 -2.55% 0.23%
MPI_Init 0.94 4.54 79.30% 0.00%
… … … … …
(a) MEGADOCK-Azure[2]
Measurement
• megadock-dp execution time
• time command (3 runs, median)
Dataset
• ZDOCK benchmark 1.0 [1] (59 × 59 = 3481 pairs)
Options (OpenMP, OpenMPI)
• MPI : 12 threads / 4 MPI processes / 1 node
• All file input/output on local SSD
Overview of Experiment II-(a) 25
[Figure: MPI processes distributed across multiple virtual machines — one VM runs the master process, and each VM runs 4 worker MPI processes]
[1] R. Chen, et al. “A protein-protein docking benchmark,” Proteins: Structure,
Function and Genetics, vol. 52, no. 1, pp. 88-91, 2003.
[2] Masahito Ohue, et al. ”MEGADOCK-Azure: High-performance protein-protein interaction prediction system on Microsoft Azure HPC”, IIBMP2016.
(b) MEGADOCK + Docker on Microsoft Azure
Measurement
• megadock-dp execution time
• time command (3 runs, median)
Dataset
• ZDOCK benchmark 1.0 (59 × 59 = 3481 pairs)
Options (OpenMP, OpenMPI)
• MPI : 12 threads / 4 MPI processes / 1 node
• All file input/output on local SSD
Docker Swarm
• All containers in one overlay network
Overview of Experiment II-(b) 26
[Figure: MPI processes inside Docker containers on each virtual machine, with all containers joined to a single Docker Swarm overlay network; one container runs the master process, and each of the others runs 4 worker MPI processes]
[1] R. Chen, J. Mintseris, J. Janin, and Z. Weng, “A protein-protein docking benchmark,”Proteins: Structure, Function and Genetics, vol. 52, no. 1, pp. 88-91, 2003.
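The single overlay network could be set up roughly as follows. This is a usage sketch with assumed names (the network and image names are placeholders), not the exact commands used in the experiment, and Docker Swarm of that era required a key-value store or swarm mode for multi-host overlay networking.

```shell
# On the Swarm manager: create one overlay network spanning all nodes
docker network create --driver overlay megadock-net

# On each node: launch a MEGADOCK container attached to that network,
# so MPI processes in different containers can reach each other by name
docker run -d --net=megadock-net --name worker1 megadock-image
```

With all containers on one overlay network, OpenMPI can address remote ranks by container hostname just as it would address physical nodes.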
VM Instance/Software Specification 27
Software Env. Virtual Machine Docker
OS (image) SUSE Linux Enterprise Server 12 ubuntu:14.04
Linux Kernel 3.12.43 3.12.43
GCC 4.8.3 4.8.4
FFTW 3.3.4 3.3.5
OpenMPI 1.10.2 1.6.5
Docker Engine 1.12.6 N/A
VM Instance Standard_D14_v2
CPU Intel Xeon E5-2673, 2.40 [GHz] × 16 [core]
Memory 112 [GB]
Local SSD 800 [GB]
Execution time 28
[Chart: execution time in seconds vs. number of VMs —
# of VMs        1        5       10      20     30
VM              145,534  25,515  13,132  6,006  4,098
Docker on VM    117,219  25,145  12,331  6,344  3,971
(the VM=1 values may reflect a measurement mistake)]
Scalability (Strong Scaling, based VM=1) 29
[Chart: strong-scaling speed-up relative to VM=1 vs. number of worker cores (up to ~500), for VM = 1, 5, 10, 20, 30; the VM and Docker-on-VM curves track each other below the ideal line, showing comparable scalability]
Experiment I
• MEGADOCK + Docker on the physical machine showed 6.32% lower performance.
  • Docker can cause a 0-4% reduction in compute performance[1]
  • Communication passes through Docker NAT (Network Address Translation)
• MEGADOCK (GPU) + NVIDIA Docker on the physical machine showed performance comparable to native.
  • GPU computation is independent of container virtualization
  • Container virtualization has little overhead on memory bandwidth
Experiment II
• MEGADOCK + Docker on Microsoft Azure showed comparable scalability.
  • The container-virtualization overhead is smaller than other cloud-environment factors
Result & Discussion 30
[1] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual
machines and Linux containers”, IEEE International Symposium on Performance Analysis of Systems
and Software, pp.171-172, 2015. (IBM Research Report, RC25482 (AUS1407-001), 2014.)
• The performance overhead of Docker container virtualization is small.
  • suitable for GPU-accelerated applications and cloud environments
• Container virtualization can isolate the application environment from the host environment.
  • the same container image can be used on various machines
    • physical machines in a local environment
    • virtual machines in a cloud environment
• Docker is useful for computational research work
Conclusion 31
Multi-Node & Multi-GPU Evaluation on Cloud
• NVIDIA Docker is not available in Docker Swarm mode
• Kubernetes[1] officially supports 1 GPU / 1 node
  • (experimental feature: multi-GPU support)
Container-based Task Distribution
• Web-service-like container-based distribution
  • easy to scale computing resources
  • easy to extend to multiple tasks (e.g. GHOST-MP, MEGADOCK)
Future Work 32
[1] B. Burns, B. Grant, D. Oppenheimer, E. Brewer, and J. Wilkes, “Borg, Omega, and Kubernetes,” ACM Queue, vol. 14, no. 1, p. 24, 2016.