CASE STUDIES – MEGADOCK, a bioinformatics application
High-performance protein-protein docking with MEGADOCK 4.0 on Azure Cloud
Powerful calculation of a huge number of protein-protein interactions using MEGADOCK 4.0 on Azure HPC.
The elucidation of protein-protein interaction (PPI) networks is important for understanding cellular
structure/function and accelerating structure-based drug design. However, developing an effective
computational method for exhaustive PPI screening has long been a grand challenge.
The Akiyama laboratory at Tokyo Institute of Technology (Tokyo Tech), Japan, developed MEGADOCK 4.0,
software for exhaustive structure-based protein-protein interaction prediction. It performs
tertiary-structure docking based on shape complementarity and physicochemical properties in a
massively parallel fashion. MEGADOCK 4.0 runs in heterogeneous CPU/GPU computing environments and
shows powerful, scalable performance, with >97% strong scaling on up to 600,000 CPU cores on
world-leading supercomputers. MEGADOCK also performs well in cloud environments such as Microsoft
Azure HPC.
The Tokyo Tech group showed that MEGADOCK 4.0 ran on 50 computing nodes (DS14 instances) on
Microsoft Azure, totaling 600 cores and 5.5 TByte of RAM, with excellent strong scaling of >90%.
Azure HPC provides a stable and secure computing resource environment for biologists and
pharmaceutical scientists working on PPIs with MEGADOCK.
CASE STUDIES – GHOST-MP, a bioinformatics application
GHOST-MP on Azure Cloud accelerates whole genome shotgun metagenomic analysis
Microbial flora in metagenomic samples can be analyzed using GHOST-MP on Azure HPC.
[Figure: GHOST-MP on Azure HPC, performance graph (strong scaling). Database: NCBI nr database
(20 GB). Query: metagenomic sample from human buccal mucosa. X-axis: No. of worker cores (on DS14
Azure VMs). Y-axis: Speedup (read/sec). Series: #VM=10, #VM=20, #VM=30. Annotation: 2.28x faster
than #VM=10 (strong scaling = 0.761).]
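The strong-scaling value annotated on the graph (2.28x faster than #VM=10, strong scaling = 0.761) can be checked with a short calculation. The sketch below is illustrative only. It assumes that strong-scaling efficiency means the measured speedup divided by the factor by which the resources grew, and that the 2.28x annotation refers to the #VM=30 run. The function and argument names are hypothetical and are not part of GHOST-MP.

```python
def strong_scaling_efficiency(speedup: float, base_units: int, scaled_units: int) -> float:
    """Return speedup divided by the resource growth factor.

    Assumed definition: a value of 1.0 means the speedup grew exactly in
    proportion to the added resources for a fixed problem size.
    """
    if base_units <= 0 or scaled_units <= 0:
        raise ValueError("unit counts must be positive")
    return speedup / (scaled_units / base_units)


# Values from the graph annotation; scaled_units=30 is an assumption.
efficiency = strong_scaling_efficiency(speedup=2.28, base_units=10, scaled_units=30)
print(f"{efficiency:.3f}")  # prints 0.760
```

The result of 0.760 agrees with the annotated 0.761 to within rounding, presumably because the graph's value was computed from an unrounded speedup.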
Metagenomics is the study of the genomes of uncultured microbes obtained directly from microbial
flora in their natural habitats. Such analyses have recently become more popular and important as the
throughput of DNA sequencers has increased. In particular, whole-genome shotgun (WGS)
sequencing, carried out using next-generation sequencing (NGS) technologies, produces huge
amounts of metagenomic data, which enable us to uncover the abundance of orthologous groups, i.e.,
the distribution of gene/protein functions, in environmental samples.
GHOST-MP is a massively parallel sequence homology search tool developed by the Akiyama
laboratory at Tokyo Institute of Technology (Tokyo Tech), Japan, for functional annotation of
metagenome sequences. Although BLAST is the gold-standard homology search tool, GHOST-MP
is more than 160 times faster than BLAST on a single CPU core and has sufficient search sensitivity
for metagenome analysis. In addition, GHOST-MP performs well in parallel computing
environments such as Microsoft Azure HPC. The Tokyo Tech group ran GHOST-MP on
30 computing nodes (DS14 instances) in Microsoft Azure, totaling 480 cores and 3.3 TByte of RAM.
Azure HPC enables efficient metagenomic analysis of microbial samples and helps unravel the
unknown functions of microbial flora.