Fast Genomic Sequence Searches with Symmetric Implementation of Parallel Blast
Bhanu Rekepalli (BioTeam) Eduardo Ponce and Greg Peterson (UTK)
BICoB 2015, March 10, 2015
“The Freedom To Discover”
BioTeam
• Independent consulting firm
• Staffed by scientists forced to learn IT, SW & HPC to get our own research done
• Assess, Design, Implement & Train
• Bridging the “gap” between science, IT & high performance computing since 2002
• Skilled Bio-IT Evolutionary Anthropologists (> 400 studied in the last year)
Outline
• The genomics data problem
• Highly-scalable parallel wrapper
• Parallel BLAST on Xeon Phi
• Optimizations
• Performance evaluation
• Conclusion
• Future work
The genomic data problem
• Advances in next-generation sequencing techniques are producing complete genomes at faster rates than data analysis can process
• Data is managed by community-centered databases (updated routinely)
  • e.g., GenBank, EMBL, NR, PDB
• Challenge: Bioinformatics research requires high-throughput processing and analytic tools to keep pace with the exponential growth in genomic data. In addition, HPC is difficult to utilize.
• Solution: modify algorithms and frameworks to allow scalable analytics in modern architectures
Intel Xeon Phi
• A Many Integrated Core (MIC) architecture for massive parallelism
• Programming models
  • Native – all code on MIC
  • Offload – main code on CPU, other on MIC
  • Symmetric – both CPU and MIC using message passing
  • Upload – main code on MIC, other on CPU
Case study: NCBI BLAST
• The “Swiss Army knife” of biologists
• BLAST (Basic Local Alignment Search Tool) aligns nucleotide or protein sequences using fast heuristic algorithms to find regions of local similarity.
• Compares query sequences to sequence databases and calculates the statistical significance of matches.
• Search programs: blastp, blastn, blastx, psi-blast
• Query file and formatted database (FASTA format)
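A minimal Python sketch of how a FASTA query file could be parsed and grouped into query blocks for task-parallel searches. The names `read_fasta` and `split_into_blocks`, the block size, and the sample sequences are illustrative assumptions; they are not part of NCBI BLAST or HSP-wrap.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without the leading '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence data may span several lines
    if header is not None:
        yield header, "".join(chunks)


def split_into_blocks(records: Iterable[Record], block_size: int) -> Iterator[List[Record]]:
    """Group records into lists of at most block_size records."""
    block: List[Record] = []
    for record in records:
        block.append(record)
        if len(block) == block_size:
            yield block
            block = []
    if block:
        yield block  # final, possibly shorter, block


# Hypothetical three-query input, split into blocks of two queries.
SAMPLE = """>q1 hypothetical query
MKTAYIAK
QRQISFVK
>q2
MSDNE
>q3
GATTACA
"""
blocks = list(split_into_blocks(read_fasta(SAMPLE.splitlines()), block_size=2))
```

Each block can then be handed to a separate search process; the block size trades scheduling overhead against load balance.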
Highly-scalable parallel wrapper
• HSP-wrap
  • Software framework for scaling life science informatics applications to HPC environments via task parallelism
  • Bioinformatics and chemoinformatics domains
  • Portable → written in C/C++ and MPI
  • Load balance, parallel output, fault-tolerance, check-pointing
• Successfully ported tools
  • BLAST, HMMER, MUSCLE
  • DOCK6, AutoDock Vina, LINUS
HSP-wrap: architecture
[Figure: The Wrapper Approach. Components shown: the database (NR, Pfam, …), input queries, and Results 1…N on the Lustre FS; a master node holding the database and Query Blocks 1…M; worker nodes [1..N], each with a preloaded database and a query block in main memory, tool processes 1…P (BLAST, DOCK6, HMMER, …), an output buffer, compression, and a compressed buffer.]
Bhanu Rekapalli et al. BMC Bioinformatics 2013, 14: S3
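The master/worker layout in the architecture can be sketched as a shared task queue from which idle workers pull the next query block, which is one common way to obtain load balance. This is a sketch under stated assumptions: Python threads stand in for the MPI processes that HSP-wrap actually uses, and `run_tool`, `worker`, and `master` are hypothetical names, not HSP-wrap functions.

```python
import queue
import threading


def run_tool(block):
    """Hypothetical stand-in for a wrapped tool process (e.g. a BLAST search)."""
    return [f"{query}:done" for query in block]


def worker(worker_id, tasks, results):
    """Pull query blocks until the master sends the None sentinel."""
    while True:
        item = tasks.get()
        if item is None:
            break
        block_id, block = item
        results.put((block_id, worker_id, run_tool(block)))


def master(blocks, num_workers):
    """Distribute blocks dynamically and return outputs in block order."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(w, tasks, results))
        for w in range(num_workers)
    ]
    for t in threads:
        t.start()
    for block_id, block in enumerate(blocks):
        tasks.put((block_id, block))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # All workers have finished, so the results queue is complete.
    collected = {}
    while not results.empty():
        block_id, _worker_id, output = results.get()
        collected[block_id] = output
    return [collected[i] for i in range(len(blocks))]


# Hypothetical query blocks of uneven size.
query_blocks = [["q1", "q2"], ["q3"], ["q4", "q5", "q6"]]
outputs = master(query_blocks, num_workers=2)
```

Because workers request work only when idle, a slow block delays one worker rather than the whole run.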
HSP-wrap: memory management
• stdiowrap – module for file management
  • Function interposition to standard I/O calls
  • Minimal modification to original code (if any)
• Input file management
  • Files are mapped to main memory on-demand
  • Tracks parallel reads
• Output file management
  • Double buffered parallel support
  • Minimizes number of data transfers
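The double-buffered output idea can be illustrated with two in-memory buffers: the tool fills one while a background thread flushes the other, so many small writes become a few large transfers. This is a minimal sketch, assuming a Python thread in place of the real C implementation; the class `DoubleBufferedWriter` and its `capacity` parameter are hypothetical and are not the stdiowrap API.

```python
import io
import queue
import threading


class DoubleBufferedWriter:
    """Collect small writes in one buffer while the other is being flushed."""

    def __init__(self, sink, capacity):
        self._sink = sink                      # any object with write(bytes)
        self._capacity = capacity              # flush threshold in bytes
        self._buffers = [bytearray(), bytearray()]
        self._active = 0                       # index of the buffer being filled
        self._idle = threading.Event()         # set when no flush is in progress
        self._idle.set()
        self._pending = queue.Queue()
        self.transfers = 0                     # number of writes sent to the sink
        self._flusher = threading.Thread(target=self._drain)
        self._flusher.start()

    def _drain(self):
        while True:
            idx = self._pending.get()
            if idx is None:
                break
            self._sink.write(bytes(self._buffers[idx]))
            self._buffers[idx].clear()
            self.transfers += 1
            self._idle.set()

    def _swap(self):
        self._idle.wait()                      # the other buffer must be free
        self._idle.clear()
        full = self._active
        self._active = 1 - self._active        # keep writing into the free buffer
        self._pending.put(full)

    def write(self, data):
        self._buffers[self._active].extend(data)
        if len(self._buffers[self._active]) >= self._capacity:
            self._swap()

    def close(self):
        if self._buffers[self._active]:
            self._swap()                       # flush the partial last buffer
        self._pending.put(None)
        self._flusher.join()


sink = io.BytesIO()
writer = DoubleBufferedWriter(sink, capacity=8)
for _ in range(5):
    writer.write(b"abcd")                      # five 4-byte writes
writer.close()
```

In this run, five small writes reach the sink as three transfers while the byte order is preserved.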
Symmetric HSPH-BLAST
In symmetric execution, both Xeon and Xeon Phi processors are used as network hosts for distributed processing.
Weak scaling parameters

Xeon/Phi configuration   Input sequences   Worker nodes [Xeon, Phi]   Physical cores [Xeon, Phi] = Total
3x_8p                    17000             [2, 8]                     [48, 488] = 536
5x_16p                   34000             [4, 16]                    [80, 976] = 1056
9x_32p                   68000             [8, 32]                    [144, 1952] = 2096
17x_64p                  136000            [16, 64]                   [272, 3904] = 4176
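The arithmetic behind the weak scaling parameters can be checked with a short Python sketch: dividing each configuration's input size by its worker count gives a constant load per worker, which is what makes this a weak-scaling setup. The even split of sequences across worker nodes is an assumption made here for illustration; the slides do not state how sequences were assigned.

```python
# (input sequences, (Xeon workers, Phi workers), (Xeon cores, Phi cores))
configs = {
    "3x_8p": (17000, (2, 8), (48, 488)),
    "5x_16p": (34000, (4, 16), (80, 976)),
    "9x_32p": (68000, (8, 32), (144, 1952)),
    "17x_64p": (136000, (16, 64), (272, 3904)),
}

summary = {}
for name, (sequences, workers, cores) in configs.items():
    total_workers = sum(workers)
    summary[name] = {
        "total_cores": sum(cores),
        # Assumes an even split of sequences across all worker nodes.
        "sequences_per_worker": sequences / total_workers,
        "cores_per_phi": cores[1] / workers[1],
    }

for name, row in summary.items():
    print(name, row)
```

Every configuration works out to 1700 sequences per worker node and 61 cores per Xeon Phi, and the core totals match the table.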
Conclusions
• Parallel wrappers can be adapted to current informatics applications to greatly improve processing throughput and scalability on supercomputing platforms, because these applications share similar programming models and I/O characteristics.
• The wrapped tools can be used to identify species, perform DNA mapping, infer functional and evolutionary relationships between sequences, and help identify members of gene families.
• Symmetric weak scaling studies showed linear speedup and balanced workload distribution for coarse-grained parallelization of BLAST.
• Finer-grained vectorization of the BLAST process could improve utilization of Xeon Phi processors; we are collaborating with Intel engineers and NCBI BLAST developers to address this.
Future work
• Extend Highly-Scalable Parallel Hybrid Software Wrapper
  • Replication and fragmentation schemes for large data management
  • Input models: dot/cross product and hybrid approaches
• Make tools available as standard software modules on HPC architectures
• Integrate HSP-tools into scientific workflow pipelines to provide fast processing for high-impact scientific discovery
  • Incorporate with web-enabled science gateways
• Optimize NCBI BLAST for Xeon Phi processors and scale it with HSP-Wrap
• Adapt parallel wrappers to other primary tools used in informatics fields of the life sciences
• Build data analysis pipelines for novel data mining and large-‐scale knowledge discovery