‘How to prepare, cluster and sequence an NGS library’
AN OVERVIEW OF NGS IN THE GENOMICS CORE
– Introduction
– Understanding library prep
– Understanding clustering and sequencing
– Understanding instruments
– NGS QC
– NGS applications
A potted history of Illumina sequencing
Understanding library prep
Adapter ligation
End-repair
Adenylation
BioAnalyser
qPCR
PCR
“The next ten slides are the most important I’ll show today”
Illumina adapters: ask for the Illumina letter!
[Figure: adapter, PCR primer and sequencing primer sequences aligned against the Y-shaped adapter ligated to A-tailed insert DNA. Labels: ADAPTER, PCR PRIMER, SEQ PRIMER, Insert DNA, Index, PCR, Index SEQ]
Sequences shown (5’ to 3’):
– ACACTCTTTCCCTACACGACGCTCTTCCGATCT
– AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
– CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT
– Adapter strand with 3’ T overhang: ACACTCTTTCCCTACACGACGCTCTTCCGATCT
– Adapter strand with 5’ phosphate: P-GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
Oligonucleotide sequences © Illumina, Inc. All rights reserved.
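The relationships drawn in the adapter figure can be checked by reverse-complementing the sequences. The sketch below is only an illustration: the sequences are copied from the slide, while the variable names and the `revcomp` helper are assumptions introduced here and are not part of the original deck.

```python
# Sequences copied from the slide; all names below are illustrative only.
ADAPTER_TOP = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"             # strand with 3' T overhang
ADAPTER_BOTTOM = "GATCGGAAGAG" + "CTCGTATGCCGTCTTCTGCTTG"     # 5'-phosphorylated strand
PCR_PRIMER_1 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
PCR_PRIMER_2 = "CAAGCAGAAGACGGCATACGAGCTCTTCCGATCT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# The double-stranded stem of the Y: the last 12 bases of the top strand,
# minus the unpaired 3' T, pair with the first 11 bases of the bottom strand.
stem_top = ADAPTER_TOP[-12:-1]
stem_pairs = revcomp(ADAPTER_BOTTOM[:11]) == stem_top

# PCR primer 1 ends with the full top-strand adapter sequence.
primer1_covers_top = PCR_PRIMER_1.endswith(ADAPTER_TOP)

# PCR primer 2 is complementary to the bottom strand plus the A added to the insert.
primer2_matches_bottom = revcomp(PCR_PRIMER_2)[1:] == ADAPTER_BOTTOM

print(stem_pairs, primer1_covers_top, primer2_matches_bottom)
```

All three checks hold for the sequences as printed on the slide, which is a quick way to catch a transcription error when ordering custom oligos.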
The library prep spike
[Figure: plot with axes labelled “[DNA]” and “Illumina Processing”]
Understanding library prep – Nextera!
Understanding cluster generation (2500 etc)
A) Diluted & denatured libraries are annealed to lawn oligos at their 3’ end, and a polymerase creates a covalently attached copy of the library molecule.
B) The original strand is removed by denaturation with NaOH.
C) In non-denaturing conditions the library molecule bends and hybridises to a lawn oligo complementary to the 5’ end, and a polymerase creates a second covalently attached molecule. This amplification is repeated to create a cluster with around 1000 copies of the original library molecule.
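As a rough back-of-envelope check on the "around 1000 copies" figure, the sketch below assumes every amplification cycle perfectly doubles the number of attached molecules. That is an idealisation made only for this example; the slides do not state the cycle count or the per-cycle efficiency, and the function name is hypothetical.

```python
import math


def cycles_for_copies(target_copies):
    """Cycles needed to reach target_copies from one molecule, assuming ideal doubling."""
    return math.ceil(math.log2(target_copies))


# Under ideal doubling, 10 cycles give 2**10 = 1024 copies.
print(cycles_for_copies(1000), 2 ** cycles_for_copies(1000))
```

Real bridge amplification is less than 100% efficient per cycle, so more cycles than this idealised count would be needed in practice.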
D) Clusters are linearized by cleavage at the 3’ end of the original library molecule, and denaturation leaves the single-stranded DNA which will be sequenced. A sequencing primer is hybridised* and sequencing-by-synthesis generates the first read in your fastq file.
-) For single-end indexing the SBS template is removed by denaturation, and the index 1 sequencing primer is hybridised ready to generate index 1 (i7). Dual-indexing is complicated and differs on single- or paired-end flowcells, but the process is essentially the same to generate index 2 (i5).
E-G) For paired-end sequencing the SBS template is removed by denaturation, the cluster is re-amplified for several cycles and cleaved at the 5’ end, and the paired-end sequencing primer is hybridised ready to generate read 2.
*Beware: if you create new adapters let us know if you need a custom sequencing primer
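The order of reads described in steps D to G can be summarised in a few lines. This is a simplified sketch: `read_order` is a hypothetical helper, and placing index 2 before read 2 is an assumption, since the slide notes that dual-indexing differs between single- and paired-end flowcells.

```python
def read_order(paired_end, index_reads):
    """List the reads generated for a run, in the order described on the slide.

    paired_end  -- True if read 2 is generated after re-amplification
    index_reads -- 0, 1 (i7 only) or 2 (i7 and i5)
    """
    if index_reads not in (0, 1, 2):
        raise ValueError("index_reads must be 0, 1 or 2")
    order = ["Read 1"]
    if index_reads >= 1:
        order.append("Index 1 (i7)")
    if index_reads == 2:
        # Assumed position; the real chemistry differs by flowcell type.
        order.append("Index 2 (i5)")
    if paired_end:
        order.append("Read 2")
    return order


print(read_order(paired_end=True, index_reads=2))
```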
Understanding cluster generation (X Ten & 4000): Exclusion Amplification
The same hybridisation and solid-surface amplification occurs but in an all-in-one phase called “exclusion amplification” (ExAmp). Once a library molecule “lands” in a well it should occupy it completely.
Understanding sequencing
Understanding sequencing: Sanger-seq
Understanding sequencing: Pyro-seq
Understanding sequencing: Sequencing-by-synthesis
Understanding “sequencing by synthesis”
Instrument “colours”
HiSeq, MiSeq 4-colour SBS
NextSeq 2-colour SBS
Firefly 1-colour SBS?
Instruments explained – HiSeq 2500 & 4000
Different sequencing configurations
2500 Rapid: 150M reads
– SE 50bp: 85% Q30
– PE 250bp: 75% Q30
– PE 150: 2 days
2500 High output: 250M reads
– SE 50bp: 85% Q30
– PE 125bp: 80% Q30
– PE 125: 6 days
4000 High output: 312M reads
– SE 50bp: 85% Q30
– PE 150bp: 75% Q30
– PE 150: 3 days
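The read counts and read lengths above translate into an approximate yield in bases. The sketch below assumes that each "read" on the slide is a cluster that produces two reads in paired-end mode, and that every read reaches full length; neither assumption is stated in the slides, and the names are illustrative.

```python
# (clusters, longest paired-end read length in bp), taken from the slide.
CONFIGS = {
    "2500 Rapid": (150_000_000, 250),
    "2500 High output": (250_000_000, 125),
    "4000 High output": (312_000_000, 150),
}


def paired_end_yield_bases(clusters, read_length):
    """Total bases if every cluster yields two full-length reads (an idealisation)."""
    return clusters * 2 * read_length


for name, (clusters, read_length) in CONFIGS.items():
    gigabases = paired_end_yield_bases(clusters, read_length) / 1e9
    print(f"{name}: PE {read_length}bp gives about {gigabases:.1f} Gb")
```

Under these assumptions the three configurations come to roughly 75, 62.5 and 93.6 Gb respectively.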
HiSeq 4000 considerations
CLUSTERING IS VERY DIFFERENT FROM 2500
– PE150 - >125 is not great*
– %Q30 “passes Illumina spec”*
– ExAmp duplicates*
– Need to consider how you handle duplicates
– RNA-seq is fine
– Exome-seq is fine
– Genomes are fine
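One simple way to see how much duplicates matter for a data set is to measure the fraction of aligned reads that share a position and strand. The sketch below is a deliberately minimal illustration, not the method used by the Genomics Core or by any particular tool; established duplicate-marking tools also consider mate position and, for patterned flowcells, physical proximity on the flowcell. The function name and input format are assumptions.

```python
from collections import Counter


def duplicate_fraction(alignments):
    """Fraction of reads that repeat an already-seen (chrom, start, strand) key.

    alignments -- iterable of (chrom, start, strand) tuples, one per aligned read
    """
    counts = Counter(alignments)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    duplicates = total - len(counts)  # every copy beyond the first counts as a duplicate
    return duplicates / total


reads = [("chr1", 100, "+"), ("chr1", 100, "+"), ("chr1", 200, "-"), ("chr2", 100, "+")]
print(duplicate_fraction(reads))
```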
Instruments explained - MiSeq
~600bp fragments
+/- 50bp overlap
300bp reads
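The relationship between fragment size and read overlap is simple arithmetic: two 300bp reads cover 600bp, so a shorter fragment makes the reads overlap and a longer one leaves a gap. The sketch below reads the "+/- 50bp" on the slide that way, which is an interpretation rather than something the slide states; the function name is hypothetical.

```python
def read_overlap(fragment_length, read_length=300):
    """Overlap in bp between the two reads of a pair (negative means an unsequenced gap)."""
    return 2 * read_length - fragment_length


for fragment in (550, 600, 650):
    print(fragment, read_overlap(fragment))
```

A 550bp fragment gives a 50bp overlap, a 600bp fragment gives reads that just meet, and a 650bp fragment leaves a 50bp gap.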
Instruments explained - NextSeq
NGS QC – library prep
QUALITY CONTROL OF LIBRARIES IS IMPORTANT.
TITRATION FLOWCELLS AND FAILED RUNS ARE EXPENSIVE.
TRY TO IDENTIFY ISSUES BEFORE RUNNING ANY LANES.
QC IS SPECIFIC TO YOUR SAMPLES.
QUANTITATION OF LIBRARIES IS IMPORTANT.
SOME QC CAN ONLY BE DONE ONCE YOU HAVE GENERATED DATA.
Good
Bad
Bioanalyser qPCR Analysis
NGS QC – FastQC
NGS QC – MGA
LIBRARY QC – CONTAMINANT DETECTION
– SAMPLE 100,000 READS FROM FASTQ
– READS TRIMMED TO 36BP
– ALIGN TO MULTIPLE GENOMES USING BOWTIE
LIBRARY QC – ADAPTER DETECTION
– SAMPLE 100,000 READS FROM FASTQ
– READS CONVERTED TO FASTA
– ALIGN TO “ADAPT-OME” USING EXONERATE
LIBRARY QC – YIELD
– COUNT NUMBER OF READS (SINGLE-END ONLY)
– DISPLAY NUMBER ON A PRE-DEFINED SCALE
– DISPLAY LANES IN FLOWCELL CONFIGURATION
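The sampling, trimming and FASTA-conversion steps listed above can be sketched in plain Python. This is not MGA's own code: the function names are hypothetical, reservoir sampling is one possible way to draw a fixed-size random sample in a single pass, and the alignment steps with Bowtie and Exonerate are left out.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ handle."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        header, seq, _, qual = (line.rstrip("\n") for line in record)
        yield header[1:], seq, qual


def sample_reads(records, n=100_000, seed=0):
    """Reservoir-sample up to n records in one pass over the input."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < n:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                sample[j] = rec
    return sample


def trim(records, length=36):
    """Keep the first `length` bases (and qualities) of each read."""
    return [(name, seq[:length], qual[:length]) for name, seq, qual in records]


def to_fasta(records):
    """Render records as FASTA text, dropping the quality strings."""
    return "".join(f">{name}\n{seq}\n" for name, seq, _ in records)
```

A typical call would be `to_fasta(trim(sample_reads(read_fastq(open(path)))))`, where `path` points at an uncompressed FASTQ file; the resulting text can then be passed to whichever aligner is in use.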
The Genomics Core sequencing services
James Hadfield NEB March 2016
Service metrics Jan 2016
– Turnaround time (TAT) has been 2-3 weeks (often as little as 1 week)
– Most sequencing works very well, but…
NGS methods
A genomic case report
NFKBIA S32G
SIFT: deleterious (0)
PolyPhen: probably_damaging (0.979)