Reference genome assemblies: Resources & updates from the GRC Valerie Schneider, Ph.D. NCBI 16 July 2016 https://genomereference.o

TAGC2016 schneider

Page 1: TAGC2016 schneider

Reference genome assemblies: Resources & updates from the GRC

Valerie Schneider, Ph.D.

NCBI

16 July 2016

https://genomereference.org

Page 2: TAGC2016 schneider

Outline

• GRC Introduction
• Assembly updates
• Assembly resources

Page 3: TAGC2016 schneider

https://genomereference.org

Twitter: @[email protected]

Page 4: TAGC2016 schneider
Page 5: TAGC2016 schneider

Outline

• GRC Introduction
• Assembly updates
• Assembly resources

Page 6: TAGC2016 schneider

Assembly Model

Assembly (e.g. GRCm38)

• Primary Assembly Unit (C57BL/6J)
• Non-nuclear assembly unit (e.g. MT)
• Strain-labelled units: 129S6/SvEvTac, 129S1/SvImJ, 129X1/SvJ, NOD/ShiLtJ, NOD/MrkTac
• PAR
• Genomic Region (MHC)
• Genomic Region (DiGeorge)
• Genomic Region (Ren2)

GRCm38 Alternate Loci Strains

• A/J
• AKR/J
• BALB/c
• CAST/Ei
• 129S6/SvEvTac
• 129P2/OlaHsd
• 129S2/SvPas
• 129S1/SvImJ
• 129X1/SvJ
• 129S7/SvEvBrd-Hprt-b-m2
• NOD/MrkTac
• NOD/ShiLtJ
• RIII

Page 7: TAGC2016 schneider

Assembly Updates

Assembly (e.g. GRCm38.p5)

• Primary Assembly Unit (C57BL/6J)
• Non-nuclear assembly unit (e.g. MT)
• Strain-labelled units: 129S6/SvEvTac, 129S1/SvImJ, 129X1/SvJ, NOD/ShiLtJ, NOD/MrkTac
• PAR
• Genomic Region (MHC)
• Genomic Region (DiGeorge)
• Genomic Region (Ren2)

Patches

• Genomic Region (Sftpb)
• Genomic Region (Nlrp4g)
• Genomic Region (Meg3)

Patch type | Scaffold status at next major assembly release
FIX | -- (integrated)
NOVEL | ALT LOCI
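
The FIX/NOVEL patch summary on this slide can be read as a simple lookup from patch type to scaffold status at the next major release. The following is a minimal Python sketch of that lookup, assuming the pairing FIX -> integrated and NOVEL -> ALT LOCI as described in the GRC's patch definitions. The names `PatchType`, `STATUS_AT_NEXT_MAJOR_RELEASE` and `status_at_next_major_release` are hypothetical and are not part of any GRC or NCBI tool.

```python
from enum import Enum


class PatchType(Enum):
    """The two patch scaffold types named on the slide."""
    FIX = "FIX"
    NOVEL = "NOVEL"


# Scaffold status at the next major assembly release, per patch type.
# Assumption: FIX patches are integrated into the updated assembly and
# NOVEL patches are kept as alternate loci, as the slide summarises.
STATUS_AT_NEXT_MAJOR_RELEASE = {
    PatchType.FIX: "integrated",
    PatchType.NOVEL: "ALT LOCI",
}


def status_at_next_major_release(patch_type: str) -> str:
    """Return the expected scaffold status for a patch type given as text."""
    # Normalise case and whitespace, then look the value up via the enum.
    return STATUS_AT_NEXT_MAJOR_RELEASE[PatchType(patch_type.strip().upper())]


if __name__ == "__main__":
    for name in ("FIX", "NOVEL"):
        print(f"{name}: {status_at_next_major_release(name)}")
```

Using an enum rather than bare strings means an unrecognised patch type raises a `ValueError` instead of silently returning nothing.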

Page 8: TAGC2016 schneider

Assembly Updates: Mouse

http://geval.sanger.ac.uk/index.html

http://www.ncbi.nlm.nih.gov/tools/gbench/

Page 9: TAGC2016 schneider

Assembly Updates: Mouse

GRCm38.p5 Fix Patches (GCA_000001635.7)

• Rims1
• Traf5
• Ptpmt1
• Spata5l1
• Auts2
• Jakmip3
• Muc2
• Rab3a
• Ifi30
• Nadk2
• Ide

GRCm39?
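
The accession shown with the GRCm38.p5 fix patches, GCA_000001635.7, follows the usual assembly accession layout: a `GCA` (GenBank) or `GCF` (RefSeq) prefix, a nine-digit identifier, and a version after the dot. The following is a minimal Python sketch for splitting such an accession into its parts. The names `AssemblyAccession` and `parse_assembly_accession` are hypothetical helpers, not an NCBI API, and the pattern is an assumption based on that layout.

```python
import re
from typing import NamedTuple


class AssemblyAccession(NamedTuple):
    prefix: str   # "GCA" (GenBank) or "GCF" (RefSeq)
    number: str   # nine-digit identifier, kept as text to preserve leading zeros
    version: int  # numeric suffix after the dot


# Assumed layout: prefix, underscore, nine digits, dot, version number.
ACCESSION_RE = re.compile(r"^(GC[AF])_(\d{9})\.(\d+)$")


def parse_assembly_accession(text: str) -> AssemblyAccession:
    """Split an assembly accession such as 'GCA_000001635.7' into its parts."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not an assembly accession: {text!r}")
    prefix, number, version = match.groups()
    return AssemblyAccession(prefix, number, int(version))


if __name__ == "__main__":
    print(parse_assembly_accession("GCA_000001635.7"))
```

Keeping the identifier as a string avoids losing the leading zeros, while the version is converted to an integer so that releases of the same assembly can be compared.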

Page 10: TAGC2016 schneider

INSDC Submitted Assemblies

Mouse strains poster: M5101B (Will Chow)

Page 11: TAGC2016 schneider

Assembly Updates: Zebrafish

GRCz11: Planned for the end of 2016

• Finish remaining clones & integrate into assembly
• Find “missing genes”
• Resolve path issues
• Integrate WGS into assembly gaps
• Create alternate loci for haplotypic duplications & indels affecting gene models
• Poster: Z6085A (K. Howe)

Page 12: TAGC2016 schneider

Assembly Updates: Zebrafish

GRCz11: Planned for the end of 2016

• WTSI -> ZFIN transition
• Curation: Active -> Passive
• Patch releases
• Annotation: Manual -> Automated
• Ensembl
• RefSeq

Page 13: TAGC2016 schneider

Outline

• GRC Introduction
• Assembly updates
• Assembly resources

Page 14: TAGC2016 schneider

Assembly Resources

https://genomereference.org

Page 15: TAGC2016 schneider
Page 16: TAGC2016 schneider

http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/mouse/issues/?id=MG-117

Page 17: TAGC2016 schneider
Page 18: TAGC2016 schneider
Page 19: TAGC2016 schneider

https://www.ncbi.nlm.nih.gov/

Assembly name, assembly accession, organism name

Page 20: TAGC2016 schneider

(latest RefSeq)

Page 21: TAGC2016 schneider
Page 22: TAGC2016 schneider
Page 24: TAGC2016 schneider

Acknowledgements

GRC SAB
• Rick Myers
• Granger Sutton
• Evan Eichler
• Jim Kent
• Roderic Guigo
• Carol Bult
• Derek Stemple
• Jan Korbel
• Liz Worthey
• Matthew Hurles
• Richard Gibbs

GRC
• Tina Graves-Lindsay
• Kerstin Howe
• Richard Durbin
• Paul Flicek
• Laura Clarke
• Monte Westerfield
• Deanna Church
• Curators!
• Developers!

GRC Mouse/Zfish Collaborators
• NCBI RefSeq/Gene
• HAVANA annotators
• Peter Lansdorp
• Mark Hills
• Derek Stemple
• David Page
• WTSI NOD Idd team

NCBI Support
• Genome Browser team
• Assembly DB
• Gpipe annotation team
• Clone DB
• Remapping Service

https://genomereference.org

For more info: poster M5055A

Page 25: TAGC2016 schneider

Utilizing NCBI Databases for Model Organism Research

News: www.ncbi.nlm.nih.gov/news/

Contact us: [email protected]

Time | Topic | Speaker | Poster Number
8:00 – 8:25 | The 3 W’s of Sequence Data Submission: What, Where, and When | Ilene Mizrachi | —
8:25 – 8:45 | Reference genome assemblies: resources and updates from the GRC | Valerie Schneider | M5055/A
8:45 – 9:10 | How to annotate for 300 species: the awesome power of NCBI’s eukaryotic genome annotation pipeline | Terence Murphy | D1524/B
9:10 – 9:35 | An introduction to NCBI’s RefSeq and Gene resources | Tripti Gupta | M5104/B
9:35 – 9:55 | Optimizing use of NCBI databases to analyze your favorite gene | Nuala O’Leary | Z6088/A