Major germplasm data sources and referatories Dr. Vassilis Protonotarios Agro-Know Technologies, Greece e-Conference on Germplasm Data Interoperability Session 3: “Setting up an infra for the Germplasm Data” Dr. Guntram Geser Salzburg Research, Austria



Presentation of some of the major germplasm data sources, including aggregators, networks and individual data providers. The information is based on the agINFRA Dossier on Germplasm Data sources (available at http://wiki.aginfra.eu/index.php/Germplasm_Working_Group). Presented during Session 3 of the 1st International e-Conference on Germplasm Data Interoperability (https://sites.google.com/site/germplasminteroperability/).


Page 1: Major germplasm data sources and referatories

Major germplasm data sources and referatories

Dr. Vassilis Protonotarios
Agro-Know Technologies, Greece

e-Conference on Germplasm Data Interoperability
Session 3: “Setting up an infra for the Germplasm Data”

Dr. Guntram Geser
Salzburg Research, Austria

Page 2: Major germplasm data sources and referatories

Structure of the presentation

1. Introduction
2. Germplasm data aggregators
3. Germplasm data sources
4. Conclusions

Page 3: Major germplasm data sources and referatories

Aim of presentation

• To provide an overview of the major germplasm data sources
  – Not possible to cover all of them
  – Information mostly based on “agINFRA Dossier on Germplasm Information” (2012)

Page 4: Major germplasm data sources and referatories

Germplasm data aggregators: Genesys

Page 5: Major germplasm data sources and referatories

About Genesys

• Developed by Bioversity International on behalf of
  – the System-wide Genetic Resources Programme of the CGIAR,
  – the Global Crop Diversity Trust and
  – the Secretariat of the International Treaty on Plant Genetic Resources for Food and Agriculture.
• Genesys was launched on 13 May 2011

URL: http://www.genesys-pgr.org/

Page 6: Major germplasm data sources and referatories

Data aggregation

1. SINGER (Systems-Wide Information Network for Genetic Resources)

2. EURISCO (European National Germplasm Inventories)

3. GRIN (Germplasm Resources Information Network) of USDA
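The slides do not describe how Genesys merges its three inputs, so the following is only a minimal sketch of one way an aggregator could combine passport records from several providers. It assumes that records are identified by the pair of holding-institute code and accession number (the MCPD descriptors `INSTCODE` and `ACCENUMB`); the function name, the `providers` field and all sample values are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# A passport record is modelled as a plain dict. The keys INSTCODE and
# ACCENUMB follow MCPD naming; everything else here is hypothetical.
Record = Dict[str, str]
Key = Tuple[str, str]


def aggregate(sources: Dict[str, Iterable[Record]]) -> Dict[Key, Record]:
    """Merge records from several providers into one catalogue.

    Records are keyed by (holding institute code, accession number).
    When two providers report the same key, the first record seen is kept
    and later providers are only appended to the 'providers' field.
    """
    catalogue: Dict[Key, Record] = {}
    for provider, records in sources.items():
        for rec in records:
            key = (rec["INSTCODE"], rec["ACCENUMB"])
            if key not in catalogue:
                merged = dict(rec)  # copy so the input is not modified
                merged["providers"] = provider
                catalogue[key] = merged
            else:
                catalogue[key]["providers"] += "," + provider
    return catalogue


# Made-up sample data, for illustration only.
sample_sources = {
    "SINGER": [{"INSTCODE": "XXX001", "ACCENUMB": "A-1", "GENUS": "Oryza"}],
    "EURISCO": [{"INSTCODE": "XXX002", "ACCENUMB": "B-7", "GENUS": "Hordeum"}],
    "GRIN": [{"INSTCODE": "XXX001", "ACCENUMB": "A-1", "GENUS": "Oryza"}],
}

catalogue = aggregate(sample_sources)
```

Keeping the first record and only noting later providers is the simplest possible conflict rule; a real aggregator would presumably need a policy for reconciling differing field values.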

Page 7: Major germplasm data sources and referatories

Genesys data

• Provides access to almost 2.5M germplasm accessions held by 356 institutions in 238 countries.
  – This covers about one third of the genebank accessions estimated to be held worldwide.
• Contains over 11 million Characterization and Evaluation (C&E) records
• Includes environmental data records for the more than 600,000 geo-referenced sites where accessions were collected.

Page 8: Major germplasm data sources and referatories

Metadata model used

• Schema based on MCPD (Multi-Crop Passport Descriptors)
  – Adopted by SINGER & EURISCO
• Expanded to include Characterization & Evaluation (C&E) data
  – As used by GRIN and CGIAR genebanks
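To make the idea of a passport schema extended with C&E data concrete, here is a minimal sketch. The passport field names are a small subset of real MCPD descriptors; the `Observation` and `Accession` classes, and all sample values, are assumptions made for illustration and are not taken from the Genesys schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PassportRecord:
    """A small subset of MCPD passport descriptors."""
    INSTCODE: str        # code of the institute holding the accession
    ACCENUMB: str        # accession number within that institute
    GENUS: str
    SPECIES: str
    ORIGCTY: str = ""    # country of origin, three-letter ISO 3166-1 code


@dataclass
class Observation:
    """One C&E value. This structure is hypothetical, not part of MCPD."""
    trait: str
    value: str
    unit: str = ""


@dataclass
class Accession:
    """Passport data plus zero or more C&E observations."""
    passport: PassportRecord
    observations: List[Observation] = field(default_factory=list)

    def traits(self) -> Dict[str, str]:
        """Return the observations as a trait -> 'value unit' mapping."""
        return {o.trait: f"{o.value} {o.unit}".strip() for o in self.observations}


# Made-up example accession.
acc = Accession(
    passport=PassportRecord("XXX001", "A-1", "Triticum", "aestivum", "TUR"),
    observations=[
        Observation("plant height", "95", "cm"),
        Observation("awn presence", "present"),
    ],
)
trait_summary = acc.traits()
```

Separating the crop-independent passport part from the open-ended list of observations mirrors the split the slide describes: the passport fields are fixed, while C&E traits vary by crop.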

Page 9: Major germplasm data sources and referatories

Germplasm data aggregators: EURISCO

Page 10: Major germplasm data sources and referatories

About EURISCO

• EURISCO: based on a European network of ex situ National Inventories (NIs) that makes the European plant genetic resources data available everywhere in the world.

• Maintained by Bioversity International on behalf of the Secretariat of the European Cooperative Programme for Plant Genetic Resources (ECPGR) in collaboration with the National Focal Points for the National Inventories.

URL: http://eurisco.ecpgr.org

Page 11: Major germplasm data sources and referatories

Data aggregation

1. National Focal Points
   – the unique link between the EURISCO and the European National Inventories (NIs) and national documentation systems.
2. National Inventories
   – A number of European countries have established national PGR inventories that are available on the web.

Page 12: Major germplasm data sources and referatories

EURISCO germplasm data

• The EURISCO Catalogue contains passport data on more than 1.1M samples of crop diversity
  – representing >5,600 genera and >36,000 species (genus-species combinations including synonyms and spelling variants)
  – from 43 countries

* As of May 2012

Page 13: Major germplasm data sources and referatories

Metadata model used

• Schema based on MCPD
  – Adopted by SINGER & EURISCO
• Expanded to include Characterization & Evaluation (C&E) data
  – As used by GRIN and CGIAR genebanks

Page 14: Major germplasm data sources and referatories

Germplasm data aggregators: Global Biodiversity Information Facility (GBIF)

Page 15: Major germplasm data sources and referatories

About GBIF

• An international open data infrastructure, funded by governments.

• Operates through a network of nodes, coordinating the biodiversity information facilities of participant countries and organizations, collaborating with each other and the Secretariat

URL: http://www.gbif.org

Page 16: Major germplasm data sources and referatories

Data aggregation

• Data aggregated from almost 750 data sources, including
  – Laboratories
  – Research centers
  – Corporations
  – Museums
  – NGOs
  – Universities

Page 17: Major germplasm data sources and referatories

GBIF data

Page 18: Major germplasm data sources and referatories

Metadata model used

• Dataset level: GBIF Metadata Application Profile
  – http://www.gbif.org/resources/2559
• Data level:
  – Darwin Core extension for germplasm (DwC-G)
  – Ecological Metadata Language (EML)
  – MCPD / EURISCO descriptors
  – Others?
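Because MCPD and Darwin Core both appear at the data level, a crosswalk between them is one of the mappings an aggregator needs. The sketch below is illustrative only: the keys are real MCPD descriptors and the values are real Darwin Core terms, but the pairing is an assumption made here and is not the official DwC-G definition. The function name and sample record are hypothetical.

```python
from typing import Dict, Tuple

# Illustrative crosswalk from MCPD descriptors to Darwin Core terms.
# The pairing is an assumption for this sketch, not an official mapping.
MCPD_TO_DWC: Dict[str, str] = {
    "INSTCODE": "institutionCode",
    "ACCENUMB": "catalogNumber",
    "COLLNUMB": "recordNumber",
    "GENUS": "genus",
    "SPECIES": "specificEpithet",
    "DECLATITUDE": "decimalLatitude",
    "DECLONGITUDE": "decimalLongitude",
}


def mcpd_to_dwc(record: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Rename MCPD fields to Darwin Core terms.

    Returns (mapped, unmapped): fields without a crosswalk entry are
    returned separately instead of being silently dropped.
    """
    mapped: Dict[str, str] = {}
    unmapped: Dict[str, str] = {}
    for name, value in record.items():
        if name in MCPD_TO_DWC:
            mapped[MCPD_TO_DWC[name]] = value
        else:
            unmapped[name] = value
    return mapped, unmapped


# Made-up sample record; SAMPSTAT has no entry in the crosswalk above.
sample = {
    "INSTCODE": "XXX001",
    "ACCENUMB": "A-1",
    "GENUS": "Triticum",
    "SPECIES": "aestivum",
    "SAMPSTAT": "300",
}
mapped, unmapped = mcpd_to_dwc(sample)
```

Returning the unmapped fields makes the gaps visible: germplasm-specific descriptors with no plain Darwin Core counterpart are exactly what an extension such as DwC-G would have to cover.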

Page 19: Major germplasm data sources and referatories

USDA Germplasm Resources Information Network (GRIN)

Page 20: Major germplasm data sources and referatories

About GRIN

• Developed by the U.S. Department of Agriculture / Agricultural Research Service

• Aims to acquire, characterize, preserve, document, and distribute to scientists, germplasm of all lifeforms important for food and agricultural production.

URL: http://www.ars-grin.gov/

Page 21: Major germplasm data sources and referatories

Data aggregation

• Data aggregated from more than thirty germplasm USDA / ARS data sources, including
  – Arctic and Subarctic Plant Gene Bank Research centers
  – Desert Legume Program
  – Forest Service National Seed Lab
  – Maize Genetic Stock Center
  – National Arid Land Plant Genetic Resources Unit

Page 22: Major germplasm data sources and referatories

GRIN data

• Data available through GRIN includes
  – Passport,
  – Characterization & Evaluation,
  – Inventory and
  – Distribution data
• > 500,000 accessions (distinct varieties of plants) in the GRIN database.
  – Representing >10,000 species of plants

Page 23: Major germplasm data sources and referatories

Metadata model used

• Passport information
  – Crop independent
  – Own schema*
• Crop descriptors**
  – Crop specific

* http://www.ars-grin.gov/npgs/pcgrin/manual/genlinfo.htm
** http://www.ars-grin.gov/npgs/pcgrin/manual/concepts.htm

Page 24: Major germplasm data sources and referatories

ECPGR Germplasm Data

Page 25: Major germplasm data sources and referatories

ECPGR Germplasm Databases

• European Cooperative Programme for Plant Genetic Resources (ECPGR)
  – founded in 1980 on the basis of the recommendations of
    • the United Nations Development Programme (UNDP),
    • the Food and Agriculture Organization of the United Nations (FAO) and
    • the Genebank Committee of the European Association for Research on Plant Breeding (EUCARPIA).

URL: http://www.ecpgr.cgiar.org/germplasm_databases.html

Page 26: Major germplasm data sources and referatories

ECPGR Data sources

• 64 ECPGR Central Crop Databases have been established by individual institutes and the ECPGR Working Groups.

• The databases hold passport data and, to varying degrees, characterization and primary evaluation data of the major collections of the respective crops.

Page 27: Major germplasm data sources and referatories

ECPGR germplasm data

• ECPGR offers Web access to specific crop and multi-crop databases:
1. ECPGR Central Crop Databases and other Crop Databases
   ECPGR and other Central Crop Databases have been established through the initiative of individual institutes and of ECPGR Working Groups. The databases hold passport data and, to varying degrees, characterization and primary evaluation data of the major collections of the respective crops in Europe.
2. Germplasm Collecting Missions Database
3. International Multi-crop Databases
4. National Multi-crop Databases

Page 28: Major germplasm data sources and referatories

Crop Genebank Knowledge Base

Page 29: Major germplasm data sources and referatories

Crop Genebank Knowledge Base

• An initiative of the System-wide Genetic Resources Programme (SGRP) of the Consultative Group on International Agricultural Research (CGIAR).

• Developed as part of the World Bank funded project “Collective Action for the Rehabilitation of Global Public Goods in the CGIAR Genetic Resources System, Phase 2 (GPG 2)”.

URL: http://cropgenebank.sgrp.cgiar.org

Page 30: Major germplasm data sources and referatories

CGKB data

• A user-friendly online access to procedures, standards and practices for managing clonally propagated and seed crops held in genebanks.
• Best practices in the framework of a learning platform.
• Links to other related information and training resources.
• A mechanism to update the existing best practices for crop management in genebanks and to develop best practices for additional crops.
• Build capacity of genebank curators and technicians.

Page 31: Major germplasm data sources and referatories

A European Genebank Integrated System (AEGIS)

Page 32: Major germplasm data sources and referatories

A European Genebank Integrated System (AEGIS)

• Developed by the European Cooperative Programme for Plant Genetic Resources (ECPGR)

• Supports the coordination of plant genetic resources for food and agriculture (PGRFA)

URL: http://cropgenebank.sgrp.cgiar.org

Page 33: Major germplasm data sources and referatories

AEGIS data

• The European Collection
  – operated as a virtual European genebank,
  – composed of European Accessions conserved for the long-term by the AEGIS Associate Members on behalf of the ECPGR Member countries and being available for use or conservation only for the purposes of research, breeding and training for food and agriculture.

Page 34: Major germplasm data sources and referatories

NordGen - Nordic Genetic Resource Center

Page 35: Major germplasm data sources and referatories

NordGen

• Nordic organization supporting the coordination of Nordic plant genetic resources for food and agriculture (PGRFA)

• Joint initiative of all Nordic countries
  – Denmark,
  – Finland,
  – Iceland,
  – Norway,
  – Sweden

URL: http://www.nordgen.org

Page 36: Major germplasm data sources and referatories

NordGen SESTO

• SESTO Genebank Documentation System
  – A genebank management tool developed by the Nordic Gene Bank (today Nordic Genetic Resource Center, NordGen).
  – Developed into a more generic PGR information system
  – Adopted for management and presentation of data from other genebanks in other parts of the world.

Page 37: Major germplasm data sources and referatories

NordGen Data

• Information available for:
  – Trait Datasets
  – Trait Descriptors
  – Germplasm accessions
  – Observations

Page 38: Major germplasm data sources and referatories

Conclusions

Page 39: Major germplasm data sources and referatories

Conclusions (1/2)

• Germplasm data available from several sources
  – Global aggregators
  – National Inventories/aggregators
  – Individual data sources
• Different metadata used in each case
  – Highlights the need for harmonization of standards
  – Standard vocabularies will allow linked data approach

Page 40: Major germplasm data sources and referatories

Conclusions (2/2)

• Aggregation of metadata facilitates harmonization
  – Application of a common standard for several data sources
• Exposure of metadata as linked data per aggregator
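As a rough illustration of what exposing metadata as linked data could look like, the sketch below serializes one record as N-Triples using Darwin Core term URIs as predicates. The Darwin Core namespace is real; the `example.org` base URI, the URI pattern, the function names and the sample record are all assumptions made for this sketch, and the presentation does not prescribe any particular serialization.

```python
from typing import Dict
from urllib.parse import quote

DWC_NS = "http://rs.tdwg.org/dwc/terms/"     # Darwin Core terms namespace
BASE_URI = "http://example.org/accession/"   # hypothetical aggregator base URI


def accession_uri(institution_code: str, catalog_number: str) -> str:
    """Build a (hypothetical) stable URI for one accession."""
    return f"{BASE_URI}{quote(institution_code)}/{quote(catalog_number)}"


def to_ntriples(subject_uri: str, record: Dict[str, str]) -> str:
    """Serialize a flat record of Darwin Core terms as N-Triples.

    Every value is written as a plain string literal; terms are emitted
    in sorted order so the output is deterministic.
    """
    lines = []
    for term, value in sorted(record.items()):
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        lines.append(f'<{subject_uri}> <{DWC_NS}{term}> "{escaped}" .')
    return "\n".join(lines)


# Made-up sample record using Darwin Core term names.
record = {
    "institutionCode": "XXX001",
    "catalogNumber": "A-1",
    "genus": "Triticum",
}
ntriples = to_ntriples(accession_uri("XXX001", "A-1"), record)
```

Plain string literals keep the sketch short; in practice, shared vocabularies would let values such as taxon names or country codes be URIs that link to other datasets, which is where the linked data approach pays off.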

Page 41: Major germplasm data sources and referatories

Next steps

• Complete pending mapping between existing standards
  – Work of bioinformatics experts
• Define common vocabularies
  – Work of germplasm experts
• Deploy a linked germplasm data framework
  – agINFRA could help with that

Page 42: Major germplasm data sources and referatories

References

• Geser, Guntram (2012) “agINFRA Dossier on Germplasm information”. Available online at: http://wiki.aginfra.eu/index.php/Germplasm_Working_Group

• Websites of data sources mentioned in the presentation

Page 43: Major germplasm data sources and referatories

Source: http://verastic.com/social/why-do-people-not-say-thank-you.html

Contact me: [email protected]