Databases NCBI - ENTREZ

Databases

NCBI - ENTREZ

Data & Software Resources

BLAST                                                FTP Site
CDD                                                  Structure (MMDB)
COG                                                  Trace Archive
GENSAT                                               UniGene
GenBank                                              UniVec
Whole Genome Shotgun Sequences                       GenPept
Gene                                                 dbGaP Open-Access Data
Gene Expression Nervous System Atlas (GENSAT)        dbMHC Data
Gene Expression Omnibus (GEO) Profiles and Datasets  RSS Feeds
Genome                                               Sequin
Genome Markers (UniSTS)                              tbl2asn
HomoloGene                                           Batch Entrez
Mapping Data                                         CDTree
NCBI Taxonomy                                        Cn3D
Protein Clusters                                     E-Utilities
PubChem                                              NCBI Toolbox
RefSeq                                               ProSplign
SKY/M-FISH and CGH Data                              Splign
Sequence Read Archive

Just the upper left corner of “moi”

Just the lower left corner of “moi”

“*” is not a wildcard; it is a truncation symbol
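
To make the distinction concrete, the sketch below emulates truncation locally: a trailing “*” matches every word that begins with the given root, and nothing else. The function name `matches_truncation` and the sample word list are hypothetical; Entrez expands the root against its own index, which this does not reproduce.

```python
def matches_truncation(pattern: str, word: str) -> bool:
    """Emulate Entrez-style truncation: 'root*' matches words starting with root."""
    if pattern.endswith("*"):
        root = pattern[:-1]
        return word.lower().startswith(root.lower())
    # Without a trailing '*', require an exact (case-insensitive) match.
    return word.lower() == pattern.lower()


# Hypothetical index terms, for illustration only.
words = ["germin", "germination", "german", "oxalate"]
hits = [w for w in words if matches_truncation("germin*", w)]
print(hits)
```

Only the end of the term is open-ended here, which is what separates truncation from a general wildcard that could stand anywhere in the term.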

E.g. #1 #2 NOT #3

Combine Searches

● Use of Boolean terms for search
  – AND
  – OR
  – NOT

● General syntax:
  – term [field] OPERATOR term [field]

● Use of brackets to combine the terms
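
The general syntax can be sketched as a small query builder. The helper names `tag` and `combine` are hypothetical, and the search terms are illustrative; only the `term[field] OPERATOR term[field]` pattern and the field short terms (TITL, ORGN) come from these slides.

```python
def tag(term: str, field: str) -> str:
    """Attach a field short term to a search term, e.g. 'germin[TITL]'."""
    return f"{term}[{field}]"


def combine(operator: str, *parts: str) -> str:
    """Join sub-queries with an upper-case Boolean operator and wrap them in brackets."""
    op = operator.upper()
    if op not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {op} ".join(parts) + ")"


# Illustrative terms: title word 'germin' restricted to either of two organisms.
organisms = combine(
    "OR",
    tag("Arabidopsis thaliana", "ORGN"),
    tag("Oryza sativa", "ORGN"),
)
query = combine("AND", tag("germin", "TITL"), organisms)
print(query)
```

Wrapping every combination in brackets makes the grouping explicit, so the result does not depend on the order in which the operators are evaluated.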

                               Available for Database
Field              Short term  Nucleotide  Protein  Genome  Structure  PopSet
Accession          ACCN        YES         YES      YES     YES        YES
All Fields         ALL         YES         YES      YES     YES        YES
Author Name        AUTH        YES         YES      YES     YES        YES
EC/RN Number       ECNO        YES         YES      YES     YES        YES
Feature Key        FKEY        YES         NO       YES     NO         YES
Filter             FILT        YES         YES      YES     YES        YES
Gene Name          GENE        YES         YES      YES     NO         YES
Issue              ISS         YES         YES      YES     YES        YES
Journal Name       JOUR        YES         YES      YES     YES        YES
Keyword            KYWD        YES         YES      YES     NO         YES
Modification Date  MDAT        YES         YES      YES     YES        YES
Molecular Weight   MOLWT       NO          YES      NO      NO         NO
Organism           ORGN        YES         YES      YES     YES        YES
Page Number        PAGE        YES         YES      YES     YES        YES
Primary Accession  PACC        YES         YES      YES     NO         YES
Properties         PROP        YES         YES      YES     NO         YES
Protein Name       PROT        YES         YES      YES     NO         YES
Publication Date   PDAT        YES         YES      YES     YES        YES
SeqID String       SQID        YES         YES      YES     NO         YES
Sequence Length    SLEN        YES         YES      YES     NO         NO
Substance Name     SUBS        YES         YES      NO      YES        NO
Text Word          WORD        YES         YES      YES     YES        YES
Title Word         TITL        YES         YES      YES     NO         NO
Volume             VOL         YES         YES      YES     YES        YES
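
Because not every field exists in every database, it can help to check a field before using it in a query. The sketch below encodes a handful of rows from the availability table as a lookup; the names `AVAILABILITY`, `is_available` and `usable_fields` are hypothetical, and only five of the fields are included to keep the example short.

```python
DATABASES = ("Nucleotide", "Protein", "Genome", "Structure", "PopSet")

# A few rows copied from the table: short term -> YES/NO per database, in column order.
AVAILABILITY = {
    "ACCN": (True, True, True, True, True),
    "FKEY": (True, False, True, False, True),
    "MOLWT": (False, True, False, False, False),
    "SLEN": (True, True, True, False, False),
    "SUBS": (True, True, False, True, False),
}


def is_available(field: str, database: str) -> bool:
    """Return True if the short term can be used in the given database."""
    return AVAILABILITY[field][DATABASES.index(database)]


def usable_fields(database: str) -> list:
    """List the sampled short terms that the given database accepts."""
    return [f for f in AVAILABILITY if is_available(f, database)]


print(usable_fields("Protein"))
print(usable_fields("Structure"))
```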

PubMed ENTREZ search fields

Field                   Short term  Field                     Short term
Affiliation             AD          All Fields                ALL
Author                  AU          Corporate Author          CN
EC/RN Number            RN          Entrez Date               EDAT
Filter                  FILTER      First Author              1AU
Full Author Name        FAU         Grant Name                GR
Issue                   IP          Investigator              IR
Journal Title           TA          Language                  LA
MeSH Date               MHDA        MeSH Major Topic          MAJR
MeSH Subheadings        SH          MeSH Terms                MH
NLM Unique ID           JID         Other Term                OT
Pagination              PG          Personal Name as Subject  PS
Pharmacological Action  PA          Place of Publication      PL
Publication Date        DP          Publication Type          PT
Publisher Identifier    AID         Secondary Source ID       SI
Subset                  SB          Substance Name            NM
Text Word               TW          Title                     TI
Title / Abstract        TIAB        Unique Identifiers        UID
Volume                  VI

Can you find the enhancers/promoters for GLP3 (germin-like protein 3)?

● Range operator “:” (ACCN, MOLWT, SLEN)
  – x : y [SLEN]
  – works with dates and molecular weight

● For more information: http://www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html
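
As a minimal sketch of the range operator, the snippet below builds `x:y[FIELD]` expressions and joins them with AND. The helper name `range_query`, the numeric bounds, the dates and the title term are all illustrative assumptions; the YYYY/MM/DD date format is the usual Entrez convention rather than something stated on these slides.

```python
def range_query(low, high, field: str) -> str:
    """Build 'low:high[FIELD]' using the Entrez range operator ':'."""
    return f"{low}:{high}[{field}]"


# Illustrative bounds: sequences of 500 to 1500 residues published in 2005-2007.
length_filter = range_query(500, 1500, "SLEN")
date_filter = range_query("2005/01/01", "2007/12/31", "PDAT")

query = f"germin[TITL] AND {length_filter} AND {date_filter}"
print(query)
```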

Display Format: Summary
  Description: Default display; hotlinked Accession number and brief description
  Databases Available: Nucleotide, Protein, CoreNucleotide, EST, GSS, PopSet, Genome, Genome Project

Display Format: Brief
  Description: Hotlinked Accession number and abbreviated description; hotlinked project number in the case of a genome project
  Databases Available: Nucleotide, Protein, CoreNucleotide, EST, GSS, PopSet, Genome, Genome Project

Display Format: GenBank
  Description: Full report format
  Databases Available: Nucleotide, Protein, CoreNucleotide, EST, GSS, Genome

Display Format: GenPept
  Description: Full report format
  Databases Available: Protein

Display Format: GenBank (full)
  Description: Complete GenBank record with all features and all sequence data. This format is useful for very large GenBank records
  Databases Available: Nucleotide, Protein, CoreNucleotide, EST, GSS, Genome

Display Format: GenPept (full)
  Description: Complete GenPept record with all protein features and all sequence data. This format is useful for very large GenBank records
  Databases Available: Protein

Display Format: INSDSeq XML
  Description: XML DTD for sequence records
  Databases Available: Nucleotide, Protein

Display Format: GI list
  Description: List of GenInfo (GI) identifiers
  Databases Available: Nucleotide, Protein, CoreNucleotide, EST, GSS

Display Format: ASN.1
  Description: Abstract Syntax Notation One, used for data storage and retrieval and to help achieve interoperability among platforms
  Databases Available: Nucleotide, Protein, CoreNucleotide, EST, GSS, PopSet, Genome

Display Format: EST
  Description: Native display format for Expressed Sequence Tag records
  Databases Available: EST

Display Format: Graphics or Graph
  Description: The graphical view of the sequence, accessible by selecting the hotlinked Accession numbers
  Databases Available: Nucleotide, Protein and Genome

Display Format: TinySeq XML
  Description: Simplified XML for parsing
  Databases Available: Nucleotide, Protein, CoreNucleotide, EST, GSS, Genome

Display Format: GSS
  Description: Native display format for Genome Survey Sequence records
  Databases Available: GSS

Display Format: Overview
  Description: Tabular layout of data including links to BLAST results, CDD, the FTP site and general information for a genome in Genomes; for the Genome Project database it is a complete display of links to projects in the database and serves as a portal to all projects in the database for the organism-specific genome
  Databases Available: Genome, Genome Project

Display Format: PopSet summary
  Description: The set of Accession numbers comprising the PopSet, accessible by selecting the hotlinked PopSet Accession numbers
  Databases Available: PopSet

Display Format: XML
  Description: Script-parseable format
  Databases Available: Nucleotide, Protein, Genome

Display Format: UI List
  Description: List of database IDs
  Databases Available: PopSet
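
The same display formats can also be requested programmatically through E-Utilities. The sketch below only builds an EFetch URL and does not contact NCBI. The mapping from display-format names to `db` and `rettype` values (`gb`, `gp`) is an assumption based on the E-Utilities documentation, not on these slides, and `PLACEHOLDER_ID` stands in for a real accession or GI number.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Assumed mapping from display-format name to (database, rettype).
FORMATS = {
    "GenBank": ("nuccore", "gb"),
    "GenPept": ("protein", "gp"),
}


def efetch_url(record_id: str, display_format: str) -> str:
    """Build an EFetch URL that asks for one record as plain text."""
    db, rettype = FORMATS[display_format]
    params = {"db": db, "id": record_id, "rettype": rettype, "retmode": "text"}
    return f"{EFETCH}?{urlencode(params)}"


# PLACEHOLDER_ID is not a real identifier; substitute an accession of interest.
url = efetch_url("PLACEHOLDER_ID", "GenBank")
print(url)
```

Keeping the format mapping in one dictionary means further display formats can be added without touching the URL-building code.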

Text mining

Caveat emptor