Robert Stevens Ebi


  • 8/8/2019 Robert Stevens Ebi

    1/25

    http://img.cs.man.ac.uk/stevens 1

    Building and Using Ontologies

    Robert Stevens

    Department of Computer Science

    University of Manchester

    Manchester UK


    Introduction

    The nature of bioinformatics resources

    What is knowledge?

    What is an ontology?

    What are the uses of ontologies?

    Components of an ontology

    Building an ontology (in brief)


    The Nature of Bioinformatics Resources

    Over 500 databanks and analysis tools that work over resources

    Repositories of knowledge and data and generation of new knowledge

    Knowledge often held as free text; some use made of controlled vocabularies

    Enormous amount of semantic heterogeneity and poor query facilities

    Knowledge about services not always apparent


    What is Knowledge?

    Knowledge -- all information and an understanding to carry out tasks and to infer new information

    Information -- data equipped with meaning

    Data -- un-interpreted signals that reach our senses

    Example data: PATRICIAGRACEKENNEDYSAIDMINEISAPINT

    Read as English text: Patricia Grace Kennedy said mine is a pint (tagged as name, noun, verb)

    Knowledge: Pat Baker is a Manchester bioinformatician who drinks beer.

    Read as a protein sequence: CEKENN -- single letter amino acid codes (C -- cysteine, K -- lysine)

    Knowledge: Protein that acts as a tyrosine kinase in the liver of primates.
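
The step from data to information on this slide can be illustrated with a minimal Python sketch: the same uninterpreted string gains meaning once a coding scheme is applied to it. The table holds only the standard single-letter amino acid codes needed for the fragment CEKENN, and the function name `interpret` is an assumption made for illustration.

```python
# Standard single-letter amino acid codes (only those needed for the example).
AMINO_ACIDS = {
    "C": "cysteine",
    "E": "glutamic acid",
    "K": "lysine",
    "N": "asparagine",
}


def interpret(fragment):
    """Read an uninterpreted string as a sequence of amino acid residues."""
    return [AMINO_ACIDS[letter] for letter in fragment]


data = "PATRICIAGRACEKENNEDYSAIDMINEISAPINT"
fragment = "CEKENN"
assert fragment in data  # the fragment is part of the raw data string

print(interpret(fragment))
```

Turning the residue list into knowledge (for example, that the protein acts as a tyrosine kinase) needs further context that no lookup table supplies.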


    Capturing Knowledge

    Capturing knowledge for both humans and computer applications

    A set of vocabulary definitions that capture a community's knowledge of a domain

    `An ontology may take a variety of forms, but necessarily it will include a vocabulary of terms, and some specification of their meaning. This includes definitions and an indication of how concepts are inter-related which collectively impose a structure on the domain and constrain the possible interpretations of terms.'


    What Does an Ontology Do?

    Captures knowledge

    Creates a shared understanding between humans and for computers

    Makes knowledge machine processable

    Makes meaning explicit by definition and context


    What is an Ontology?

    [Figure: a spectrum of ontology kinds, from lightweight to formal: Catalog/ID; Terms/glossary; Thesauri (`narrower term' relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value restrictions; Disjointness, inverse, part of; General logical constraints]


    Roles of Ontologies in Bioinformatics

    We can divide ontology use into three types:

    Domain-oriented, which are either domain specific (e.g. E. coli) or domain generalisations (e.g. gene function or ribosomes);

    Task-oriented, which are either task specific (e.g. annotation analysis) or task generalisations (e.g. problem solving);

    Generic, which capture common high level concepts, such as Physical, Abstract and Substance. Important in ontology management and language applications.


    Uses of Ontology

    Community reference -- neutral authoring.

    Either defining database schema or defining a common vocabulary for database annotation -- ontology as specification.

    Providing common access to information.

    Ontology-based search by forming queries over databases.

    Understanding database annotation and technical literature.

    Guiding and interpreting analyses and hypothesis generation
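
Ontology-based search can be sketched as query expansion: a query for a concept also retrieves records annotated with any more specific concept. The taxonomy, the record identifiers and the function names below are hypothetical and stand in for a real ontology and database.

```python
# Hypothetical is-a taxonomy: child concept -> parent concept.
IS_A = {
    "tyrosine kinase": "kinase",
    "kinase": "enzyme",
    "enzyme": "protein",
}

# Hypothetical database records, each annotated with one concept.
ANNOTATIONS = {
    "rec1": "tyrosine kinase",
    "rec2": "enzyme",
    "rec3": "protein",
}


def is_kind_of(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the taxonomy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def search(query):
    """Return the records annotated with the query concept or a kind of it."""
    return sorted(rec for rec, concept in ANNOTATIONS.items()
                  if is_kind_of(concept, query))


print(search("enzyme"))  # ['rec1', 'rec2']
```

A plain keyword match on "enzyme" would miss `rec1`; the taxonomy is what lets the query reach it.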


    Components of an Ontology

    Concepts: Class of individuals -- the concept Protein and the individual `human cytochrome C'

    Relationships between concepts

    `Is a kind of' relationship forms a taxonomy

    Other relationships give further structure -- `is a part of'

    Axioms -- Disjointness, covering, equivalence, ...
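
The components listed on this slide can be put together in a minimal Python sketch: concepts in a taxonomy, an individual, a part-of relationship, and one disjointness axiom that is checked before a new assertion is accepted. The `Ontology` class, its field names, the part-of pair and the disjointness axiom are assumptions made for illustration, not the content of any real ontology.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    is_a: dict = field(default_factory=dict)         # concept -> parent concept
    part_of: set = field(default_factory=set)        # (part, whole) pairs
    instance_of: dict = field(default_factory=dict)  # individual -> concept
    disjoint: set = field(default_factory=set)       # frozenset({a, b}) axioms

    def ancestors(self, concept):
        """The concept itself plus everything above it in the taxonomy."""
        found = []
        while concept is not None:
            found.append(concept)
            concept = self.is_a.get(concept)
        return found

    def violates_disjointness(self, individual, concept):
        """Would asserting `individual` as a `concept` break an axiom?"""
        current = self.ancestors(self.instance_of[individual])
        proposed = self.ancestors(concept)
        return any(frozenset((a, b)) in self.disjoint
                   for a in current for b in proposed)


onto = Ontology()
onto.is_a["Protein"] = "Macromolecule"
onto.is_a["Nucleic acid"] = "Macromolecule"
onto.instance_of["human cytochrome C"] = "Protein"
onto.part_of.add(("Amino acid residue", "Protein"))         # illustrative
onto.disjoint.add(frozenset(("Protein", "Nucleic acid")))   # illustrative axiom

print(onto.violates_disjointness("human cytochrome C", "Nucleic acid"))   # True
print(onto.violates_disjointness("human cytochrome C", "Macromolecule"))  # False
```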


    Knowledge Representation

    Ontologies are best delivered in some computable representation

    Variety of choices with different:

    Expressiveness

    The range of constructs that can be used to formally, flexibly, explicitly and accurately describe the ontology

    Ease of use

    Computational complexity

    Is the language computable in real time?

    Rigour -- Satisfiability and consistency of the representation

    Systematic enforcement mechanisms

    Unambiguous, clear and well defined semantics


    Languages

    Vocabularies using natural language

    Hand crafted, flexible but difficult to evolve, maintain and keep consistent, with weak semantics

    Gene Ontology

    Object-based KR: frames

    Extensively used, good structuring, intuitive. Semantics defined by OKBC standard

    EcoCyc (uses Ocelot) and RiboWeb (uses Ontolingua)

    Logic-based: Description Logics

    Very expressive, model is a set of theories, well defined semantics

    Automatically derived classification taxonomies

    Concepts are defined and primitive


    Building Ontologies

    No field of Ontological Engineering equivalent to Knowledge or Software Engineering;

    No standard methodologies for building ontologies;

    Such a methodology would include:

    a set of stages that occur when building ontologies;

    guidelines and principles to assist in the different stages;

    an ontology life-cycle which indicates the relationships among stages.


    The Development Lifecycle

    Two kinds of complementary methodologies emerged:

    Stage-based, e.g. TOVE [Uschold96]

    Iterative evolving prototypes, e.g. MethOntology [Gomez Perez94].

    Most have TWO stages:

    1. Informal stage: ontology is sketched out using either natural language descriptions or some diagram technique

    2. Formal stage: ontology is encoded in a formal knowledge representation language, that is machine computable

    the informal representation helps the former

    the formal representation helps the latter.


    A Provisional Methodology

    A skeletal methodology and life-cycle for building ontologies;

    Inspired by the software engineering V-process model;

    The overall process moves through a life-cycle.

    The left side charts the processes in building an ontology

    The right side charts the guidelines, principles and evaluation used to quality assure the ontology


    The V-model Methodology

    [Figure: V-model of ontology building]

    Processes: Identify purpose and scope; Knowledge acquisition; Conceptualisation; Integrating existing ontologies; Encoding; Representation

    Evaluation: coverage, verification, granularity

    Conceptualisation principles: commitment, conciseness, clarity, extensibility, coherency

    Encoding/Representation principles: encoding bias, consistency, house styles and standards, reasoning system exploitation

    Other labels: Ontology in Use; User Model; Conceptualisation Model; Implementation Model


    The ontology building life-cycle

    [Figure: life-cycle diagram. Labels: Identify purpose and scope; Knowledge acquisition; Evaluation; Language and representation; Available development tools; Conceptualisation; Integrating existing ontologies; Encoding; Building]


    Starting Concept List

    Chemicals -- atom, ion, molecule, compound, element;

    Molecular-compound, ionic-compound, ionic-molecular-compound, ...;

    Ionic-macromolecular-compound and ionic-small-macromolecular-compound;

    Protein, peptide, polyprotein, enzyme, holoprotein, apoprotein, ...

    Nucleic acid -- DNA, RNA, tRNA, mRNA, snRNA, ...


    Conceptualisation Sketch

    [Figure: sketch of the Chemical hierarchy. Node labels: Chemical; Atom; Element; Compound; Molecule; Ion; Metal; Non-Metal; Metalloid; Molecular Compound; Molecular Element; Ionic Compound; Ionic Molecule; Ionic Molecular Compound]


    Molecule Conceptualisation Sketch

    [Figure: sketch of the Molecule hierarchy. Node labels: Macromolecule; Small Molecule; Ionic Macromolecular Compound; Nucleic Acid; Protein; Polysaccharide; DNA; RNA; Enzyme; Peptide; Starch; Glycogen; mRNA; tRNA; rRNA; snRNA]


    Initial Encoding

    class-def chemical
      subclass-of substance

    class-def molecule
      subclass-of chemical

    class-def compound
      subclass-of chemical

    class-def molecular-compound
      subclass-of molecule and compound
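
As a minimal sketch, the class-def statements of this initial encoding can be transcribed into a Python mapping and the inherited superclasses computed by walking it. The names `SUBCLASS_OF` and `superclasses` are assumptions made for illustration; the slide's own notation is not Python.

```python
# Direct superclasses, transcribed from the class-def statements.
SUBCLASS_OF = {
    "chemical": ["substance"],
    "molecule": ["chemical"],
    "compound": ["chemical"],
    "molecular-compound": ["molecule", "compound"],
}


def superclasses(name):
    """All direct and inherited superclasses of `name`."""
    seen = set()
    stack = list(SUBCLASS_OF.get(name, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(SUBCLASS_OF.get(parent, []))
    return seen


print(sorted(superclasses("molecular-compound")))
# ['chemical', 'compound', 'molecule', 'substance']
```

Because molecular-compound has two parents, the structure is a directed acyclic graph, not a tree, which is why the traversal tracks the classes it has already seen.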


    Molecules Revisited

    [Figure: revised sketch of the Molecule hierarchy. Node labels: Macromolecule; Small Molecule; Ionic Macromolecular Compound; Non-Ionic Macromolecular Compound; Nucleic Acid; Protein; Polysaccharide; DNA; RNA; Enzyme; Peptide; Starch; Glycogen; mRNA; tRNA; rRNA; snRNA]


    More Encoding

    class-def chemical
      subclass-of substance

    class-def defined molecule
      subclass-of chemical
      slot-constraint contains-bond min-cardinality 1 has-value covalent-bond

    class-def defined compound
      subclass-of chemical
      slot-constraint has-atom-types greater-than 1

    class-def defined molecular-compound
      subclass-of molecule and compound
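
The effect of marking a class as defined can be sketched in Python: its conditions become both necessary and sufficient, so membership can be inferred from properties instead of being asserted by hand. This sketch only tests individual chemicals against the conditions; a description logic reasoner would go further and classify the class definitions themselves. The `Chemical` record, its fields and the bond and atom-type counts in the examples are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chemical:
    name: str
    covalent_bonds: int  # number of covalent bonds it contains
    atom_types: int      # number of distinct atom types (elements)


# Each defined class is a membership test that is necessary and sufficient.
DEFINED = {
    "molecule": lambda c: c.covalent_bonds >= 1,  # contains-bond min-cardinality 1
    "compound": lambda c: c.atom_types > 1,       # has-atom-types greater-than 1
}
DEFINED["molecular-compound"] = (
    lambda c: DEFINED["molecule"](c) and DEFINED["compound"](c)
)


def classify(chemical):
    """Names of every defined class whose definition the chemical satisfies."""
    return sorted(name for name, test in DEFINED.items() if test(chemical))


water = Chemical("water", covalent_bonds=2, atom_types=2)
dioxygen = Chemical("dioxygen", covalent_bonds=1, atom_types=1)

print(classify(water))     # ['compound', 'molecular-compound', 'molecule']
print(classify(dioxygen))  # ['molecule']
```

Nothing states that water is a molecular-compound; that membership follows from the two definitions it satisfies.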


    Expansion

    Sketch and encode in cycles

    Build a taxonomy of a small portion

    Then build links to other portions

    Add more detail

    Document sources, author, date and argumentation.


    Summary

    An ontology captures knowledge for a shared understanding

    The important question is not whether an artefact is an ontology, but whether it does any good

    Making our understanding of a domain explicit, consistent and processable

    Bioinformatics resources are knowledge resources; this knowledge needs to be both human and machine understandable