Approaches for the Integration of Visual and Computational Analysis of Biomedical Data

HARVARD MEDICAL SCHOOL DEPARTMENT OF BIOMEDICAL INFORMATICS

NILS GEHLENBORG

@nils_gehlenborg

http://gehlenborglab.org


FRITZ LEKSCHAS HARVARD MEDICAL SCHOOL


BIG PILES OF DATA …


Data Repositories

general specialized

ArrayExpress, GEO

MetaboLights, PRIDE

dbGaP, …

ENCODE, Roadmap Epigenomics


… OFFER OPPORTUNITIES …


SINGLE OR FEW DATA SETS

Test hypotheses without generating new data.

Use published data as supporting evidence for findings based on your own data sets.

MANY DATA SETS

Conduct meta-analyses, e.g. to characterize expression patterns in human tissues or to link diseases.

M. Lukk et al., Nature Biotechnology 28(4):322–324 (2010)

S. Suthram et al., PLoS Computational Biology 6(2) (2010)


SINGLE OR FEW DATA SETS

Test hypotheses without generating new data.

Use published data as supporting evidence for findings based on your own data sets.

MANY DATA SETS

Conduct meta-analyses, e.g. to characterize expression patterns in human tissues or to link diseases.

COMMON BEHAVIOR OF RESEARCH PARASITES!

N. Gehlenborg et al., manuscript in preparation

DATA REPOSITORY

VISUALIZATION TOOLS

ANALYSIS PIPELINES


ANALYSIS PIPELINES

GALAXY: Toolshed, Workflow Editor, Tools, REST API, Workflow Inputs, Workflow Outputs


http://www.refinery-platform.org


… BUT NOT SO FAST!

[Figure: Text-Based Search. Diagram labels: Data Sets, Metadata, Data Files, Free Text, Keywords (K, L, M), Annotation, Mapping, Ontologies, terms (X, Y, Z), Root, Terminal, subclass of.]


[Figure: Text-Based Search and Semantic Visual Exploration (SATORI). Diagram labels: Data Sets, Metadata, Data Files, Free Text, Keywords (K, L, M), Annotation, Mapping, Ontologies, terms (X, Y, Z), Root, Terminal, subclass of.]


SATORI: A System for Ontology-Guided Visual Exploration of Biomedical Data Repositories

http://satori.refinery-platform.org

Data set

Repository

Collection of interest

Data Analyst, Group Leader, Data Curator


Need 1: Find data sets that match certain experimental characteristics.

Need 2: Find data sets that are similar (or dissimilar) to given data sets.

Need 3: Get an overview of the distribution of the experimental characteristics across a collection of data sets.

Need 4: Get an overview of the annotation term hierarchy and term usage.
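
To make the first three needs concrete, here is a minimal sketch that treats each data set as a set of annotation terms. The toy repository, the function names, and the choice of Jaccard similarity are all assumptions made for illustration; the slides do not say how matching or similarity is computed. Need 4 additionally requires the term hierarchy and is not covered by this sketch.

```python
from collections import Counter

# Hypothetical toy repository: data set id -> set of annotation terms.
DATA_SETS = {
    "ds1": {"liver", "human", "RNA-seq"},
    "ds2": {"liver", "mouse", "ChIP-seq"},
    "ds3": {"brain", "human", "RNA-seq"},
}


def find_matching(data_sets, required_terms):
    """Need 1: ids of data sets annotated with all required terms."""
    required = set(required_terms)
    return sorted(ds for ds, terms in data_sets.items() if required <= terms)


def jaccard(a, b):
    """Share of terms two data sets have in common (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_similar(data_sets, query_id):
    """Need 2: all other data sets, ordered from most to least similar."""
    query = data_sets[query_id]
    scored = [
        (ds, jaccard(query, terms))
        for ds, terms in data_sets.items()
        if ds != query_id
    ]
    # Sort by descending similarity, then by id for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def term_distribution(data_sets):
    """Need 3: number of data sets annotated with each term."""
    return Counter(term for terms in data_sets.values() for term in terms)


if __name__ == "__main__":
    print(find_matching(DATA_SETS, ["human", "RNA-seq"]))
    print(rank_similar(DATA_SETS, "ds1"))
    print(term_distribution(DATA_SETS).most_common())
```

Reading the ranking from the bottom instead of the top gives the dissimilar data sets mentioned in Need 2.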


Peter Pirolli and Stu Card


[Figure: Scenarios 1 to 3, relating data sets to the annotation terms A, B, and C. Panel labels: Data sets, Annotations, Term, Tree, Tree map, List graph.]


The Art Institute of Chicago


HARVARD MEDICAL SCHOOL

JOHANNES KEPLER UNIVERSITY LINZ Stefan Luger, Holger Stitz, Marc Streit

Web http://satori.refinery-platform.org · http://refinery-platform.org

Acknowledgements

Peter J Park & all members of the Computational Genomics Lab

Fritz Lekschas, Jennifer K Marx, Scott Ouellette, Anton Xue, Psalm Haseley

HARVARD SCHOOL OF PUBLIC HEALTH Ilya Sytchev, Shannan Ho Sui

UNIVERSITY OF SHEFFIELD David R Jones, Winston Hide

Funding NIH/NHGRI R00 HG007583, Harvard Stem Cell Institute


We are hiring postdocs & developers!

HARVARD MEDICAL SCHOOL DEPARTMENT OF BIOMEDICAL INFORMATICS

See http://gehlenborglab.org or http://dbmi.med.harvard.edu for details.

Data visualization, analysis, and management for:
• genomic structural variants
• dynamics of the 3D genome
• cancer subtypes in patient cohorts
• exploration tools for data repositories
• provenance graphs

[Figure: annotation term hierarchy with per-term numbers labeled Term size and Cumulative size. Legend: Term, Terminal term, To be deleted, To be duplicated. Panels: 1. Global, 2. Tree Map, 3. Node-Link Diagram.]
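
The labels Term size and Cumulative size are not defined on the slide. The sketch below assumes one plausible reading: a term's size is the number of data sets annotated directly with it, and its cumulative size is the number of distinct data sets annotated with the term or any of its subclasses. The hierarchy, the annotations, and the function names are hypothetical.

```python
# Hypothetical subclass-of hierarchy: parent term -> child terms.
# Term "C" has two parents, so the hierarchy is a DAG rather than a tree.
CHILDREN = {
    "Root": ["A", "B"],
    "A": ["C"],
    "B": ["C", "D"],
    "C": [],
    "D": [],
}

# Hypothetical direct annotations: term -> ids of data sets annotated with it.
ANNOTATIONS = {
    "A": {"ds1"},
    "C": {"ds2", "ds3"},
    "D": {"ds3", "ds4"},
}


def covered(term, children, annotations, cache):
    """Data sets annotated with `term` or with any term below it."""
    if term not in cache:
        result = set(annotations.get(term, ()))
        for child in children.get(term, ()):
            result |= covered(child, children, annotations, cache)
        cache[term] = result
    return cache[term]


def term_sizes(children, annotations):
    """Map each term to (term size, cumulative size) under the assumed definitions."""
    cache = {}
    return {
        term: (
            len(annotations.get(term, ())),
            len(covered(term, children, annotations, cache)),
        )
        for term in children
    }


if __name__ == "__main__":
    for term, (size, cumulative) in term_sizes(CHILDREN, ANNOTATIONS).items():
        print(term, size, cumulative)
```

Collecting data set ids in sets, rather than summing child counts, keeps a data set from being counted twice when it is reachable through more than one path in the hierarchy.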