- Visual Analytics - The human back in the loop Jan Aerts Biodata Analysis and Visualization Stadius Group, ESAT Leuven University, Belgium [email protected] @jandot http://orcid.org/0000-0002-6416-2717

Visual Analytics talk at ISMB2013

Page 1: Visual Analytics talk at ISMB2013

- Visual Analytics - The human back in the loop

Jan Aerts
Biodata Analysis and Visualization
Stadius Group, ESAT
Leuven University, Belgium
[email protected]
@jandot
http://orcid.org/0000-0002-6416-2717

Page 2: Visual Analytics talk at ISMB2013

hypothesis-driven -> data-driven

Scientific Research Paradigms (Jim Gray, Microsoft)

I have a hypothesis -> need to generate data to (dis)prove it.

I have data -> need to find hypotheses that I can test.

• 1st (1,000s of years ago): empirical

• 2nd (100s of years ago): theoretical

• 3rd (last few decades): computational

• 4th (today): data exploration

Page 3: Visual Analytics talk at ISMB2013

What does this mean?

• immense re-use of existing datasets

• much of initial analysis is exploratory in nature

• biologically interesting signals may be too poorly understood to be analyzed in an automated fashion

• visualization is very effective in facilitating human reasoning about complex data

• automated algorithms often act as black boxes => biologists must have blind faith in the bioinformatician (and the bioinformatician in his/her own skills)

Page 4: Visual Analytics talk at ISMB2013

What is visualization?

T. Munzner

Page 5: Visual Analytics talk at ISMB2013

Data visualization framework

Page 6: Visual Analytics talk at ISMB2013

Data visualization framework

interactivity

Page 7: Visual Analytics talk at ISMB2013

Data visualization framework

Page 8: Visual Analytics talk at ISMB2013

Data visualization framework

visual analytics infographics

Page 9: Visual Analytics talk at ISMB2013
Page 10: Visual Analytics talk at ISMB2013
Page 11: Visual Analytics talk at ISMB2013

“visual analytics”

Page 12: Visual Analytics talk at ISMB2013

Types of interaction (Yi et al., IEEE Transactions on Visualization and Computer Graphics, 2007):

• select -> mark something as interesting

• explore -> show me something else

• reconfigure -> show me a different arrangement

• encode -> show me a different representation

• abstract/elaborate -> show me less/more detail

• filter -> show me something conditionally

• connect -> show me connected items
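A minimal sketch of how a few of these interaction types map onto plain data operations. The toy records, field names and function names are hypothetical and chosen only for illustration; a real tool would tie these operations to mouse and keyboard events on a rendered view.

```python
from collections import defaultdict

# Hypothetical toy records; the field names are illustrative only.
ROWS = [
    {"id": "g1", "pathway": "p53", "expr": 2.4},
    {"id": "g2", "pathway": "p53", "expr": 0.3},
    {"id": "g3", "pathway": "wnt", "expr": 1.7},
    {"id": "g4", "pathway": "wnt", "expr": 3.1},
]


def select(rows, ids):
    """select: mark something as interesting (adds a 'selected' flag)."""
    return [dict(r, selected=r["id"] in ids) for r in rows]


def filter_rows(rows, predicate):
    """filter: show me something conditionally."""
    return [r for r in rows if predicate(r)]


def reconfigure(rows, key):
    """reconfigure: show me a different arrangement (here: a sort order)."""
    return sorted(rows, key=lambda r: r[key])


def abstract(rows, group_key, value_key):
    """abstract: show me less detail (mean of value_key per group)."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_key]].append(r[value_key])
    return {g: sum(v) / len(v) for g, v in groups.items()}


def connect(rows, row_id, key):
    """connect: show me the items that share an attribute with one item."""
    target = next(r for r in rows if r["id"] == row_id)
    return [r["id"] for r in rows if r[key] == target[key] and r["id"] != row_id]


if __name__ == "__main__":
    high = filter_rows(ROWS, lambda r: r["expr"] > 1.0)
    print([r["id"] for r in reconfigure(high, "expr")])
    print(abstract(ROWS, "pathway", "expr"))
    print(connect(ROWS, "g1", "pathway"))
```

Each function returns a new list or dict and leaves the input untouched, so interactions can be chained or undone without recomputing from the raw data.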

Page 13: Visual Analytics talk at ISMB2013
Page 14: Visual Analytics talk at ISMB2013

Visualization for biological hypothesis generation

• example: eQTL data (IEEE BioVis visualization challenge 2011)

• 500 patients (affected + non-affected)

• 7500 SNPs; gene expression data for 15 genes

• PLINK one-locus/two-locus
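To make the shape of such a dataset concrete, here is a rough sketch of a one-locus scan on synthetic data. It is not PLINK and not the challenge data: the simulation, the 0/1/2 genotype coding, the use of Pearson correlation as the association score, and all function names are assumptions made for illustration, and the dimensions are kept smaller than those on the slide.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def simulate(n_samples=200, n_snps=50, causal_snp=7, effect=1.0, seed=0):
    """Synthetic genotypes (0/1/2 per sample) and one expression trait.

    Only `causal_snp` influences the expression values; the rest is noise.
    """
    rng = random.Random(seed)
    genotypes = [
        [rng.choice((0, 1, 2)) for _ in range(n_samples)] for _ in range(n_snps)
    ]
    expression = [
        effect * genotypes[causal_snp][i] + rng.gauss(0, 1) for i in range(n_samples)
    ]
    return genotypes, expression


def rank_snps(genotypes, expression):
    """One-locus scan: score every SNP on its own, strongest association first."""
    scores = [(abs(pearson(g, expression)), idx) for idx, g in enumerate(genotypes)]
    return sorted(scores, reverse=True)


if __name__ == "__main__":
    genotypes, expression = simulate()
    for score, idx in rank_snps(genotypes, expression)[:3]:
        print(f"SNP {idx}: |r| = {score:.2f}")
```

A two-locus scan would score pairs of SNPs instead of single SNPs, which grows quadratically with the number of SNPs and is one reason a visual overview of the results becomes useful.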

Page 15: Visual Analytics talk at ISMB2013

Aracari

Ryo Sakai

Bartlett C et al. BMC Bioinformatics (2012)

Page 16: Visual Analytics talk at ISMB2013

Reveal

Jäger G et al. Bioinformatics (2012)

Page 17: Visual Analytics talk at ISMB2013

HiTSee

Bertini E et al. IEEE Symposium on Biological Data Visualization (2011)

Page 18: Visual Analytics talk at ISMB2013
Page 19: Visual Analytics talk at ISMB2013

Visualization for algorithm development

when do I know that my algorithm is “correct”? -> peek into the black box

[diagram: input, filter 1, filter 2, output A, filter 3, output B, output C]
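One way to "peek into the black box" is to keep every intermediate result of a filter pipeline so it can be inspected or plotted afterwards. The sketch below assumes a simple linear chain of filters; the branching layout of the slide's diagram is not reproduced, and the filter functions and their names are hypothetical.

```python
def run_pipeline(data, stages):
    """Run named stages in order and keep a snapshot after every stage."""
    trace = [("input", list(data))]
    current = list(data)
    for name, fn in stages:
        current = fn(current)
        trace.append((name, list(current)))
    return current, trace


# Hypothetical filters, standing in for "filter 1" and "filter 2".
def drop_missing(values):
    """Remove entries that carry no measurement."""
    return [v for v in values if v is not None]


def drop_outliers(values, limit=100):
    """Remove entries whose absolute value exceeds `limit`."""
    return [v for v in values if abs(v) <= limit]


if __name__ == "__main__":
    raw = [3, None, 250, 7, -4, None, 12]
    stages = [
        ("filter 1: drop missing", drop_missing),
        ("filter 2: drop outliers", drop_outliers),
    ]
    result, trace = run_pipeline(raw, stages)
    for name, snapshot in trace:
        print(f"{name:<26} n={len(snapshot):>2}  {snapshot}")
```

The `trace` list is what a visualization would consume: plotting how many items survive each stage, and which ones are dropped, often reveals a wrong threshold faster than looking at the final output alone.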

Page 20: Visual Analytics talk at ISMB2013

[figure: A, B, C]

Page 21: Visual Analytics talk at ISMB2013

[figure: A, B, C]

Page 22: Visual Analytics talk at ISMB2013

[figure: A, B, C]

Page 23: Visual Analytics talk at ISMB2013

Caleydo MatchMaker

Lex A et al. IEEE Transactions on Visualization and Computer Graphics (2010)

Page 24: Visual Analytics talk at ISMB2013

Meander

Pavlopoulos et al. Nucl Acids Res (2013)

Georgios Pavlopoulos

Page 25: Visual Analytics talk at ISMB2013

ParCoord

Boogaerts T et al. IEEE International Conference on Bioinformatics & Bioengineering (2012)

Thomas Boogaerts

Endeavour gene prioritization

Page 26: Visual Analytics talk at ISMB2013
Page 27: Visual Analytics talk at ISMB2013

Visualization for (live) interaction with analysis

• alternating between visual and automatic methods -> continuous refinement and verification of preliminary results

• misleading results: discovered at an early stage

• leverage user’s (biologist’s) insights

• no black box

Page 28: Visual Analytics talk at ISMB2013

Cytoscape

Smoot et al. Bioinformatics (2011)

Page 29: Visual Analytics talk at ISMB2013

Data filtering (visual parameter setting)

TrioVis

Ryo Sakai

Sakai R et al. Bioinformatics (2013)

Page 30: Visual Analytics talk at ISMB2013

User-guided analysis

Spark

Nielsen et al. Genome Research (2012)

clustering

chromatin modification

DNA methylation

RNA-Seq

data samples

regions of interest

Page 31: Visual Analytics talk at ISMB2013

BaobabView

van den Elzen S & van Wijk J. IEEE Conference on Visual Analytics Science and Technology (2011)

decision trees

Page 32: Visual Analytics talk at ISMB2013

Galaxy Trackster

Goecks J et al. Nature Biotechnology (2012)

Page 33: Visual Analytics talk at ISMB2013

Bret Victor - Ladder of abstraction

Page 34: Visual Analytics talk at ISMB2013

Many challenges remain

• scalability (data processing + perception), uncertainty, “interestingness”, interaction, evaluation

• infrastructure & architecture

• fast imprecise answers with progressive refinement

• incremental re-computation

• steering computation towards data regions of interest
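As a small illustration of "fast imprecise answers with progressive refinement" and "incremental re-computation", the sketch below estimates a mean chunk by chunk and reports an uncertainty after each chunk. It uses Welford's running update so that earlier values are never re-read. The function name, chunk size and simulated data are assumptions for illustration, and the early estimates are only representative if the values arrive in random order.

```python
import math
import random


def progressive_mean(values, chunk_size=1000):
    """Yield (n_seen, mean, std_error) after each chunk of values.

    The running mean and sum of squared deviations are updated in place
    (Welford's method), so each new chunk costs only its own size.
    """
    n, mean, m2 = 0, 0.0, 0.0
    for start in range(0, len(values), chunk_size):
        for x in values[start:start + chunk_size]:
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
        std_error = math.sqrt(m2 / (n - 1) / n) if n > 1 else float("inf")
        yield n, mean, std_error


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.gauss(10.0, 2.0) for _ in range(5000)]
    for n, mean, se in progressive_mean(data):
        print(f"after {n:>5} values: mean = {mean:.3f} +/- {se:.3f}")
```

Because the function is a generator, a user interface could draw the first estimate immediately, keep updating the display as later chunks arrive, and stop early once the uncertainty is small enough.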

Page 35: Visual Analytics talk at ISMB2013

Acknowledgments

• Bioinformatics Group at Stadius, Leuven University

• in particular: Ryo Sakai, Georgios Pavlopoulos

• visualization community for examples

• Jeremy for Trackster video