Information visualization: representation

06/03/14 pag. 1

Information visualization lecture 3

representation

Katrien Verbert Department of Computer Science

Faculty of Science Vrije Universiteit Brussel

katrien.verbert@vub.ac.be

Anscombe's quartet

Property                                    Value
Mean of x                                   9
Variance of x                               11
Mean of y                                   7.50
Variance of y                               4.122 or 4.127
Correlation between x and y                 0.816
Linear regression line for each data set    y = 3.00 + 0.500x
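The properties in the table can be checked numerically. The sketch below assumes the four data sets as published by Anscombe (1973), which are not reproduced on the slide, and computes each property from first principles. The names `QUARTET` and `summarise` are illustrative only.

```python
from math import sqrt

# Anscombe's four data sets (Anscombe, 1973); sets I to III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarise(x, y):
    """Return the summary properties listed in the table for one data set."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),          # sample variance
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),   # Pearson correlation
        "slope": slope,                  # least-squares regression line
        "intercept": mean_y - slope * mean_x,
    }


for name, (x, y) in QUARTET.items():
    print(name, {key: round(value, 3) for key, value in summarise(x, y).items()})
```

The four summaries agree to roughly two decimal places, so the differences between the sets only become visible when they are plotted.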

Ben Shneiderman: http://www.youtube.com/watch?v=og7bzN0DhpI

(watch 12:20 – 15:49)

Anscombe's quartet

Overview

•  Encoding of value
   –  Univariate data
   –  Bivariate data
   –  Trivariate data
   –  Hypervariate data
•  Encoding of relation
   –  Lines
   –  Maps and diagrams

Part of this car purchase interface identifies a relation

Relations

[Figure: Price (£k) bands 10 - 12, 12 - 14 and 16 - 18 shown against an MPG scale of 30, 35 and 40]

Interaction to identify a doctor highlights the hospital beds under his or her care, and vice versa: an example of brushing

     

Relations

Overview

•  Encoding of value
   –  Univariate data
   –  Bivariate data
   –  Trivariate data
   –  Hypervariate data
•  Encoding of relation
   –  Lines
   –  Maps and diagrams

A single number

The original aircraft altimeter, responsible for many accidents

Representation of the view of an altimeter

An altimeter representation easily assumed to be the same as shown on the previous slide

Change blindness

A modern aircraft altimeter

Source: Image by kind permission of Marcus Watson

Single number: second example

Each  dot  represents  the  price  of  a  car  

A collection of numbers

[Figure: box plot on a Price (£K) axis marked from 10 to 60]

Box plot

Box plot
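A box plot is drawn from five numbers: the minimum, the lower quartile, the median, the upper quartile and the maximum. As a minimal sketch, assuming a made-up list of car prices (the data behind the slide's figure is not available), they can be computed with Python's standard `statistics` module. The quartile convention used here (`method="inclusive"`) is one of several in common use.

```python
from statistics import median, quantiles

# Hypothetical car prices in £K (not taken from the lecture).
prices = [12, 15, 18, 22, 25, 31, 38, 44, 57]


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    # "inclusive" treats the data as the whole population, so quartiles
    # are interpolated between the observed values.
    q1, _, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median(ordered), q3, ordered[-1]


summary = five_number_summary(prices)
print(summary)
```

The box spans the two quartiles, the line inside it marks the median, and the whiskers extend towards the minimum and maximum.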

[Figure: histogram of Price (£K) in bins up to 50–60 (20–30, 30–40, 40–50 and 50–60 legible); vertical axis marked from 2 to 8]

histogram

[Figure: bargram of Price (£k) with bins 10 - 12, 12 - 14 and 16 - 18]

bargram

Nissan Ford Ferrari MG Cadillac

Bargram of categorical data

[Figure: histogram with one bar per weekday, Monday to Friday; vertical axis marked at £100k and £200k]

histogram of ordinal data

Overview

•  Encoding of value
   –  Univariate data
   –  Bivariate data
   –  Trivariate data
   –  Hypervariate data
•  Encoding of relation
   –  Lines
   –  Maps and diagrams

Anscombe's quartet

Scatterplot

Time series

   

Android Activations per day, measured on the first of each month

Stock data

Four views of a time-series query tool. (a) An overview of the entire data set; (b) a single time-box limits the display to items with prices between $70 and $250 during days 1 to 4; (c) an additional constraint selects items with prices between $70 and $95 during days 7 to 12; (d) yet another constraint concerns prices between $90 and $115 for days 15 to 18. Source: Courtesy of Harry Hochheiser

time series

Overview of the entire data set

time-box limits the display to items with prices between $70 and $250 during days 1 to 4

additional constraint selects items with prices between $70 and $95 during days 7 to 12

yet another constraint concerns prices between $90 and $115 for days 15 to 18
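The time-box constraints described on these slides amount to a conjunctive filter: an item stays on display only if its price remains inside every box's price range for all days the box covers. The sketch below assumes that reading, and uses invented price series. The names `series`, `satisfies` and `query` are hypothetical and not part of the tool shown.

```python
# Hypothetical daily prices in dollars, one list per item; index 0 is day 1.
series = {
    "item_a": [80, 90, 120, 200, 150, 100, 90, 85, 80, 75, 72, 71],
    "item_b": [60, 65, 90, 110, 130, 140, 150, 160, 170, 180, 190, 200],
    "item_c": [100, 240, 230, 75, 60, 50, 40, 30, 20, 10, 5, 1],
}


def satisfies(prices, timebox):
    """True if every price within the box's day range lies in its price range."""
    first_day, last_day, low, high = timebox
    window = prices[first_day - 1:last_day]
    return all(low <= price <= high for price in window)


def query(all_series, timeboxes):
    """Return the names of the items that pass through every time-box."""
    return [name for name, prices in all_series.items()
            if all(satisfies(prices, box) for box in timeboxes)]


# Each box is (first day, last day, lowest price, highest price).
box_b = (1, 4, 70, 250)   # the single time-box of view (b)
box_c = (7, 12, 70, 95)   # the additional constraint of view (c)

print(query(series, [box_b]))
print(query(series, [box_b, box_c]))
```

Adding a box can only shrink the result set, which matches the way each successive view on the slides shows fewer series.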

Student activity meter

Representation of the level of ozone concentration above Los Angeles over a period of ten years

Time series

[Figure: histograms of Price and Number of bedrooms in four panels, (a) to (d), with imposed limits marked]

Linked histogram

The price and number of bedrooms associated with a collection of houses are represented by separate histograms; a single house is represented once on each histogram.

Linked histogram

upper and lower limits placed on Price define a subset of houses which are coded red on both histograms

Linked histogram

Interpretation is enhanced by ‘ranging down’ the colour-coded houses, especially if exploration involves the dynamic alteration of limits

[Figure: car prices on a Price (£K) scale from 10 to 60; a zoomed view of the 30 to 40 range shows the makes of individual cars (Ford, Nissan, VW, Merc, Jag, SEAT)]

Semantic zoom reveals data about a second attribute

A representation of Australia and New Zealand on a conventional map

Qualitative understanding of data

A representation of Australia and New Zealand indicating that some attribute of New Zealand is ten times its value for Australia

Qualitative understanding of data

In the State of the World Atlas, magnification encoding is used to give a first impression of population densities. Note the reduced ‘size’ of Canada and Australia when compared with a conventional map. Source: Smith (1999)

Overview

•  Encoding of value
   –  Univariate data
   –  Bivariate data
   –  Trivariate data
   –  Hypervariate data
•  Encoding of relation
   –  Lines
   –  Maps and diagrams

[Figure: houses A, B, C and D plotted against Price, Time and Bedrooms axes]

Does house A cost more than C?

[Figure: houses A, B, C and D plotted against Price and Bedrooms]

Scatterplot matrix

Interaction can offer a solution: a projection of the data, allowing comparison of Price and Bedrooms values

Scatterplot matrix

The highlighting of houses in one plane is brushed into the remaining planes

Cognitive overload? Interaction solution

A representation of reported product failure, based on month of production (MOP) of the failed product, and total months in service (MIS) before the fault occurred. The radius of each circle indicates the number of faults reported for a given MOP and MIS

[Figure axes: Month of Production (MOP), July '97 to May '98; Months in service (MIS), 2 to 12]

Trivariate data

Circles indicate the extent of the effect of a component on some property of the circuit, and change in size as the frequency cycles up and down the range from bass to treble

Trivariate data

A representation of the population of major cities in England, Wales and Scotland. Circle area is proportional to population

Maps to represent trivariate data
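Because circle area, not radius, is proportional to population on this map, the radius has to grow with the square root of the value. A minimal sketch, assuming invented populations and an arbitrary scale factor (`area_per_unit` is a hypothetical parameter):

```python
from math import pi, sqrt

# Hypothetical populations (not the figures behind the lecture's map).
populations = {"City A": 250_000, "City B": 1_000_000, "City C": 4_000_000}


def radius_for(value, area_per_unit=1e-4):
    """Radius of a circle whose area equals value * area_per_unit."""
    return sqrt(value * area_per_unit / pi)


radii = {city: radius_for(population) for city, population in populations.items()}
for city, radius in radii.items():
    print(f"{city}: radius {radius:.2f}")
```

Each city here has four times the population of the previous one, yet only twice the radius. Scaling the radius directly would exaggerate the larger cities.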

Circles change in size as the decades are animated, so that sudden changes in population ‘pop out’

[Figure: timeline of decades, 1900 to 2000]

Also non-static representations of data

http://www.youtube.com/watch?v=hVimVzgtD6w

Overview

•  Encoding of value
   –  Univariate data
   –  Bivariate data
   –  Trivariate data
   –  Hypervariate data
•  Encoding of relation
   –  Lines
   –  Maps and diagrams

A simple scatterplot representing the price and number of bedrooms associated with two houses

[Figure: houses A and B plotted against Price and Number of bedrooms]

Simple scatterplot of bivariate data

An alternative representation to the scatterplot in which the two attribute scales are presented in parallel, thereby requiring two points to represent each house

[Figure: parallel scales for Price and Number of bedrooms]

To avoid ambiguity the pair of points representing a house are joined and labelled

[Figure: parallel scales for Price and Number of bedrooms, with the joined points labelled A and B]

A parallel coordinate plot for six objects, each characterised by seven attributes. The trade-off between A and B, and the correlation between B and C, are immediately apparent. The trade-off between B and E, and the correlation between C and G, are not

A B C D E F G

Parallel coordinates

A parallel coordinate plot representation of a collection of cars, in which a range of the attribute Year has been selected to cause all those cars manufactured during that period to be highlighted. Source: Harri Siirtola

Parallel coordinates

Student activity meter

[Figure: star plot with axes Sport, Literature, Mathematics, Physics, History, Geography, Art and Chemistry]

Star plot

Bob’s  performance   Tony’s  performance  

Star plot for comparison

A scatterplot enhanced by additional and selective encoding, allowing the selection of a film on the basis of type, duration, year of production and other attributes

The automatic display of additional detail following the selection of narrower limits on years of production and film length

A histogram representing the prices of a collection of houses. The contribution of one house is shown in yellow

Histogram

Limits on Price identify a subset of houses, coded green

Houses defined by the limits on Price are coded green in other attribute histograms

Linked histograms

Green coding applies only to houses which satisfy all attribute limits. Houses which fail one limit are coded black, so if a black house is positioned outside a limit it will turn green if the limit is extended to include it

Linked histograms

Even if no houses satisfy all attribute limits, black houses, which fail only one limit, provide guidance as to the effect of relaxing limits

Linked histograms
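The colour coding rule for linked histograms (green for houses that satisfy every attribute limit, black for houses that fail exactly one) can be sketched as a count of failed limits per house. The houses, attribute names and limits below are invented for illustration, and the label `"none"` for houses failing two or more limits is an assumption, since the slides do not say how those are drawn.

```python
# Hypothetical houses and limits; attribute names are illustrative only.
houses = [
    {"id": "h1", "price": 180, "bedrooms": 3, "garden": 40},
    {"id": "h2", "price": 260, "bedrooms": 3, "garden": 55},
    {"id": "h3", "price": 320, "bedrooms": 1, "garden": 10},
    {"id": "h4", "price": 150, "bedrooms": 5, "garden": 60},
]
limits = {"price": (100, 250), "bedrooms": (2, 4), "garden": (30, 80)}


def failed_limits(house, attribute_limits):
    """List the attributes whose value falls outside its (low, high) limits."""
    return [attribute for attribute, (low, high) in attribute_limits.items()
            if not low <= house[attribute] <= high]


def colour(house, attribute_limits):
    """Green: satisfies all limits. Black: fails exactly one. Otherwise uncoded."""
    failures = len(failed_limits(house, attribute_limits))
    if failures == 0:
        return "green"
    if failures == 1:
        return "black"
    return "none"


coding = {house["id"]: colour(house, limits) for house in houses}
print(coding)

# Extending the Price limit to include h2 turns it from black to green.
relaxed = {**limits, "price": (100, 270)}
print(colour(houses[1], relaxed))
```

Listing which limit a black house fails, as `failed_limits` does, is what lets an analyst see the effect of relaxing that one limit.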

An Attribute Explorer representation of three dimensions of communication data captured during an emergency services exercise, supporting interactive exploration by an analyst

Linked histograms

Linked histogram

Details  in  lecture  6:  case  studies  

          Male                        Female
          Adult         Child         Adult         Child
Class     No     Yes    No     Yes    No     Yes    No     Yes
1st       118    57     0      5      4      140    0      1
2nd       154    14     0      11     13     80     0      13
3rd       387    75     35     13     89     76     17     14
Crew      670    192    0      0      3      20     0      0

(Columns: Gender, then Age, then Survived No / Yes)

Details of the Titanic disaster

[Figure: four steps, (a) to (d), in building a mosaic plot of the 2201 people aboard, divided by class (First 325, Second 285, Third 706, Crew 885), gender (Male, Female), age (Adult, Child) and outcome (Survived, Died)] [Friendly, 2000]

Steps to create mosaic plot
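A mosaic plot divides a rectangle repeatedly, each time in proportion to the counts of one variable. The sketch below computes the proportions for the first two divisions from the Titanic counts, assuming the first split is by class and the second by gender. The order of splits is an assumption here, and `COUNTS` and `total` are illustrative names.

```python
# Counts from the Titanic table: (class, gender, age) -> (died, survived).
COUNTS = {
    ("1st", "Male", "Adult"): (118, 57),   ("1st", "Male", "Child"): (0, 5),
    ("1st", "Female", "Adult"): (4, 140),  ("1st", "Female", "Child"): (0, 1),
    ("2nd", "Male", "Adult"): (154, 14),   ("2nd", "Male", "Child"): (0, 11),
    ("2nd", "Female", "Adult"): (13, 80),  ("2nd", "Female", "Child"): (0, 13),
    ("3rd", "Male", "Adult"): (387, 75),   ("3rd", "Male", "Child"): (35, 13),
    ("3rd", "Female", "Adult"): (89, 76),  ("3rd", "Female", "Child"): (17, 14),
    ("Crew", "Male", "Adult"): (670, 192), ("Crew", "Male", "Child"): (0, 0),
    ("Crew", "Female", "Adult"): (3, 20),  ("Crew", "Female", "Child"): (0, 0),
}


def total(**fixed):
    """Sum died + survived over all cells matching the given cls/gender/age."""
    keys = ("cls", "gender", "age")
    return sum(died + survived
               for cell, (died, survived) in COUNTS.items()
               if all(fixed.get(key, value) == value
                      for key, value in zip(keys, cell)))


everyone = total()
for cls in ("1st", "2nd", "3rd", "Crew"):
    in_class = total(cls=cls)
    # First split: tile width is the class's share of everyone aboard.
    width = in_class / everyone
    # Second split: within that column, height is each gender's share.
    female = total(cls=cls, gender="Female") / in_class
    print(f"{cls:4s} width {width:.3f}  female {female:.3f}  male {1 - female:.3f}")
```

Further splits by age and by outcome follow the same pattern, with the direction of the cut alternating between horizontal and vertical.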

Mosaic plot

Friendly’s website

http://www.datavis.ca/gallery/

Chernoff Faces allow attribute values to be encoded in the features of cartoon faces (Chernoff 1973)

Icons

Michael  Porath  

Example

Some criticism

No evidence for pre-attentive nature [Morris et al. 1999]

Src: http://joshualedwell.typepad.com/usability_blog/files/final_vizualization.pdf

Multidimensional icons representing eight attributes of a dwelling

house, £400,000, garage, central heating, four bedrooms, good repair, large garden, Victoria 15 mins

flat, £300,000, no garage, central heating, two bedrooms, poor repair, small garden, Victoria 20 mins

houseboat, £200,000, no garage, no central heating, three bedrooms, good repair, no garden, Victoria 15 mins

Representations supportive of object visibility

Object visibility: each object is represented as a single and coherent visual entity

Infocanvas

Representations of multi-attribute objects supportive of attribute visibility

Attribute correlation

Object correlation

Overview

•  Encoding of value
   –  Univariate data
   –  Bivariate data
   –  Trivariate data
   –  Hypervariate data
•  Encoding of relation
   –  Lines
   –  Maps and diagrams

Relation

Relation (n): a logical or natural association between two or more things; relevance of one to another; connection.

John Smith

Mary Robinson

A simple symbol indicates the relationship of marriage

Social networks

John Stingy Bank

1930 Bentley

Lines indicate relationship

[Figure: arrows connecting X1, X2 and X3 with Y]

Arrows indicate unique unilateral functional relations

y=f(x)    

Colour indicates a relation

The incidence of warfare in early Anglo-Saxon England between 550 AD and 700 AD. Red indicates the aggressor, green the attacked

[Figure: rows for Picts, Northumbria, Mercia, West Saxon, South Saxon, Isle of Wight, Kent and Britons against a time axis in Years AD, 550 to 700]

Insight into even a short list of telephone calls (a) is enhanced by their node-link representation (b), especially if disconnected subsets can be identified (c)

(a)
Originator  Receiver
A           H
C           L
I           M
B           E
F           H
G           I
I           B
B           M
K           B
G           B
K           E
C           J
D           C

[Figure: (b) node-link diagram of the callers A to M; (c) the same diagram rearranged to separate its disconnected subsets]

Lines
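The disconnected subsets in the telephone call example can also be found mechanically. The sketch below pairs the Originator and Receiver columns of list (a) position by position and groups the callers with a depth-first search. The function name `connected_subsets` is illustrative only.

```python
from collections import defaultdict

# Originator and receiver of each call, read position by position from list (a).
originators = "ACIBFGIBKGKCD"
receivers = "HLMEHIBMBBEJC"
calls = list(zip(originators, receivers))


def connected_subsets(edges):
    """Group nodes into disconnected subsets using a depth-first search."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, subsets = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        subsets.append(sorted(group))
    return subsets


for subset in connected_subsets(calls):
    print(" ".join(subset))
```

The direction of each call is ignored here, since only the grouping is of interest. A node-link drawing would keep it as an arrow.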

A representation of mortgage activity: (a) lenders, properties (houses), buyers, etc. are represented by small radial segments of an annulus as shown in (b), and their relationships denoted by straight lines

Useful?

A threshold has been imposed to suppress the display of normal behaviour. As a result, unusual behaviour is revealed by the patterns formed by the lines

http://seekshreyas.com/beerviz/

http://visualization.geblogs.com/visualization/network/

Chord diagram

An ‘association’ style chart depicting the African bombings

 Source:  Courtesy  i2  Ltd.  

Part of a ‘timeline’ style chart depicting the Kennedy assassination

Sankey diagram

http://bost.ocks.org/mike/sankey/

Remember this one?

Flow map diagram

Verbeek, K., Buchin, K., & Speckmann, B. (2011). Flow map layout via spiral trees. IEEE Transactions on Visualization and Computer Graphics, 17(12), 2536-2544.

Migration from Colorado, migration from Norway and Latvia, whisky exports from Scotland.

Harry Beck’s original London Underground map. Source: © Transport for London

Most familiar use of lines?

Source:  ©  Transport  for  London  

The Underground map in use prior to the introduction of Harry Beck’s map

Differences?  Easier  to  use?  

Journey time?

http://www.london-tubemap.com/journey_times.php

http://www.tom-carden.co.uk/p5/tube_map_travel_times/applet/

The social choices of fourth grade students (after Moreno, 1934)

     

Social networks

(a)  Social  choices  among  department  store  employees  (b)  Social  choices  among  department  store  employees,  with  marital  status  encoded  (c)  Social  choices  among  department  store  employees,  with  age  range  encoded  (blue  <30,  30  <yellow  <40,  red  >40)  Source:  L.C.  Freeman  

Overview

•  Encoding of value
   –  Univariate data
   –  Bivariate data
   –  Trivariate data
   –  Hypervariate data
•  Encoding of relation
   –  Lines
   –  Maps and diagrams

Facilities offered by eight hotels

[Figure: table of Hotels (labelled A B C D E F G in the extracted text) against Swimming Pool, Golf Course and Restaurant]

Maps and diagrams

[Figure: Venn diagram of Swimming pool, Golf and Restaurant, with hotels A to G placed in its regions]

Venn diagram

Figure  3.83  

[Figure: Venn diagram of Swimming pool, Golf and Restaurant]

A Venn diagram representation of the attributes of 24 hotels

The development leading from a Venn diagram to an InfoCrystal. The InfoCrystal illustrated allows visual queries to be made concerning price, garden size and number of bedrooms. The asterisk (*) represents houses satisfying criteria on Price and garden size but not number of bedrooms

InfoCrystal

[Figure: InfoCrystal with corners Swimming Pool, Golf and Restaurant; region counts as extracted: 45, 8, 0, 2, 41]

An Infocrystal representation of the hotel data
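An InfoCrystal shows one count per exclusive region of the Venn diagram, that is, per exact combination of attributes. A minimal sketch of that counting step, assuming an invented set of hotels (the lecture's own hotel table is not recoverable here) and a hypothetical helper `region_counts`:

```python
from collections import Counter

# Hypothetical hotels and their facilities (not the lecture's data).
hotels = {
    "A": {"pool", "golf"},
    "B": {"pool"},
    "C": {"pool", "golf", "restaurant"},
    "D": {"restaurant"},
    "E": {"golf", "restaurant"},
    "F": {"pool", "golf"},
    "G": set(),
}


def region_counts(items):
    """Count how many items fall in each exclusive combination of attributes."""
    return Counter(frozenset(facilities) for facilities in items.values())


counts = region_counts(hotels)
for region, n in sorted(counts.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    label = " + ".join(sorted(region)) or "none"
    print(f"{label}: {n}")
```

Each hotel lands in exactly one region, so the counts always add up to the number of hotels. A region with no hotels simply has a count of zero.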

Cluster map

A cluster map representation of 24 hotels, each described by four attributes. Source: Courtesy Christiaan Fluit, Aduna

Cluster map

TalkExplorer

Details  in  lecture  6:  case  studies  

[Figure: a tree annotated with the designated root node, a node A, the parent of A, a sibling of A, a child of A, and the leaf nodes]

Tree representations

Tree visualizations

http://www.informatik.uni-koeln.de/ls_juenger/research/vbctool/

Problems?  

(a) A tree (b) The corresponding cone tree

Alternative: cone trees

Cam tree: horizontal orientation of cone tree

[Figure: the Tree, the formation of the Tree Map, and the Tree Map]

Construction of a Tree Map

The ‘slice-and-dice’ construction of a Tree Map to obtain leaf nodes represented by rectangles more suited to the inclusion of text and images

[Figure: a Tree and the corresponding Tree Map]

Slice and dice construction
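The slice-and-dice construction can be sketched as a short recursion: a node's rectangle is divided among its children in proportion to their sizes, and the direction of the cut alternates from one level to the next. The tree, its sizes and the function names below are invented for illustration and do not reproduce the slide's example.

```python
# A hypothetical tree: a leaf is (name, size); an inner node is (name, [children]).
tree = ("root", [
    ("a", 6),
    ("b", [("b1", 2), ("b2", 1)]),
    ("c", 3),
])


def size(node):
    """Size of a node: its own value for a leaf, else the sum over its children."""
    _, payload = node
    if isinstance(payload, list):
        return sum(size(child) for child in payload)
    return payload


def slice_and_dice(node, x, y, w, h, vertical_cuts=True, out=None):
    """Assign each leaf a rectangle (x, y, width, height) inside the given one."""
    out = {} if out is None else out
    name, payload = node
    if not isinstance(payload, list):
        out[name] = (x, y, w, h)
        return out
    total = size(node)
    offset = 0.0
    for child in payload:
        share = size(child) / total
        if vertical_cuts:   # children sit side by side
            slice_and_dice(child, x + offset * w, y, share * w, h, False, out)
        else:               # children are stacked
            slice_and_dice(child, x, y + offset * h, w, share * h, True, out)
        offset += share
    return out


layout = slice_and_dice(tree, 0, 0, 12, 6)
for name, rect in layout.items():
    print(name, rect)
```

Leaf area ends up proportional to leaf size, but nothing in this scheme controls the shape of the rectangles, so deep or uneven trees tend to produce long, thin strips.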

Source: Courtesy of Ben Shneiderman

Tree map display of an author’s collection of reports

Map of the market: http://www.marketwatch.com/tools/stockresearch/marketmap

http://www.hivegroup.com/solutions/demos/usda.html

http://www.nytimes.com/interactive/2008/05/03/business/20080403_SPENDING_GRAPHIC.html?_r=0

   

http://www.youtube.com/watch?v=og7bzN0DhpI (watch 31:11 – 35:35)

Ben Shneiderman on tree maps

Tree map pros and cons

Pros?

Cons?

Tree map pros and cons

Pros

Color + Area (2 attributes)

Cons

Hierarchy/Structure hard to convey

aspect ratios

Slide  adapted  from  Michael  Porath    

Aspect ratios

Which  one  is  bigger?  

Slide  adapted  from  Michael  Porath    

Aspect ratios

Which  one  is  bigger?  

make  the  segments  more  square!    

Slide  adapted  from  Michael  Porath    

Layout Strategies / Algorithms

http://hcil2.cs.umd.edu/trs/2001-06/2001-06.html

Cluster, Squarified, Strip Treemap, Pivot By Middle, Pivot By Size

Slide  adapted  from  Michael  Porath    

Sunburst

http://bl.ocks.org/mbostock/4063423

http://www.theguardian.com/news/datablog/2012/oct/05/beatles-charts-infographics

 

http://hci.stanford.edu/jheer/files/zoo/

A sketch illustration of the hyperbolic browser representation of a tree. The further away a node is from the root node, the closer it is to its superordinate node, and the area it occupies decreases

Hyperbolic tree

(a) The reporting structure of the employees of a company. (b) One employee of interest, Rachel Anderson, has been moved towards the centre, revealing her subordinates

Nodes can typically be moved into center position

Representation of the Library of Congress by the hyperbolic browser

http://philogb.github.io/jit/static/v20/Jit/Examples/Hypertree/example1.html

http://www.autodeskresearch.com/projects/orgorgchart

 

Readings

Chapter 3

Questions?

References

•  Christopher J. Morris, David S. Ebert, Penny Rheingans, An Experimental Analysis of the Pre-Attentiveness of Features in Chernoff Faces, Proceedings Applied Imagery Pattern Recognition, pp. 12–17, 1999.

•  Friendly, Michael. Visualizing categorical data. SAS Institute, 2000.

•  Chernoff, H. (1973). The use of faces to represent points in k-dimensional space graphically. Journal of the American Statistical Association, 68(342), 361-368.

project

Team project milestones

1.  Form teams: due 27 Feb.
2.  Project proposal: due 13 March
3.  Intermediate presentation: due 3 April
4.  Final presentation: 22 May
5.  Short report: due 29 May

Project proposal

1 page description of your intended project:
–  motivation
–  which datasets you will use
–  current status. If available, first designs.
–  problems/questions

Due 13 March. If you want earlier feedback, send us your proposal earlier ;-)

Data collection

•  https://docs.google.com/forms/d/1gHwVWHZLzWdSz1F37jA1Gungrl56bT215M6FYW3YqGY/viewform Or

•  bit.ly/N6JTyD

Anonymous! Choose your own ID.

•  Please report your data ;-)
