Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency, Konstantinos Tsakalidis, PhD Defense, 23 September 2011


Dynamic Data Structures:
Orthogonal Range Queries and Update Efficiency

Konstantinos Tsakalidis

PhD Defense
23 September 2011


Κωνσταντίνος Τσακαλίδης

2000-2006 B. Eng. Computer Engineering and Informatics Dpt., University of Patras, Greece

Sum. 2007 Intern, Google Inc., Mountain View, California, USA

2007-2009 Ph. D. Student (Part A), MADALGO, Aarhus University, Denmark

Sum. 2010 Visiting Prof. Ian Munro, D. Cheriton School of Computer Science, University of Waterloo, Canada

2009-2011 Ph. D. Student (Part B)


Overview

Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
• [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
• [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”

Dynamic Planar Orthogonal Range Maxima Reporting Queries
• [ICALP ’11] “Dynamic Planar Range Maxima Queries”

Multi-Versioned Indexed Databases
• [SODA ’12] “Fully Persistent B-Trees”


Databases and Geometry

Name      Age   Salary   Date      Phone      …
Andreas   30    5.500    2/2010    555-4321   …
Maria     38    6.500    4/1998    555-3214   …
John      25    3.000    5/2011    555-2143   …
Helen     34    4.000    1/2000    555-1432   …
Jacob     28    7.000    11/1989   555-1234   …

N points, D dimensions (one axis per attribute: Name, Age, Salary, Date, Phone)

Planar (D=2) Euclidean Space (axes: Age, Salary)

Query Operation
• Question about stored data

Update Operation/Transaction
• Insert/Delete Tuple
• Change Value


Models of Computation

Pointer Machine
• Record: O(1) fields
• Space: #Occupied Records
• Time: #Arithmetic Operations + #Pointer Traversals

word-RAM
• Cell: w bits/cell, accessed in O(1) Time
• Space: #Occupied Cells
• Time: #Arithmetic Operations + #cell READ/WRITEs

I/O Model [Aggarwal, Vitter ’88]
• Disk: N words in N/B blocks
• Memory: M < N words in M/B blocks
• Block: B words, moved between Memory and Disk by one I/O Operation
• Space: #Occupied Blocks
• Time: #I/O Operations
• specialized database


Orthogonal Range Reporting Queries

Employees (axes: Age, Salary)

Contour Query
Report all points with: Salary > 1000

Dominance Query
Report all points with: Salary > 1000 and Age > 35

3-Sided Query
Report all points with: 2000 > Salary > 1000 and Age > 35
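The three query types can be stated as plain filters over the point set. The following is only a brute-force sketch that scans every point, so it illustrates what each query returns, not how an efficient structure answers it. The `(salary, age)` tuple layout, the function names and the sample data are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # hypothetical layout: (salary, age)


def contour_query(points: List[Point], salary_min: float) -> List[Point]:
    """One-sided query: Salary > salary_min."""
    return [p for p in points if p[0] > salary_min]


def dominance_query(points: List[Point], salary_min: float, age_min: float) -> List[Point]:
    """Two-sided query: Salary > salary_min and Age > age_min."""
    return [p for p in points if p[0] > salary_min and p[1] > age_min]


def three_sided_query(points: List[Point], salary_min: float,
                      salary_max: float, age_min: float) -> List[Point]:
    """Three-sided query: salary_max > Salary > salary_min and Age > age_min."""
    return [p for p in points if salary_min < p[0] < salary_max and p[1] > age_min]


# Made-up sample data, not taken from the slides.
employees = [(900, 40), (1500, 30), (1500, 50), (2500, 60), (1800, 36)]
```

Each function takes time linear in the number of stored points; the structures discussed next aim for time that depends on the output size t plus a small additive term.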


Worst-Case Efficient Dynamic 3-Sided Range Reporting

Pointer Machine (Space, Query Time, Update Time)
• Priority Search Tree [McCreight ’85]

word-RAM (Space, Query Time, Update Time)
• Fusion Tree [Willard ’00]
• [Mortensen ’06]

I/O Model (Space, Query I/Os, Update I/Os)
• External Priority Search Tree [Arge ’99] (amo.)

Average-Case Efficient Dynamic 3-Sided Range Reporting

word-RAM (Space, Query Time, Update Time)
• [ICDT ’10] X, Y: μ-random. Update: expected w.h.p.
• [ICDT ’10] X: smooth, Y: restricted. Query: expected w.h.p. Update: expected w.h.p.
• [ISAAC ’09] X: smooth. Query: expected w.h.p. Update: expected amortized

I/O Model (Space, Query I/Os, Update I/Os)
• [ICDT ’10] X, Y: μ-random. Update: amortized expected w.h.p.
• [ICDT ’10] X: smooth, Y: restricted. Query: expected w.h.p. Update: amortized expected w.h.p.
• [ISAAC ’09] X: smooth. Query: expected w.h.p. Update: amortized expected


Probabilistic Distributions

Unknown non-changing μ-Random probabilistic distribution

(f,g)-Smooth distribution
• The probability mass of a subinterval does not exceed a specific bound, no matter how small the subinterval
• Includes regular, uniform distributions
• Any distribution is (f,Θ(n))-smooth

Restricted class of distributions
• Few elements occur very often
• Many elements occur rarely
• Zipfian, Power Law Distributions


Priority Search Tree [McCreight ’85]

Move Up Maximum Y

Space: O(n)

Update: O(log n)

Pointer Machine
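A minimal static sketch of the "move up maximum y" idea: every node keeps the point with the largest y among the points of its subtree, and the remaining points are split by x between the two children. This is a simplified, build-once variant written for illustration (no insertions or deletions); the names `Node`, `build` and `query` are assumptions, not the author's code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


@dataclass
class Node:
    point: Point              # point with maximum y in this subtree (moved up)
    split: float              # x-value separating the left and right subtrees
    left: Optional["Node"]
    right: Optional["Node"]


def build(points: List[Point]) -> Optional[Node]:
    """Build a priority search tree from points sorted by x."""
    if not points:
        return None
    top = max(range(len(points)), key=lambda i: points[i][1])
    rest = points[:top] + points[top + 1:]
    mid = len(rest) // 2
    split = rest[mid][0] if rest else points[top][0]
    return Node(points[top], split, build(rest[:mid]), build(rest[mid:]))


def query(node: Optional[Node], x1: float, x2: float, y_min: float,
          out: List[Point]) -> None:
    """Append all points with x1 <= x <= x2 and y >= y_min to out."""
    if node is None or node.point[1] < y_min:
        return  # heap order on y: nothing deeper can reach y_min
    if x1 <= node.point[0] <= x2:
        out.append(node.point)
    if x1 <= node.split:
        query(node.left, x1, x2, y_min, out)
    if x2 >= node.split:
        query(node.right, x1, x2, y_min, out)
```

A usage example under the same assumptions: `build(sorted(pts))` followed by `query(root, 2, 5, 3, out)` collects the points in the 3-sided range [2, 5] × [3, +∞). The pruning on y is what keeps the work proportional to the search paths plus the reported points.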


Query by X-Coordinate: log n + t

Path, Subtrees, In X(s)

Pointer Machine

O(log n)


Query by Y-Coordinate: log n + t

Node u with children ul, ur

[Alstrup, Brodal, Rauhe ’00] 1D Range Maximum Queries (Children): O(1) time

Find next point to be reported in O(1) time

Pointer Machine, word-RAM


[ISAAC ’09]

Update: O(log log n) exp. amo.
Query: O(log log n + t) exp. w.h.p.
Space: O(n)

Weight_i = Θ(2^(2^i))

O(log log n) expected w.h.p. [Mehlhorn, Tsakalidis ’93, Kaporis et al. ’06]

[Andersson, Thorup ’07]

RMQ

O(1) expected amortized

word-RAM


Average-Case Efficient Dynamic 3-Sided Range Reporting

X: smooth

word-RAM (Space, Query Time, Update Time)
• [ISAAC ’09] Query: expected w.h.p. Update: expected amortized

I/O Model (Space, Query I/Os, Update I/Os)
• [ISAAC ’09] Query: expected w.h.p. Update: amortized expected


Orthogonal Range MAXIMA Reporting Queries
OR “Generalized Planar SKYLINE Operator”

Employees (axes: Age, Salary)

Dominates: Is “Above”
Maximal Point: Is NOT Dominated
Interesting Points: Oldest and Best Paid

Dominance Maxima Queries
Report all maximal points among points with x in [xl,+∞) and y in [yb,+∞)

Contour Maxima Queries
Report all maximal points among points with x in (-∞, xl]

3-Sided Maxima Queries
Report all maximal points among points with x in [xl, xr] and y in [yb,+∞)

4-Sided Maxima Queries
Report all maximal points among points with x in [xl, xr] and y in [yb,yt]
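To make the definitions concrete, here is a brute-force sketch: restrict the points to the query range, then sweep them in decreasing x and keep every point whose y exceeds all y-values seen so far. It assumes distinct coordinates, and the function names and default arguments are illustrative assumptions; the running time is O(n log n) per query, far from the bounds targeted in the talk.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y)


def maxima(points: List[Point]) -> List[Point]:
    """Maximal (non-dominated) points, in decreasing x order.

    Assumes distinct coordinates. A point is maximal when no other
    point has both a larger x and a larger y.
    """
    result = []
    best_y = -math.inf
    for x, y in sorted(points, reverse=True):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def range_maxima(points: List[Point], xl: float = -math.inf, xr: float = math.inf,
                 yb: float = -math.inf, yt: float = math.inf) -> List[Point]:
    """Maximal points among the points inside [xl, xr] x [yb, yt].

    Leaving bounds at their defaults yields the dominance, contour
    and 3-sided variants.
    """
    inside = [(x, y) for x, y in points if xl <= x <= xr and yb <= y <= yt]
    return maxima(inside)
```

Note that a point can be maximal inside the range while being dominated by a point outside it, which is why range maxima cannot simply be read off the global skyline.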


Worst-Case Efficient Dynamic Range MAXIMA Reporting

Pointer Machine             Query             Query                       Insert    Delete
Overmars, van Leeuwen ’81   log n + t         -                           log^2 n   log^2 n
Frederickson, Rodger ’90    log n + t         log^2 n + t, log n (1+t)    log n     log^2 n
Janardan ’91                log n + t         log n + t                   log n     log^2 n
Kapoor ’00                  log n + t amo.    -                           log n     log n
[ICALP ’11]                 log n + t         log n + t                   log n     log n

word-RAM                    Insert    Delete
[ICALP ’11]


Tournament Tree

Copy Up Maximum Y

Y-Winning Paths

Pointer Machine
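A simplified sketch of the "copy up maximum y" idea, using an array-based tournament tree over points sorted by x. Each internal node stores the index of the leaf with the largest y below it. Maxima in an x-range are then reported by repeatedly taking the highest point and continuing to its right. This naive loop costs O(log n) per reported point, so it is a stand-in for illustration and not the O(log n + t) method of the talk; class and function names are assumptions, and y-values are assumed distinct.

```python
from typing import List


class TournamentTree:
    """Leaves hold y-values of points sorted by x; winners are copied up."""

    def __init__(self, ys: List[float]) -> None:
        self.ys = list(ys)
        self.size = 1
        while self.size < len(self.ys):
            self.size *= 2
        # winner[v] = index of the leaf with maximum y below v, or -1 if empty
        self.winner = [-1] * (2 * self.size)
        for i in range(len(self.ys)):
            self.winner[self.size + i] = i
        for v in range(self.size - 1, 0, -1):
            self.winner[v] = self._better(self.winner[2 * v], self.winner[2 * v + 1])

    def _better(self, a: int, b: int) -> int:
        if a == -1:
            return b
        if b == -1:
            return a
        return a if self.ys[a] >= self.ys[b] else b

    def update(self, i: int, y: float) -> None:
        """Change the y-value of leaf i and replay the matches on its root path."""
        self.ys[i] = y
        v = (self.size + i) // 2
        while v >= 1:
            self.winner[v] = self._better(self.winner[2 * v], self.winner[2 * v + 1])
            v //= 2

    def argmax(self, lo: int, hi: int) -> int:
        """Index of the maximum y among leaves lo..hi (inclusive), or -1."""
        best = -1
        l, r = lo + self.size, hi + self.size + 1
        while l < r:
            if l & 1:
                best = self._better(best, self.winner[l])
                l += 1
            if r & 1:
                r -= 1
                best = self._better(best, self.winner[r])
            l //= 2
            r //= 2
        return best


def report_maxima(tree: TournamentTree, lo: int, hi: int) -> List[int]:
    """Indices of the maximal points among leaves lo..hi, in increasing x."""
    out = []
    while lo <= hi:
        m = tree.argmax(lo, hi)
        out.append(m)   # highest remaining point is maximal
        lo = m + 1      # everything left of it is dominated
    return out
```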


Tournament Tree

u, Right(u), MAX( )

Find next point to be reported in O(1) time

Pointer Machine


3-Sided Range Maxima Queries

Query Time: log n + t

MAX( )

Pointer Machine

Subtrees (Paths): O(log n)


Update Operation

Pointer Machine

Previous Update: O(log^2 n)


Update Operation

Pointer Machine

Node u with children uL, uR

MAX(Right(uL))
MAX(Right(u))
MAX(Right(uR))

[Sundar ’89] Priority Queue with Attrition: O(1) time
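A minimal sketch of the priority-queue-with-attrition interface, assuming the usual definition in which inserting x also removes ("attrites") every stored element greater than or equal to x. Because the survivors are always in increasing order, a double-ended queue suffices here. This version gives O(1) amortized time per operation; it is a simplification and not the worst-case O(1) structure cited on the slide. The class and method names are assumptions.

```python
from collections import deque
from typing import Deque, List, Optional


class AttritionQueue:
    """Priority queue with attrition, amortized O(1) per operation."""

    def __init__(self) -> None:
        # Elements are kept strictly increasing from front to back.
        self._items: Deque[float] = deque()

    def insert_and_attrite(self, x: float) -> None:
        """Insert x and discard every stored element >= x."""
        while self._items and self._items[-1] >= x:
            self._items.pop()
        self._items.append(x)

    def find_min(self) -> Optional[float]:
        return self._items[0] if self._items else None

    def delete_min(self) -> Optional[float]:
        return self._items.popleft() if self._items else None

    def contents(self) -> List[float]:
        return list(self._items)
```

Each element is appended once and removed at most once, which is where the amortized bound comes from.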


Update Operation

Pointer Machine

Reconstruct, Rollback

Partially Persistent Priority Queue with Attrition

O(1) time, space overhead per update step
• [Brodal ’96] worst case
• [Driscoll et al. ’89] amortized

Space: O(n)
Update: O(log n)


[ICALP ’11]

[ICALP ’11]        Space     Query          Insert    Delete
Pointer Machine    n         log n + t      log n     log n
word-RAM           n

[ICALP ’11]        Space     Query          Insert    Delete
Pointer Machine    n log n   log^2 n + t    log^2 n   log^2 n


Rectangular Visibility Queries

Proximity Queries/Similarity Search

4x 4-Sided Range Maxima Queries: (+∞,+∞), (+∞,-∞), (-∞,+∞), (-∞,-∞)


Worst-Case Efficient 4-Sided Range MAXIMA Reporting and Rectangular Visibility Queries

Pointer Machine       Space     Query                Insert    Delete
Overmars, Wood ’88    n log n   log^2 n + t          log^2 n   log^3 n
Overmars, Wood ’88    n log n   log^2 n + t log n    log^2 n   log^2 n
[ICALP ’11]           n log n   log^2 n + t          log^2 n   log^2 n


B-Trees [Bayer, McCreight ’72]

Name      Age   Salary   …
Andreas   30    5.500    …
Maria     38    6.500    …
John      25    3.000    …
Helen     34    4.000    …
Jacob     28    7.000    …

Indexed Database

Space: O(N/B) blocks
Update: O(log_B N) I/Os
Access: O(log_B N) I/Os
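A small arithmetic sketch of what these bounds mean in practice. It assumes an idealized tree in which every node has fan-out exactly B, which real B-trees only approximate; the function names and the sample values of N and B are made up for illustration.

```python
def blocks_needed(n: int, b: int) -> int:
    """Number of blocks of b words needed to store n words: ceil(n / b)."""
    return -(-n // b)


def tree_levels(n: int, fanout: int) -> int:
    """Smallest h with fanout**h >= n, i.e. ceil(log_fanout(n)) for n >= 1."""
    levels, capacity = 0, 1
    while capacity < n:
        capacity *= fanout
        levels += 1
    return levels


N = 10**9   # hypothetical number of stored elements
B = 1024    # hypothetical block size in words

print(blocks_needed(N, B))   # blocks occupied by the data
print(tree_levels(N, B))     # levels touched by a B-ary search
print(tree_levels(N, 2))     # levels touched by a binary search tree
```

Under these assumptions a root-to-leaf search reads 3 blocks with fan-out 1024, against 30 node visits for a binary tree on the same data, which is the gap between log_B N and log_2 N.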

Multi-Versioned Databases

Btrfs

Data Platform


Fully Persistent B-Trees

I/O Model          Space   Query I/Os                  Update I/Os (Amortized)
Lanka, Mays ’91    n/B     (log_B n + t/B) log_B m     log_B n log_B m
[SODA ’12]         n/B     log_B n + t/B               log_B n + log_2 B

n: elements in one version
m: update operations = #versions
B: block size
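To illustrate what "fully persistent" means, namely that any existing version can be both queried and updated, with each update creating a new version, here is a minimal in-memory sketch based on path copying in an unbalanced binary search tree. Path copying is a swapped-in technique chosen for brevity; it is not the I/O-efficient method of the talk, and all names below are assumptions.

```python
from typing import List, NamedTuple, Optional


class Node(NamedTuple):
    key: int
    left: Optional["Node"]
    right: Optional["Node"]


def insert(root: Optional[Node], key: int) -> Node:
    """Return the root of a new tree containing key; the old tree is untouched."""
    if root is None:
        return Node(key, None, None)
    if key < root.key:
        return Node(root.key, insert(root.left, key), root.right)
    if key > root.key:
        return Node(root.key, root.left, insert(root.right, key))
    return root


def keys(root: Optional[Node]) -> List[int]:
    """In-order list of the keys below root."""
    if root is None:
        return []
    return keys(root.left) + [root.key] + keys(root.right)


class VersionedSet:
    """Every update takes a version id and yields a fresh version id."""

    def __init__(self) -> None:
        self.roots: List[Optional[Node]] = [None]  # version 0 is the empty set

    def insert(self, version: int, key: int) -> int:
        self.roots.append(insert(self.roots[version], key))
        return len(self.roots) - 1

    def keys(self, version: int) -> List[int]:
        return keys(self.roots[version])


s = VersionedSet()
v1 = s.insert(0, 5)
v2 = s.insert(v1, 3)   # continues from v1
v3 = s.insert(v1, 8)   # also branches from v1: the versions form a tree
```

Here `s.keys(v2)` and `s.keys(v3)` differ although both derive from `v1`, so the versions form a tree and not a linear history. Path copying pays extra space for every node on the updated path, which is the kind of overhead the amortized bounds above are designed to avoid.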


[SODA ’12]

I/O-Efficient Full Persistence
• Interface of Primitive Operations: READ, WRITE, ACCESS, NEW_NODE, NEW_VERSION
• Input is a pointer-based structure
• Node occupies O(1) blocks
• Node has indegree O(1)
• O(1) I/O-Overhead per access to a block
• O(log_2 B) I/O-Overhead per change to a block
• [Driscoll et al. ’89] Node-Splitting Method

Incremental B-Trees
• Lazy Updates
• O(log_B N) READs
• O(1) WRITEs that make O(1) changes to a block

Result
• Space O(N/B)
• Query O(log_B N + t/B) I/Os
• Update O(log_B N + log_2 B) I/Os


Mange Tak (Thank you very much)

Konstantinos Tsakalidis
Ph.D. Student

[email protected]

Tsakalidis K., et al.
[ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
[ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
[ICALP ’11] “Dynamic Planar Range Maxima Queries”
[SODA ’12] “Fully Persistent B-Trees”