taqi-shah
8/3/2019 Lec-22 B-Trees
B-Trees
Motivation
When data is too large to fit in main memory, the number of disk accesses becomes important.
A disk access is unbelievably expensive compared to a typical computer instruction (mechanical limitations).
One disk access is worth about 200,000 computer instructions.
The number of disk accesses will dominate the running time.
Motivation (contd.)
Secondary memory (disk) is divided into equal-sized blocks (typical sizes are 512, 2048, 4096, or 8192 bytes).
The basic I/O operation transfers the contents of one disk block to/from RAM.
Our goal is to devise a multiway search tree that minimizes file accesses (by exploiting disk block reads).
Multiway search trees (of order m)
A generalization of Binary Search Trees.
Each node has at most m children.
If k ≤ m is the number of children, then the node has exactly k-1 keys.
The tree is ordered.
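The ordering property is what makes lookup possible: inside a node, the keys tell us which child to descend into. The following is a minimal Python sketch of that idea, not code from the lecture. The class name `MNode`, the function `search`, and the sample tree are hypothetical names chosen for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MNode:
    """Hypothetical multiway search tree node: k children, k-1 sorted keys."""
    keys: List[int]
    children: List["MNode"] = field(default_factory=list)  # empty for a leaf


def search(node: Optional[MNode], key: int) -> bool:
    """Return True if key is stored in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)      # first index with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:                # leaf: nowhere left to look
            return False
        node = node.children[i]              # descend into the i-th subtree
    return False


# A small order-4 tree: the root has 3 children and therefore 2 keys.
root = MNode([10, 20], [MNode([3, 7]), MNode([12, 15, 18]), MNode([25])])
```

With this tree, `search(root, 15)` finds the key in the middle child, while `search(root, 11)` descends into the same child and stops at the leaf without a match. Only one node per level is visited, which is why the height of the tree governs the cost.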
B-Trees
A B-Tree of order m is an m-way search tree.
B-Trees are balanced search trees designed to work well on direct-access secondary storage devices.
B-Trees are similar to Red-Black Trees, but are better at minimizing disk I/O operations.
All leaves are at the same level.
[Figure: example B-Tree. Node labels recovered from the slide: M; Q T X; R S.]
Height h = 4
2 leaves at depth 2
2 leaves at depth 3
1 leaf at depth 4
Height h = 2
6 leaves at depth 2
B-Tree Properties
A B-Tree T is a rooted tree (with root root[T]) that has the following properties:
1 - Every node x has the following fields:
a - n[x], the number of keys currently stored in x.
b - The n[x] keys themselves, stored in nondecreasing (ascending) order:
$key_1[x] \le key_2[x] \le \dots \le key_{n[x]}[x]$
c - leaf[x], a Boolean value that is TRUE if x is a leaf and FALSE if x is an internal node.
Properties (contd.)
2 - If x is an internal node, it also contains n[x]+1 pointers to its children. Leaf nodes have no children.
3 - The keys key_i[x] separate the ranges of keys stored in each subtree: if k_i is any key stored in the subtree with root c_i[x], then:
$k_1 \le key_1[x] \le k_2 \le key_2[x] \le \dots \le key_{n[x]}[x] \le k_{n[x]+1}$
4 - Each leaf has the same depth, which is the height h of the tree.
Properties (contd.)
5 - There are lower and upper bounds on the number of keys a node can contain.
These bounds can be expressed in terms of a fixed integer t ≥ 2, called the minimum degree of the B-Tree.
Why can't t be 1?
Properties (contd.)
a - Every node other than the root must have at least t-1 keys. Every internal node other than the root thus has at least t children. If the tree is nonempty, the root must have at least one key.
b - Every node can contain at most 2t-1 keys. Therefore, an internal node can have at most 2t children. We say a node is full if it contains exactly 2t-1 keys.
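To make properties 1 to 5 concrete, here is a minimal Python sketch of a node record and a recursive checker. It is an illustration under stated assumptions, not the lecture's code: `BNode` and `check` are hypothetical names, keys are assumed to be integers, and the sketch keeps children in a plain list rather than as disk-block pointers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BNode:
    """Node layout following the slides: keys, child pointers, and a leaf flag."""
    keys: List[int]
    children: List["BNode"] = field(default_factory=list)
    leaf: bool = True

    @property
    def n(self) -> int:          # n[x]: number of keys currently stored in x
        return len(self.keys)


def check(x: BNode, t: int, lo=None, hi=None, is_root=True) -> int:
    """Assert properties 1-5 for the subtree rooted at x; return its height."""
    assert t >= 2
    assert x.n <= 2 * t - 1, "too many keys"                  # property 5b
    assert x.n >= (1 if is_root else t - 1), "too few keys"   # property 5a
    assert x.keys == sorted(x.keys), "keys out of order"      # property 1b
    # Property 3: every key lies within the range inherited from the parent.
    assert all((lo is None or lo <= k) and (hi is None or k <= hi) for k in x.keys)
    if x.leaf:
        assert not x.children
        return 0
    assert len(x.children) == x.n + 1                         # property 2
    bounds = [lo] + x.keys + [hi]
    heights = {check(c, t, bounds[i], bounds[i + 1], False)
               for i, c in enumerate(x.children)}
    assert len(heights) == 1, "leaves at different depths"    # property 4
    return heights.pop() + 1


# Example with minimum degree t = 2 (each non-root node holds 1 to 3 keys).
root = BNode([10, 20], [BNode([3, 7]), BNode([12, 15, 18]), BNode([25])], leaf=False)
height = check(root, t=2)
```

For this sample tree `check` returns a height of 1. A tree that violates any property, for example a child key that falls outside the range set by its parent, raises an `AssertionError` instead.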
Height of a B-Tree
What is the maximum height of a B-Tree with N entries?
This question is important, because the maximum height of a B-Tree gives an upper bound on the number of disk accesses.
Height of a B-Tree
If n ≥ 1, then for any n-key B-Tree T of height h and minimum degree t ≥ 2,
$$h \le \log_t \frac{n+1}{2}$$
[Figure: a B-Tree of height 3 containing the minimum possible number of keys. The root, root[T], holds 1 key and every other node holds t-1 keys. Number of nodes at depths 0, 1, 2, 3: 1, 2, 2t, 2t^2.]
Proof
The number of nodes is minimized when the root contains one key and all other nodes contain t-1 keys.
There are 2 nodes at depth 1, $2t$ nodes at depth 2, $2t^2$ nodes at depth 3, and so on.
At depth h, there are $2t^{h-1}$ nodes.
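Proof (contd.)
Thus the number of keys n satisfies the inequality:
$$n \ge 1 + (t-1)\sum_{i=1}^{h} 2t^{i-1} = 1 + 2(t-1)\left(\frac{t^h - 1}{t-1}\right) = 2t^h - 1$$
Rearranging gives
$$t^h \le \frac{n+1}{2} \quad\Longrightarrow\quad h \le \log_t \frac{n+1}{2}$$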
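The proof's closed form says that a B-Tree of height h and minimum degree t holds at least 2t^h - 1 keys. As a quick sanity check, the sketch below counts the keys of the sparsest tree level by level and compares the result with that formula. The function name `min_keys` is a hypothetical helper chosen for this illustration.

```python
def min_keys(t: int, h: int) -> int:
    """Fewest keys in a B-Tree of height h and minimum degree t, summed per depth."""
    total = 1                      # the root holds a single key
    nodes = 2                      # depth 1 has two nodes
    for _ in range(h):             # depths 1..h
        total += nodes * (t - 1)   # each non-root node holds t-1 keys
        nodes *= t                 # each such node has t children
    return total


# Compare the level-by-level count with the closed form 2*t**h - 1.
for t in (2, 3, 50):
    for h in range(0, 6):
        assert min_keys(t, h) == 2 * t**h - 1
```

The loop mirrors the sum in the proof: depth i contributes 2t^(i-1) nodes, each with t-1 keys.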
Numerical Example
For N = 2,000,000 (2 million) and m = 100, the maximum height of a tree of order m will be only 3, whereas a binary tree would be of height larger than 20.
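The figures in this example can be reproduced from the height bound h ≤ log_t((n+1)/2). The sketch below does so under two assumptions that the slides do not state: an order of m = 100 is taken to mean a minimum degree of t = 50 (that is, m = 2t), and height is counted in edges from the root. The name `max_height` is a hypothetical helper.

```python
import math


def max_height(n: int, t: int) -> int:
    """Largest h with 2*t**h - 1 <= n, i.e. floor(log_t((n + 1) / 2))."""
    assert n >= 1 and t >= 2
    h = 0
    while 2 * t ** (h + 1) - 1 <= n:   # integer arithmetic avoids float log rounding
        h += 1
    return h


n = 2_000_000
btree_h = max_height(n, t=50)           # assumption: order m = 100 means t = 50
binary_h = math.floor(math.log2(n))     # minimum height of a binary tree on n keys
```

Here `btree_h` comes out as 3 and `binary_h` as 20, so even a perfectly balanced binary tree needs about seven times as many levels, and each level may cost one disk access.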
Reading
Chapter 19 (B-Trees) of the book Introduction to Algorithms by Thomas H. Cormen et al.