2004/5/28
Approximate Counting of Frequent Query Patterns over XQuery Stream
Liang Huai Yang, Mong Li Lee, Wynne Hsu
DASFAA 2004
Speaker: Ming Jing Tsai
Introduction
Efficient approach to improve XML management system
  Cache frequently retrieved results
  Frequent query patterns
Application
  Search engine
  XML query system
Preliminaries
XQuery stream: S = QPT_1, QPT_2, …, QPT_N
Query pattern tree (QPT)
  Label: {"*", "//"} ∪ tagset
Rooted subtree (RST)
  root(RST) = root(QPT)
  V' ⊆ V and E' ⊆ E, where RST = (V', E') and QPT = (V, E)
QPT
[Figure: three example query pattern trees and one rooted subtree]
QPT1: book(title, author, price)
QPT2: book(title, author(fn, ln), price)
QPT3: nodes book, title, section
RST: book(title, author, price)
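The rooted-subtree condition can be checked mechanically. The sketch below is only an illustration, not the authors' data structure: it assumes each tree is stored as a set of label paths from the root (a hypothetical representation that works when sibling labels are unique), so that V' ⊆ V and E' ⊆ E reduce to a subset test on prefix-closed path sets.

```python
# Assumption: a tree is a set of root-to-node label paths, e.g. ("book", "author", "fn").
QPT2 = {
    ("book",), ("book", "title"), ("book", "author"),
    ("book", "author", "fn"), ("book", "author", "ln"), ("book", "price"),
}
RST = {("book",), ("book", "title"), ("book", "author"), ("book", "price")}


def is_tree(paths):
    """A path set is a tree if it has exactly one root and is prefix-closed."""
    roots = [p for p in paths if len(p) == 1]
    return len(roots) == 1 and all(p[:-1] in paths for p in paths if len(p) > 1)


def is_rooted_subtree(rst, qpt):
    """root(RST) = root(QPT), V' subset of V, E' subset of E, expressed on path sets."""
    return is_tree(rst) and is_tree(qpt) and rst <= qpt


print(is_rooted_subtree(RST, QPT2))
print(is_rooted_subtree({("book", "title")}, QPT2))
```

The second call fails the check because the candidate does not contain the root `book`.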
Approximate Counting
rst.count_app ≧ (σ − ε)N
rst.count_app ≧ rst.count_true − εN
XQuery stream divided into buckets of width w = ⌈1/ε⌉
b_current = ⌈N/w⌉
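The bounds above match the guarantees of textbook lossy counting (Manku and Motwani). As a rough illustration, the sketch below applies that scheme to a stream of plain items rather than to RSTs; the function names and the toy stream are made up for the example, and it is not the miner described in this deck.

```python
import math


def lossy_count(stream, epsilon):
    """Textbook lossy counting over single items (not over RSTs)."""
    w = math.ceil(1 / epsilon)           # bucket width
    counts = {}                          # item -> [count_app, delta]
    n = 0
    for item in stream:
        n += 1
        b_current = math.ceil(n / w)     # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            counts[item] = [1, b_current - 1]
        if n % w == 0:                   # bucket boundary: drop infrequent entries
            stale = [k for k, (c, d) in counts.items() if c + d <= b_current]
            for key in stale:
                del counts[key]
    return counts, n


def frequent(counts, n, sigma, epsilon):
    """Report items whose approximate count is at least (sigma - epsilon) * N."""
    return sorted(k for k, (c, _) in counts.items() if c >= (sigma - epsilon) * n)


counts, n = lossy_count(["a", "b", "a", "c", "a", "d", "a", "e"], epsilon=0.25)
print(frequent(counts, n, sigma=0.5, epsilon=0.25))
```

With ε = 0.25 the bucket width is 4, so the one-off items are dropped at each bucket boundary and only `a` survives.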
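D-GQPT
[Figure: global query pattern tree with numbered nodes: 1 book, 2 title, 3 author, 4 fn, 5 ln, 6 section, 7 title, 8 price; fn and ln are children of author, node 7 is a child of section]
RST3: book(title, author, price), encoded as 1,2,-1,3,-1,8,-1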
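The string 1,2,-1,3,-1,8,-1 reads as a preorder listing of D-GQPT node numbers in which -1 marks a step back up to the parent. That reading is inferred from the example on this slide, not stated on it. A minimal sketch under that assumption, with a hypothetical `encode` helper:

```python
def encode(node_id, children):
    """Preorder encoding: emit the node, then each child subtree followed by -1."""
    out = [node_id]
    for child in children.get(node_id, []):
        out.extend(encode(child, children))
        out.append(-1)                   # backtrack to the parent
    return out


# RST book(title, author, price) using the D-GQPT node numbers 1, 2, 3, 8.
rst_children = {1: [2, 3, 8]}
print(encode(1, rst_children))
```

For this RST the helper yields the same sequence as the slide.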
D-GQPT
[Figure: the same D-GQPT and RST3 as on the previous slide]
1,2,-1,4,-1,9,-1
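ECTree
[Figure: ECTree of candidate RSTs, each written as its set of D-GQPT node numbers]
Level 1: {1}
Level 2: {1,2}, {1,3}, {1,6}, {1,8}
Level 3, from {1,2}: Gjoin = {1,2,3}, {1,2,6}, {1,2,8}; Grmlne = ∅
Level 3, from {1,3}: Gjoin = {1,3,6}, {1,3,8}; Grmlne = {1,3,4}, {1,3,5}
Level 3, from {1,6}: Gjoin = {1,6,8}; Grmlne = {1,6,7}
Level 4, from {1,3,4}: Gjoin = {1,3,4,5}, {1,3,4,6}, {1,3,4,8}; Grmlne = ∅
Level 4, from {1,3,6}: Gjoin = {1,3,6,8}; Grmlne = {1,3,6,7}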
Candidate Generation
Rightmost active leaf node expansion
  Grmlne(RST_i^k) = { RST_ir^{k+1} }
Join
  Gjoin(RST_i^k) = { RST_ij^{k+1} | RST_ij^{k+1} = RST_i^k ⋈ RST_j^k, j = i+1, …, N }
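The two generators can be sketched on the D-GQPT node numbers from the earlier slides. This is a simplified model, not the authors' implementation: an RST is assumed to be a sorted tuple of node numbers, `CHILDREN` is the D-GQPT child map read off the figure, and the join partners are the later RSTs in the same ECTree sibling list.

```python
# Assumption: D-GQPT child lists, nodes numbered in preorder as on the slides.
CHILDREN = {1: [2, 3, 6, 8], 2: [], 3: [4, 5], 4: [], 5: [], 6: [7], 7: [], 8: []}


def g_rmlne(rst):
    """Rightmost leaf expansion: attach each child of the rightmost leaf."""
    leaf = rst[-1]                       # largest preorder number = rightmost leaf
    return [rst + (c,) for c in CHILDREN[leaf]]


def g_join(rst_i, later_siblings):
    """Join rst_i with every later sibling RST_j (j = i+1, ..., N)."""
    return [tuple(sorted(set(rst_i) | set(rst_j))) for rst_j in later_siblings]


def expand(rst, later_siblings):
    """All (k+1)-node candidates generated from one k-node RST."""
    return sorted(g_rmlne(rst) + g_join(rst, later_siblings))


print(expand((1, 3), [(1, 6), (1, 8)]))
print(expand((1, 2), [(1, 3), (1, 6), (1, 8)]))
```

Expanding {1,3} gives {1,3,4}, {1,3,5} by rightmost leaf expansion and {1,3,6}, {1,3,8} by join, which agrees with the ECTree slide.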
Prune
RST^{k+1} doesn't exist in ECTree
  RST^{k+1}.Δ = b_current − β
  |RST^{k+1}.tidlist| < β: prune
RST^{k+1} exists in ECTree
  RST^{k+1}.count_app = RST^{k+1}.count_app + |RST^{k+1}.tidlist|
  RST^{k+1}.count_app + RST^{k+1}.Δ < b_current: prune
Join result with RST^{k+1}
Subtree induced by RST^{k+1}
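The two pruning cases can be written as one update routine. The sketch below is an interpretation of the slide, not the authors' code: the ECTree is assumed to be a plain dictionary keyed by RST, `Entry` and `update_and_prune` are hypothetical names, and setting count_app to |tidlist| for a newly inserted RST is an assumption the slide does not state.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    count_app: int = 0   # approximate count
    delta: int = 0       # maximum possible undercount


def update_and_prune(ectree, key, tidlist, b_current, beta):
    """Apply the slide's two cases; return True if the RST stays in the ECTree."""
    entry = ectree.get(key)
    if entry is None:
        # Case 1: RST not yet in the ECTree.
        if len(tidlist) < beta:
            return False                 # pruned, never inserted
        # Assumption: a new entry starts with count_app = |tidlist|.
        ectree[key] = Entry(count_app=len(tidlist), delta=b_current - beta)
        return True
    # Case 2: RST already in the ECTree.
    entry.count_app += len(tidlist)
    if entry.count_app + entry.delta < b_current:
        del ectree[key]                  # pruned
        return False
    return True


ectree = {}
print(update_and_prune(ectree, (1, 2), [1], b_current=2, beta=2))
print(update_and_prune(ectree, (1, 3), [1, 2], b_current=2, beta=2))
print(update_and_prune(ectree, (1, 3), [], b_current=5, beta=2))
```

The first RST is rejected because its tidlist is shorter than β; the second is inserted and later pruned once its count plus Δ falls below b_current.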
AppXQSMiner
Experiment
P4 2.4GHz, 1GB RAM, Windows XP
DBLP DTD: 98 nodes
Shakespeare's Play DTD: 23 nodes
Experiment: error = 0.1σ [two result charts]
Experiment: sup = 0.005 [two result charts]
Experiment: error = 0.05σ [two result charts]