VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY
UNIVERSITY OF TECHNOLOGY
NGUYEN VIET HUNG
SECURITY ISSUES IN QUERYING DYNAMIC OUTSOURCED XML DATABASES
MAJOR: INFORMATION TECHNOLOGY. MAJOR CODE: 60.48.01
MASTER'S THESIS
Ho Chi Minh City, 02/2007
Thesis title: Security Issues in Querying Dynamic Outsourced XML Databases
Student: Nguyen Viet Hung. Advisor: Dr. Dang Tran Khanh
THIS WORK WAS COMPLETED AT THE UNIVERSITY OF TECHNOLOGY,
VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY
Scientific advisor: Dr. DANG TRAN KHANH
Reviewer 1: Dr. NGUYEN DUC CUONG
Reviewer 2: Dr. TRAN VAN HOAI
The master's thesis was defended before the MASTER'S THESIS EXAMINATION COMMITTEE
of the UNIVERSITY OF TECHNOLOGY on 3 February 2007.
UNIVERSITY OF TECHNOLOGY, OFFICE OF GRADUATE STUDIES
SOCIALIST REPUBLIC OF VIETNAM. Independence - Freedom - Happiness
Ho Chi Minh City, date . . . . month . . . . year 200. .
MASTER'S THESIS ASSIGNMENT
Student's full name: Nguyen Viet Hung. Gender: Male.
Date of birth: 14 January 1981. Place of birth: Kien Giang.
Major: Information Technology. Student ID: 00703170.
I- THESIS TITLE: Security issues in querying dynamic outsourced XML databases.
II- TASKS AND CONTENT:
- Survey the issues related to the security of outsourced databases.
- Study the research related to query assurance.
- Propose a solution for verifying query assurance for outsourced XML databases.
- Build a program that implements the solution, then measure and evaluate the proposed solution.
III- DATE OF ASSIGNMENT: .....................................................................................
IV- DATE OF COMPLETION: ......................................................................
V- ADVISOR: Dr. Dang Tran Khanh.
ADVISOR / HEAD OF DEPARTMENT / PROGRAM MANAGER (academic rank, degree, full name and signature)
The content and outline of the master's thesis have been approved by the specialized committee.
Date . . . month . . . year 2006
HEAD OF THE OFFICE OF GRADUATE STUDIES / DEAN OF THE MANAGING FACULTY
ACKNOWLEDGEMENT
I would like to express my gratitude
To my mom and dad, who have brought me up and done everything for my life;
To my advisor, Dr. Dang Tran Khanh, who has advised me with all his heart;
To my friends, who are always by my side, and especially to my colleagues,
who were willing to help me complete some parts of the work.
ABSTRACT
With the impressive improvement of network technologies, database outsourcing
is emerging as an important trend alongside application-as-a-service. In this model,
data owners ship their data to external service providers. Service providers perform data
management tasks and offer their clients a mechanism to manipulate the outsourced
database. Since a service provider is not always fully trusted, the security and privacy of
outsourced data are important issues. These problems are referred to as data
confidentiality, user privacy, data privacy and query assurance. Among them, query
assurance plays a crucial role in the success of the database outsourcing model. To the
best of our knowledge, however, query assurance, especially for outsourced XML
databases, has not been adequately addressed in any previous work.
In this paper, we propose a novel index structure, the Nested Merkle B+ Tree, which combines
the advantages of the B+ tree and the Merkle Hash Tree to deal completely with the three issues
of query assurance, known as correctness, completeness and freshness, in outsourced
XML databases. Experimental results on a real dataset demonstrate the efficiency of our
proposed solution.
SUMMARY
Rapid progress in network technology has given rise to many remote services,
most notably "application as a service". This service lets everyone gain legitimate
access to the latest software at the lowest cost. Recently, a new trend has appeared
that reduces the cost of data management through a service called "database
outsourcing". With this service, organizations store their information and data on the
servers of service providers. The providers take care of maintaining the servers, the
DBMS software and the customers' databases. In addition, they provide mechanisms
that let the organizations operate on their own databases. However, information is an
extremely valuable asset, so organizations cannot fully trust service providers to keep
their databases safe. This gives rise to security requirements for outsourced databases.
These can be summarized as four security requirements: data confidentiality, data
privacy, user privacy and query assurance.
Besides an overview of the results achieved in the field of data outsourcing, this
document introduces a new index structure for XML data. Based on this structure, it
presents a method for ensuring query assurance for outsourced XML databases,
together with some experimental results from an implementation of the method.
TABLE OF CONTENTS
ACKNOWLEDGEMENT
ABSTRACT
Chapter 1 INTRODUCTION
1.1 Data Confidentiality
1.2 User Privacy and Data Privacy
1.3 Query Assurance
1.4 Remarks
Chapter 2 RELATED WORK
2.1 Concepts
2.2 Approaches using digital signatures
2.3 Approaches using special data structures
2.4 The Challenge-Response approach
2.5 Approaches based on problem-specific characteristics
2.6 Query assurance for tree-structured data
2.7 Remarks
Chapter 3 XML DATA
3.1 Storage model
3.2 Indexes for XML documents
Chapter 4 QUERY ASSURANCE
4.1 Method
4.2 Nested B+ Tree
4.3 Selection operation
4.4 Data update operations
Chapter 5 ANALYSIS
Chapter 6 EXPERIMENTS
Chapter 7 CONCLUSION
Chapter 8 APPENDIX
8.1 XML storage structure
8.2 Labeling algorithm
8.3 Experimental program
8.4 Schema of the mondial.xml document
8.5 Query execution plan
8.6 Summary of related research
8.7 Related paper
Chapter 1
INTRODUCTION
Information is a very important resource in every organization. Managing and processing
information effectively has long attracted wide attention. With the advent of the
electronic computer and the personal computer (PC), computer science ushered in a new
era, the information age, which has strongly affected every area of life.
Data is stored in databases, which are usually kept inside the organization (in-house
databases). This requires each organization to invest in managing its database system:
hardware (machines, network), software (the database management system, DBMS, and
specific application programs) and personnel (network administrators, database
administrators). As society in general and the organization in particular grow, storage
and processing needs keep increasing and become more complex. These demands raise
the total cost of management. Although hardware prices have dropped considerably,
software license fees and the cost of highly skilled administrators needed to run
increasingly complex information systems remain a real concern in the organization's
total cost of ownership. This matters especially for small and medium-sized
organizations and for non-profit organizations.
In recent years, rapid progress in networking and communication technology has
produced high-speed, high-bandwidth networks and given birth to the concept of
"application as a service". Users only pay a small fee to the service provider to use
new software, without having to care about license, installation and maintenance costs.
Alongside this, another service has gradually taken shape: "database as a service".
It offers users a place to store and access data at low cost, without having to buy
equipment or maintain dedicated staff. This significantly reduces the cost of
information management for organizations.
Figure 1.1. The Database as a Service model.
In the database as a service model, the data owner (DO) places its database at a service provider (SP) so that clients (queriers; C, Q) can perform operations on the database such as select, insert and update. The model is also called outsourced database services (ODBS).
Information is an important asset of an organization. Placing the database that holds
this information at an untrusted location outside the organization (the service provider)
raises security issues, and these issues determine whether outsourced database services
(ODBS) are feasible at all. Outsourced databases must be kept safe from access by
unauthorized organizations or individuals, including the service provider. The service
provider itself therefore becomes the most dangerous party with respect to data
security, because an intruder from outside can, at most, obtain the same access to the
system as the provider. For this reason, research focuses mainly on preventing
intrusion by the service providers (SP) themselves.
Fundamentally, database security at the SP can be divided into four areas [1]:
Data confidentiality: the data owner (DO) does not want unauthorized parties,
including the SP, to be able to access its database.
User privacy: information is a commodity and may therefore be sold to other
companies. Client companies do not want to reveal what information they retrieve,
not even to the DO or the SP.
Data privacy: the DO does not want its clients to retrieve more information than
they are permitted to.
Query assurance: the client must be assured that the data it receives is correct,
complete and the most recent version of the original database supplied by the DO,
without any unintended modification.
Table 1.1. Security issues in ODBS.
While meeting the security requirements, we must also pay attention to query
performance and to the extensibility of the database (scalability, usability).
To ensure data confidentiality, data is encrypted before being outsourced.
However, this makes query processing over encrypted data more complex, and the
other security requirements must still be met.
Figure 1.2. The ODBS model.
In the ODBS model, the data owner places its database on external servers (SP) and performs storage queries over a secure network channel. Clients pay the data owner for the right to access the data, and access the data directly at the SP, also over a secure channel.
In practice, it is not always necessary to satisfy all of the security requirements above.
Depending on the situation, some requirements can be dropped to reduce complexity
and improve system performance. Returning to the ODBS model, there are four
security models [1].
- UP-DP model (User privacy, Data privacy): the DO is also the service provider
SP. The DO sells information from its database to other clients. This is the
traditional in-house database model, so only user privacy and data privacy are
of concern.
- UP-nDP model (User privacy, non Data privacy): similar to the model above,
except that the data sold is public and does not need to be protected. Only
what users retrieve from the database needs to be hidden.
- DC-UP model (Data confidentiality, User privacy): the DO is also the only
client of the system. This is a fairly common model: a company hires a service
provider to store its internal data and accesses that database itself. Only data
confidentiality and user privacy are considered.
- DC-UP-DP model: the complete and most complex model. The DO hires a service
provider to store its database and, at the same time, sells information to other
clients. The DO needs data confidentiality and data privacy, while users need
user privacy.
In all of these models, query assurance is always an issue that must be considered.
The next part reviews the research and results related to the security issues in
Table 1.1.
1.1 Data Confidentiality
Data confidentiality requires that the database cannot be accessed illegitimately, not
even by the SP. To achieve this, the database is usually encrypted before being
outsourced. However, encryption makes querying more complex and strongly affects
database performance. It is therefore essential to choose an encryption scheme that
balances security needs against performance requirements. At present, symmetric
private-key encryption is commonly used, for example Rijndael, DES and TripleDES.
Executing queries over encrypted data
Hacigümüş [5] proposed a solution for executing a query tree over encrypted data.
The main idea is to split the query into two parts: one part is executed at the server,
the other at the client.
Kenny C.K. Fong [6] pointed out five serious security flaws in this method:
1. It is not semantically secure. Semantic security fails if one can find two
messages m0 and m1 such that, given a ciphertext, one can guess whether it
encrypts m0 or m1 with probability noticeably greater than 1/2.
2. If the domain of a data field is small and discrete, the hash function used by
the algorithm is not secure.
3. The algorithm does not guarantee the authenticity of the returned query result.
4. The query is not perfectly hidden. The server can guess the type of query the
user is executing.
5. Encryption is per record, so decryption is also per record. Users may therefore
learn more information than they are permitted to (data privacy is not satisfied).
Searching encrypted XML data
R. Brinkman [9] introduced a way to search for matching tags in an encrypted XML
document. It builds on the Linear Search Strategy for Full Text Documents (1) and is
called the Tree Search Strategy for XML Documents (2).
Algorithm (1) has three phases: storage, search and retrieval. In the storage phase,
the data is divided into small fixed-size blocks, which are encrypted before being
stored on the server. The DO must record some information about the encryption in
order to decrypt later, so the algorithm only suits the DC-UP model. In the search
phase, the search string is encrypted and sent to the server, which matches it against
the data blocks to locate the encrypted result. In the retrieval phase, the encrypted
result is decrypted using the information recorded during the storage phase.
Algorithm (2) is built on algorithm (1), but the data is an XML document rather than
an unstructured text file. Blocks are no longer of equal size: each node is one block,
so block size depends on node size. Algorithm (2) only answers queries that match tag
names in the XML document and does not process the content inside a node.
1.2 User Privacy and Data Privacy
Database users require the system to guarantee user privacy and data privacy for both
their queries and the returned results. Neither the SP nor the DO may learn this
information.
On the other hand, users may only query what they are permitted to. The returned
result is limited to the information they requested, and users must not see data
outside their authorization.
PIR-like protocols
To achieve user privacy, Chor introduced the PIR (private information retrieval)
protocol. In theory, PIR lets users hide both the query and the returned result.
However, the database must be replicated at several sites; otherwise the cost is very
high (the client may have to fetch the entire database). Even with replication, the cost
is large enough that PIR cannot be applied in practice.
PIR only supports read-only access. Ostrovsky et al. extended PIR to support data
updates while preserving user privacy, yielding the PIS (private information storage)
protocol.
To make PIR/PIS practical, Asonov refined PIR into the RIR (repudiative information
retrieval) protocol. RIR relaxes some security constraints to reduce I/O cost while
still ensuring user privacy. Similarly, the RIS protocol refines PIS at lower cost and
with better feasibility.
All of these protocols are built on PIR and are therefore called PIR-like protocols.
However, they only support user privacy and do not ensure data privacy. Gertner
developed a protocol, built on top of any PIR-like protocol, that satisfies both data
privacy and user privacy; it is called SPIR (symmetrically private information
retrieval).
Tree-structured data
Lin and Candan [2] proposed an algorithm that lets users hide both the data and the
queries on tree-structured data. It relies on two techniques: redundancy access and
node swapping.
- Redundancy access: when a user accesses a data node, the system returns m
nodes, m-1 of which are random, so that the server cannot tell which node the
user really accessed. However, if a node is accessed frequently, such as the
root node, the server can intersect the redundancy sets to identify it.
- Node swapping: after a node has been accessed, it is moved to a different
location. To do this, one of the m-1 random nodes is an empty node; the node
that was read is swapped with this empty node and written back to the
database. This technique solves the problem that redundancy access alone
suffers from.
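The two techniques can be sketched together in a few lines. This is only an illustration under simplifying assumptions, not the algorithm of [2]: the storage is a flat list of slots where `None` marks an empty node, the function name `oblivious_access` and its return shape are hypothetical, and encryption, parent-pointer updates and empty-node bookkeeping are left out.

```python
import random


def oblivious_access(storage, target, m, rng=random):
    """Read slot `target` while touching m slots in total.

    storage: list of node contents; None marks an empty slot (assumed layout).
    Returns (content, new_slot, touched_slots).
    """
    empties = [i for i, v in enumerate(storage) if v is None]
    empty = rng.choice(empties)            # the one empty node in the set
    others = [i for i in range(len(storage)) if i not in (target, empty)]
    # Redundancy access: target + empty slot + (m - 2) random decoys.
    redundancy_set = [target, empty] + rng.sample(others, m - 2)
    rng.shuffle(redundancy_set)            # hide which position is the real one
    # Node swapping: move the accessed node into the empty slot.
    content = storage[target]
    storage[empty], storage[target] = content, None
    return content, empty, redundancy_set
```

Because the accessed node changes location after every read, intersecting the redundancy sets of repeated accesses no longer singles out one fixed slot.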
[2] also identifies five problems that must be solved, together with their solutions:
1. Managing the list of empty nodes. A special node, snode, manages the eheads
and etails so that the list of empty nodes is known.
2. How to choose the random nodes of the redundancy set.
3. Preserving the parent-child relationship of a swapped node. Lin and Candan
propose two solutions: (1) decide, at the parent node, which empty node the
child will be swapped into, and update the link in the parent before reading the
child; (2) remember the path from the root to the node being accessed and
perform the swaps bottom-up afterwards. [2] shows that solution (1) is feasible
while (2) is not.
4. Problems caused by concurrent data access. In [2], the authors propose locking
nodes during access and adjust the algorithm so that it cannot deadlock.
5. How to choose reasonable values for the security parameters (m, the size of
the redundancy set, and s, the size of a node).
Lin and Candan built the "oblivious traversal algorithm" and proved in [2] that, if
node access frequencies are uniformly distributed, the algorithm achieves user privacy.
For the case where node access frequencies are not uniform (which is common in
practice), Lin and Candan present several solutions in [4]: dummy node access,
replication of frequently accessed nodes, and the clique approach. However, [4] also
points out their weakness: the number of dummy node accesses, the number of
replicated nodes or the size of the redundancy set must be quite large.
From this, [4] builds a solution that achieves privacy under non-uniform access
frequencies while avoiding these limitations, called "clustering node accesses into
uniform chains".
Two "extreme protocols"
The solution of Lin and Candan in [2] only suits the DC-UP model. It also has several
limitations [3, 22]:
- It does not clearly specify how the list of empty nodes is updated, nor how
empty nodes are reused.
- It does not directly support insert and delete operations, in particular when
nodes become over-full or under-full.
- The redundancy set contains only one empty node, so over-full and under-full
cases cannot be handled.
[1] proposes two extreme protocols for the DC-UP and DC-UP-DP models.
DC-UP model
DC + UP = Encryption + PIR protocol
When the database must be updated, the PIR protocol is replaced by the PIS protocol.
To make the scheme practical, the PIR/PIS protocols are replaced by the RIR/RIS
protocols.
DC-UP-DP model
[1] proposes using K, a trusted third party, as a bridge between the clients and the
service provider. The DC-UP-DP model then reduces to the DC-UP model, which has
already been solved. However, because information is extremely sensitive, finding
such an organization in practice is very difficult.
Oblivious operations on dynamic outsourced search trees
As discussed above, the algorithm of Lin and Candan in [2] does not support
insert/delete operations and has several other limitations [3, 22].
[3, 22] address these limitations and also propose algorithms that support
insert/delete.
However, the oblivious insert algorithm supports only the B+-tree and not other
special tree structures such as SH-trees, UB-trees, R+-trees and kd-trees [3, 22]. The
oblivious delete algorithm can be extended to these special trees, but it cannot be
applied when the tree structure allows under-full nodes. A modified version of the
algorithm handles under-full nodes well, but it applies only to the B+-tree.
1.3 Query Assurance
Query assurance guarantees that the query result returned by the server is correct
(correctness), complete (completeness) and up to date (freshness).
- Correctness: the returned results are taken exactly from the database, or
derived from it (average, sum, and so on), without being altered.
- Completeness: the returned query result is complete, with nothing omitted for
any reason (the server did not fully execute the query, did not return the whole
result set, or data was lost in transit).
- Freshness: the result returned by the server reflects the latest data updated by
the DO. Freshness matters when the outsourced data can change.
Einar Mykletun [7] proposes a solution that ensures correctness for read-only queries
without aggregate computation (such as SUM or AVERAGE). Each record is stored
together with its digital signature, and results are returned with their signatures. The
client checks the data against the accompanying signatures to confirm correctness.
However, the number of returned records can be large, so verifying one signature per
record wastes time and places a heavy cost on the client. To solve this, [7] proposes
the Condensed-RSA scheme: instead of checking the signature of each record
individually, the client checks all records at once against a single condensed signature
returned by the server.
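[7, 14] also present another way to achieve correctness: the Merkle Hash Tree
(MHT). An MHT is a tree whose leaves are the hashes of the corresponding records in
the database; each internal node holds the hash of its children, and the root is signed
with a standard digital signature. If the two records bounding the result are also
returned, the client can prove that the returned result is complete.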
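A minimal sketch of how a client could check one record against an MHT root is given below. It assumes SHA-256 as the hash function, a binary tree in which an unpaired node is hashed with itself, and a non-empty record list; the helper names are hypothetical, and the owner's signature over the root is omitted (the root is simply assumed to be trusted).

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(records):
    """Return all tree levels, from leaf hashes up to the single root hash."""
    levels = [[H(r) for r in records]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:                 # odd level: pair the last node with itself
            cur = cur + [cur[-1]]
        levels.append([H(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def proof(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index              # node was paired with itself
        path.append((level[sibling], index % 2 == 0))  # (hash, sibling_is_right)
        index //= 2
    return path


def verify(record, path, root):
    node = H(record)
    for sibling, sibling_is_right in path:
        node = H(node + sibling) if sibling_is_right else H(sibling + node)
    return node == root
```

The server only has to ship a logarithmic number of sibling hashes per record, and the client needs one signature check (on the root) regardless of the result size.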
The MHT approach requires storing a dedicated data structure for query assurance.
Each such structure usually covers only one attribute, so a database with several
searchable attributes needs several structures, which can increase storage overhead at
the server. Maithili Narasimha [10, 21] proposes a new approach based on signature
chaining. The signature of a record also covers the content of the record immediately
preceding it (in the order of a given attribute), so the records form a continuous
chain. The server returns two additional boundary records with the result, which is
enough to guarantee both correctness and completeness. The approach of [10, 21]
does not require much extra storage on the server: each record only stores one
additional signature. A signature is typically 128 bytes for RSA (64 bytes for BGLS).
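The chaining idea can be illustrated with the sketch below. It is not the DSAC construction of [10, 21]: an HMAC stands in for the owner's public-key signature so that the example runs with the standard library only (a real client would verify with the owner's public key and would not hold `OWNER_KEY`), each link covers the hashes of a record and its predecessor, and the sentinel records, helper names and tuple layout are assumptions made for the illustration. Keys are assumed to be unique.

```python
import hashlib
import hmac

OWNER_KEY = b"demo-owner-key"            # hypothetical key, stand-in for a signing key
NEG_INF, POS_INF = float("-inf"), float("inf")


def encode(record):
    return repr(record).encode()


def sign_link(record, predecessor):
    """'Signature' binding a record to the record that precedes it."""
    msg = hashlib.sha256(encode(record)).digest() + hashlib.sha256(encode(predecessor)).digest()
    return hmac.new(OWNER_KEY, msg, hashlib.sha256).digest()


def outsource(records):
    """Owner side: sort by key, add sentinels, attach chained signatures."""
    chain = [(NEG_INF, None)] + sorted(records) + [(POS_INF, None)]
    sigs = [None] + [sign_link(chain[i], chain[i - 1]) for i in range(1, len(chain))]
    return chain, sigs


def range_query(chain, sigs, lo, hi):
    """Server side: matching records plus one boundary record on each side."""
    idx = [i for i, (k, _) in enumerate(chain) if lo <= k <= hi]
    if idx:
        first, last = idx[0] - 1, idx[-1] + 1
    else:
        last = next(i for i, (k, _) in enumerate(chain) if k > hi)
        first = last - 1
    return chain[first:last + 1], sigs[first + 1:last + 1]


def verify_range(result, link_sigs, lo, hi):
    """Client side: check the links, the two boundaries and the inner keys."""
    if len(result) < 2 or len(link_sigs) != len(result) - 1:
        return False
    links_ok = all(
        hmac.compare_digest(sign_link(result[i], result[i - 1]), link_sigs[i - 1])
        for i in range(1, len(result))
    )
    bounds_ok = result[0][0] < lo and result[-1][0] > hi
    inner_ok = all(lo <= k <= hi for k, _ in result[1:-1])
    return links_ok and bounds_ok and inner_ok
```

If the server drops a record from the middle of the range, the link of the following record no longer matches its claimed predecessor, so the omission is detected.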
However, the cost of building, generating and verifying signatures can be
considerable: signature operations are typically 100 to 1,000 times slower than
hashing. [15] proposes a solution based on the Embedded Merkle B-tree (EMB) that
ensures correctness, completeness and freshness. Query assurance then relies mainly
on hashing, which reduces both the time needed to recompute signatures when the
database changes and the time needed to verify returned results. [15] is also the first
solution that addresses all aspects of query assurance.
Radu Sion [8] presents a new approach that ensures completeness for the results of a
batch of queries. The approach builds a protocol that extends the ringer protocol.
Challenge tokens are sent along, randomly interleaved with the queries to be executed.
The client knows the results of these challenge queries in advance and compares them
with the results returned by the server. If they match, the result returned by the
server is taken to be complete.
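The following sketch shows only the interleaving idea described above, not the actual protocol of [8]. It assumes that the client already holds the expected answers for a few challenge queries, that queries are hashable values, and that the server is modeled as a plain function; `run_batch` is a hypothetical name.

```python
import random


def run_batch(server, queries, challenges, rng=random):
    """Send real queries mixed with challenge queries; reject on any mismatch.

    challenges: dict mapping a query to its result, known to the client in advance.
    Returns the answers to the real queries, or None if a challenge fails.
    """
    batch = list(queries) + list(challenges)
    rng.shuffle(batch)                   # the server cannot tell challenges apart
    answers = {q: server(q) for q in batch}
    if any(answers[q] != expected for q, expected in challenges.items()):
        return None
    return {q: answers[q] for q in queries}
```

The guarantee is probabilistic: a server that skips only some queries is caught when one of the skipped queries happens to be a challenge, so the detection rate grows with the share of challenges in the batch.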
1.4 Remarks
The sections above gave an overview of the research and current results in ODBS.
These results are summarized in tree form in the appendix. From them, the following
observations can be drawn:
- Data confidentiality can be achieved easily by encrypting the data before
outsourcing it.
- User privacy and data privacy over encrypted data have been studied
extensively for different kinds of data (XML, RDB), with very promising
results that can be applied in practice.
- Query assurance on ODB: although there are many studies on query assurance,
the results are still limited compared with those in the other areas. Current
research can provide correctness, completeness and freshness for query
execution. However, almost none of the existing approaches directly addresses
query assurance on XML databases. Because of their particular nature, XML
databases require specific adjustments to ensure query assurance.
Based on the material reviewed above, we identified the following directions for
further research:
1. Research that builds protocols to hide users' activity when they retrieve
information (their identity, what they query, what is returned) runs against
the principle of non-repudiation in systems. Especially for confidential, highly
sensitive information systems, this may give computer criminals an opening
for malicious actions. Should we build a protocol that ensures privacy but can
still, when necessary, prove who retrieved which information? [1]
2. Most current research focuses on the DC-UP model. Although [1] presents an
extreme protocol that supports the DC-UP-DP model, this protocol requires a
trusted third-party server K in order to convert the DC-UP-DP model back to
DC-UP. Removing K is also a problem worth studying. [1]
3. The algorithms that support oblivious operations (insert/delete) are still
incomplete in some respects. The insert algorithm does not support special
trees such as the SH-tree, UB-tree, R+-tree and kd-tree. The original delete
algorithm can support these trees, but if the tree structure allows under-full
nodes, its modified version supports only the B+-tree [3, 22]. Another concern
is that these algorithms are proven to ensure privacy only when node access
frequencies are uniformly distributed. In the real world, these frequencies are
not uniform and differ considerably between nodes [4].
4. Hacigümüş's strategy for executing SQL over encrypted data still has many
security flaws [6]. Studying and fixing them remains worthwhile.
5. Query assurance for results returned by the server currently covers only
simple queries [7, 8] and supports only the DC-UP model. Handling update
queries and more complex queries requires further effort [8].
Among these directions, (1) has been studied and developed extensively and applies
well in practice. (2) is a very hard problem; most current protocols avoid this model
because of its complexity. Given the limited time and our still insufficient knowledge
of security and ODB, and because this problem is almost unrelated to XML databases,
which are the subject of this thesis, this document does not address it. (3) and (4) are
only very small aspects and have largely been solved. For (5), as discussed, there
are many studies but the results remain limited, and no study has addressed query
assurance for XML databases.
Within its scope, this document therefore presents an approach to the query assurance
problem in XML databases. The next part presents the related results in the field of
query assurance in more detail.
Chapter 2
RELATED WORK
2.1 Concepts
As discussed, query assurance is a security requirement that must be considered in
almost all ODBS models. Query assurance can be defined through three properties that
must hold: correctness, completeness and freshness.
Solving all aspects of query assurance thoroughly is still a hard problem that requires
more effort. Several different approaches to the problem have emerged:
- Using digital signatures to certify that each record returned by the server is
correct and has not been altered by the server or in transit [7]. This method
ensures correctness only, not the other two requirements. Some recent studies
extend the use of digital signatures to address completeness as well [10, 21].
However, they still cannot meet the third requirement of query assurance
(freshness).
- Using specialized data structures to solve correctness and completeness. The
Merkle Hash Tree (MHT) is a typical structure of this kind [11, 14].
- Applying results from distributed computing [13] to the requirements. One
result in this direction is the use of a challenge-response model to ensure
completeness when processing a batch of queries. The advantage of this
approach is that it can handle any type of query [8]. However, as discussed, it
only applies to batches of queries and is therefore not suitable for processing
single queries.
- Exploiting the specific nature of the data and of the queries to develop a
dedicated protocol that meets the query assurance requirements. Querying
encrypted data by keywords is one example [12].
The next part presents these approaches in more detail.
2.2 Approaches using digital signatures
Using digital signatures to prove the correctness of data is a solution currently in
use [7]. In the unified client model, there is a single client that is also the DO, so
the digital signature can be replaced by a one-way hash function that cannot be
inverted in linear time [8].
Data can be authenticated at different levels of granularity: a table (the whole
relation), a column (an attribute of the relation) or a row (a record). Table-level
authentication requires the entire table to be returned before it can be verified. This
is not feasible, because most queries return only part of a table (some of its rows).
The same holds for column-level authentication. Row-level authentication can
therefore be regarded as the best choice (see footnote 1). Each row then stores, in
addition to the data of the relation, the signature of that row.
The design of an authentication protocol must take the following factors into
account [7]:
- Computation at the client: the cost for the client to determine the correctness of
a record.
- Bandwidth to the client.
- Computation at the server: retrieving and returning the verification
information for a query.
- Computation at the data owner: the cost of computing the verification
information before storing it in the database.
- Storage space required on the server.
1 Another option is field-level authentication. However, this leads to both storage overhead and computational overhead during authentication (because verifying a digital signature is also rather slow).
Of these, the first three factors deserve the most attention [7]. For a
dynamic outsourced database, however, the fourth factor must also be considered
because data is updated. The fifth factor is hardly critical, since
high-capacity storage devices keep getting cheaper.
The signature scheme most commonly chosen today is RSA with a 1024-bit
signature (according to assessments at the time of writing, RSA-1024 could
remain secure for a few decades). However, when a query returns many rows,
sending one signature per row wastes bandwidth as well as the computation time
needed to authenticate the data. One solution is the Condensed-RSA scheme, an
aggregate signature scheme. Given t messages {m1, ..., mt} with the
corresponding signatures {$\sigma_1, \dots, \sigma_t$}, the Condensed-RSA
signature is computed as
$\sigma_{1,t} = \prod_{i=1}^{t} \sigma_i \pmod{n}$.
Verifying $\sigma_{1,t}$ is then equivalent to verifying the t individual
signatures $\sigma_i$. Another benefit is that a Condensed-RSA signature has
the same size as a standard RSA signature. Thus, instead of returning the
signatures of all individual rows, the server only computes the Condensed-RSA
signature and returns it so that the client can authenticate the data.
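The aggregation step can be illustrated with a minimal sketch. It assumes textbook RSA without padding, a toy modulus built from two Mersenne primes, and SHA-256 as the message hash; the names `sign`, `condense` and `verify_condensed` are hypothetical and chosen only for this illustration, and the parameters are far too weak for real use.

```python
import hashlib

# Toy parameters (assumption): two Mersenne primes, NOT a secure RSA modulus.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)

def h(msg: bytes) -> int:
    """Hash a message to an integer modulo N (textbook full-domain hash)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

def sign(msg: bytes) -> int:
    """Per-row RSA signature: sigma_i = h(m_i)^d mod n."""
    return pow(h(msg), D, N)

def condense(signatures) -> int:
    """Condensed signature: product of the individual signatures mod n."""
    product = 1
    for s in signatures:
        product = (product * s) % N
    return product

def verify_condensed(msgs, sigma: int) -> bool:
    """Check sigma^e == product of h(m_i) (mod n) for all returned rows."""
    expected = 1
    for m in msgs:
        expected = (expected * h(m)) % N
    return pow(sigma, E, N) == expected

rows = [b"row-1", b"row-2", b"row-3"]
sigma = condense(sign(r) for r in rows)
print(verify_condensed(rows, sigma))
```

The client receives the rows plus a single value `sigma`, so the signature overhead stays constant regardless of how many rows are returned.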
Maithili and G. Tsudik [10, 21] propose a new approach that provides security
and efficiency for basic queries without requiring any additional complex data
structure. This approach is called Digital Signature Aggregation and Chaining
(DSAC). To achieve correctness, [10, 21] reuse the approach described in [7].
The rest of this section therefore presents only the mechanism used to achieve
completeness.
Completeness
Completeness is achieved by building a secure link between the signatures of
the individual records, called a signature chain. The chain is obtained by
changing how each record's signature is computed:
$Sign(r) = h(h(r) \| h(IPR_1(r)) \| \dots \| h(IPR_l(r)))_{SK}$
where h() is a cryptographic hash function (such as SHA), $IPR_i$ is the
immediate predecessor record along dimension i, l is the number of dimensions
that can be queried, and SK is the private key of the data owner.
The immediate predecessors of each record are determined by sorting the
relation R along each queryable dimension, as shown in the following figure:
Figure 2.3. Sorting relation R along the query dimensions.
The immediate predecessors of R5 are, in order, R6, R2 and R7. The signature
of R5 is then computed as: $Sign(R5) = h(h(R5) \| h(R6) \| h(R2) \| h(R7))_{SK}$.
In principle, completeness is proved in the same way as in AuthDS. That is, to
prove the result of a query, the server returns the signature chains of the two
boundary records of the result together with the signature chains of the two
records adjacent to those boundaries. From these the client can verify that
the returned result is complete.
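A minimal single-dimension sketch of signature chaining follows. It makes several assumptions that are not in the source: an HMAC stands in for the owner's private-key signature (plausible only in a unified client model), record values are distinct integers, and two sentinel records bracket the relation. All function names are hypothetical.

```python
import hashlib
import hmac
from bisect import bisect_left, bisect_right

MIN_KEY, MAX_KEY = -2**63, 2**63  # sentinel records bracketing the relation

def h(value: int) -> bytes:
    return hashlib.sha256(str(value).encode()).digest()

def link_sig(key: bytes, value: int, predecessor: int) -> bytes:
    # Stand-in for Sign(r) = h(h(r) || h(IPR(r)))_SK with one dimension.
    return hmac.new(key, h(value) + h(predecessor), hashlib.sha256).digest()

def owner_sign(key: bytes, values):
    """Sort the records and bind each one to its immediate predecessor."""
    ordered = [MIN_KEY] + sorted(values) + [MAX_KEY]
    sigs = {v: link_sig(key, v, p) for p, v in zip(ordered, ordered[1:])}
    return ordered, sigs

def server_answer(ordered, sigs, lb, ub):
    """Return the matching records plus one boundary record on each side."""
    lo = bisect_left(ordered, lb) - 1   # record just below the range
    hi = bisect_right(ordered, ub)      # record just above the range
    chain = ordered[lo:hi + 1]
    return chain, [sigs[v] for v in chain[1:]]

def client_verify(key, lb, ub, chain, chain_sigs) -> bool:
    """Accept only if the chain is unbroken and brackets [lb, ub]."""
    if len(chain) < 2 or len(chain_sigs) != len(chain) - 1:
        return False
    if not (chain[0] < lb and chain[-1] > ub):
        return False
    if any(not lb <= v <= ub for v in chain[1:-1]):
        return False
    return all(hmac.compare_digest(link_sig(key, v, p), s)
               for p, v, s in zip(chain, chain[1:], chain_sigs))

key = b"owner-secret"
ordered, sigs = owner_sign(key, [9, 3, 6, 5])
chain, chain_sigs = server_answer(ordered, sigs, 4, 7)
print(chain, client_verify(key, 4, 7, chain, chain_sigs))
```

If the server drops a record from the middle of the result, the signature of the next record no longer matches its claimed predecessor, so the check fails.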
2.3 The specialized data structure approach
Another way to satisfy the requirements of query assurance is to use
specialized data structures that store information supporting the guarantee of
correctness as well as completeness.
The idea of this approach is illustrated as follows [14]:
Figure 2.4. Query proof model.
The summary signature is computed recursively, bottom-up, by hashing over the
whole index tree (B-tree) built on all records of a relation. This value is
signed with sk0. User queries are executed by the publisher, which returns the
result together with another data structure called the verification object,
used to prove that the returned result is correct and complete.
This approach has the following properties [14]:
- The user only needs to trust the owner's key pk0. The owner recomputes the
summary signature only when the database is updated or changed, so the private
key sk0 can be kept entirely offline, which protects it from network attacks.
The key can also be implemented in hardware.
- The user does not need to trust the publishers. If a publisher is
compromised, the only consequence is the loss of the service that publisher
provides.
- The size of the verification object is linear in the size of the query
result and logarithmic in the size of the database.
- The verification object guarantees that the answer is correct and complete.
- The costs of computing the summary signature, building the verification
object (VO) and checking the VO are acceptable.
A typical structure in this approach is the Merkle Hash Tree (MHT) [11]. An
MHT is built over the sorted set of values x1, x2, ..., xn of one attribute of
a relation. Each leaf of the tree is associated with a value xi and stores
h(xi), where h() is a one-way hash function such as MD5 or SHA-1. Each
internal node stores the hash of the concatenation of the values of its child
nodes. If v has two children v1 and v2, the value of v is h(v1||v2). Finally,
the value at the root node is authenticated by a digital signature.
Correctness
To prove the correctness of a query result, the server returns a VO containing
the co-path of the returned node. The co-path of a node is the set of other
nodes from which the value of the root can be computed. Because the content of
the root is signed, comparing it with the computed value lets the server prove
that its answer is correct. In the MHT below, when the query result is node 5,
the server additionally returns nodes h1 and h34. From these two nodes the root
is easily computed, as the figure shows.
Leaves (sorted values): 3, 5, 6, 9, with leaf hashes h1, h2, h3, h4
h12 = h(h1||h2), h34 = h(h3||h4)
root = h(h12||h34)
Figure 2.5. Binary Merkle Hash Tree.
With the result {5}, the server also sends {h1, h34, sign(root)}. The client can then compute h12 = h(h1||h(5)) and root = h(h12||h34). By checking the computed root against the signature of the root, the client is assured that the returned result is correct.
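The co-path check can be sketched in a few lines. This is an illustrative sketch only: it assumes SHA-256, a leaf count that is a power of two, and it omits the signature on the root; `build_levels`, `co_path` and `root_from` are hypothetical names.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf_hash(value) -> bytes:
    return h(str(value).encode())

def build_levels(values):
    """Return all tree levels, leaves first (len(values) must be a power of 2)."""
    level = [leaf_hash(v) for v in values]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def co_path(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling is left?)
        index //= 2
    return path

def root_from(value, path) -> bytes:
    """Client side: recompute the root from a returned value and its co-path."""
    node = leaf_hash(value)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node

levels = build_levels([3, 5, 6, 9])
vo = co_path(levels, 1)              # co-path of the leaf holding 5: h1 and h34
print(root_from(5, vo) == levels[-1][0])
```

In a complete protocol the client would compare the recomputed root against the owner's signed root rather than against a locally built tree.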
Completeness
First consider the case in which the server answers that no record in the
database satisfies the query condition. The server must be able to prove this;
such proofs are called empty proofs. This can be done by returning the co-paths
of two adjacent nodes such that the queried range lies between the values of
those two nodes.
Completeness of an answer is achieved by attaching empty proofs for the leaf
nodes adjacent to the two boundary nodes of the query result.
A major limitation of AuthDS is that a complex data structure must be
maintained alongside the actual data. This structure must be fully computed
before it is uploaded to the server. Every update to the data incurs a
non-trivial cost to update the values in the structure [8, 10, 21]. In
addition, guaranteeing the correctness and completeness of range queries
requires building one structure per attribute and per sort order [10, 21].
2.4 The challenge-response approach
Radu Sion [8] proposes a protocol that guarantees the correctness and
completeness of queries of arbitrary form by extending the ringer protocol from
distributed computation.
The ringer protocol was introduced in distributed computation to prevent
cheating in the computation of sub-problems. It has several variants, such as
the basic ringer, the bogus ringer and the hybrid ringer (magic ringer).
Consider the following problem: recover the original text from a text encrypted
with the DES algorithm. The workstations generate candidate strings by
enumeration and apply DES to each; if the output matches the encrypted string,
the candidate is the original text being sought. The idea of the basic ringer
in [13] can be summarized as follows:
- The supervisor picks a random value xi from the domain Di assigned to
workstation i and computes yi = DES(xi). The supervisor then sends yi and y
to workstation i, where y is the encrypted string to be broken.
- Workstation i receives the domain Di and performs the computation. If the
computation is complete, workstation i is certain to find the value xi, and
possibly x (where y = DES(x)). It returns xi (possibly together with x) to the
supervisor, which thereby learns that workstation i really performed all of
its work.
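The basic ringer idea can be sketched as follows. As an assumption for illustration, SHA-256 replaces DES as the one-way function f, and the search domain is a small integer range; the function names are hypothetical.

```python
import hashlib
import secrets

def f(x: int) -> str:
    """One-way function applied to every candidate (stand-in for DES)."""
    return hashlib.sha256(str(x).encode()).hexdigest()

def supervisor_prepare(domain):
    """Pick a secret ringer x_i inside the worker's domain, publish y_i = f(x_i)."""
    ringer = secrets.choice(domain)
    return ringer, f(ringer)

def worker_scan(candidates, images):
    """Worker: report every candidate whose image is one of the given images."""
    return {f(x): x for x in candidates if f(x) in images}

def supervisor_accepts(report, ringer, ringer_image) -> bool:
    """The work is accepted only if the worker found the planted ringer."""
    return report.get(ringer_image) == ringer

domain = range(1000, 1200)
ringer, y_i = supervisor_prepare(domain)
y = f(5000)                       # the real target; not in this worker's domain
report = worker_scan(domain, {y_i, y})
print(supervisor_accepts(report, ringer, y_i))
```

A worker that skips part of its domain risks missing the ringer, and since it cannot tell `y_i` from the real target `y`, it cannot stop early once one match is found.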
To implement this, [8] organizes the data as follows. Let S be the outsourced
data. S is partitioned into segments Si, and each Si is identified by a hash
value that guarantees the data is correct and has not been altered. This value
is called the identity hash and is used to authenticate identity queries, that
is, queries that return all the data in Si.
Queries are executed as follows:
- Into the batch of queries Q = {Q1, Q2, Q3, ..., Qa} to be executed, the
querier inserts, at an arbitrary position, a query Qx whose result it knows in
advance. The querier also computes a challenge token
$\{H(\epsilon \| \rho(Q_x)), \epsilon\}$, where H() is a non-invertible one-way
hash function, $\epsilon$ is a value unique in time (a timestamp) that makes
the challenge token unique, and $\rho(Q_x)$ is the result known in advance by
the querier.
- The server's task is to execute the queries and determine the value x by
applying H() to the results, and to send x back together with the query
results.
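A minimal sketch of the challenge token exchange is shown below. It assumes SHA-256 for H, a random 16-byte nonce in place of the timestamp, and query results represented as strings; `make_challenge`, `server_execute` and the toy `table` are hypothetical.

```python
import hashlib
import secrets

def H(nonce: bytes, result: str) -> str:
    return hashlib.sha256(nonce + result.encode()).hexdigest()

def make_challenge(known_result: str):
    """Querier: token = (H(nonce || result of Qx), nonce)."""
    nonce = secrets.token_bytes(16)
    return H(nonce, known_result), nonce

def server_execute(batch, execute, challenge):
    """Server: run every query, then locate the index x matching the token."""
    digest, nonce = challenge
    results = [execute(q) for q in batch]
    x = next((i for i, r in enumerate(results) if H(nonce, r) == digest), None)
    return results, x

# Toy "database": each query maps directly to its result string.
table = {"Q1": "alice", "Q2": "bob", "Q3": "carol"}
batch = ["Q1", "Q2", "Q3"]
challenge = make_challenge("bob")        # the querier already knows Q2's result
results, x = server_execute(batch, table.__getitem__, challenge)
print(x)  # index of the query the querier had in mind
```

The querier accepts the batch only if the returned index equals the position of the query it chose; the server can find that index only by actually executing queries.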
If only one challenge token is used, the server may stop executing the other
queries (or execute them incompletely) once it has found the challenge token.
[8] also presents several ways to overcome this problem. The first is to use
several challenge tokens instead of one. Formally, however, this still does not
solve the underlying problem: the server can still stop executing the remaining
queries once it has identified all the challenge tokens. Moreover, generating
many challenge tokens is a real computational burden for thin clients
(queriers) such as mobile devices. Another solution is to use fake tokens. A
fake token is a bogus challenge token: the querier generates it at random,
without regard to any result returned by the server. The set of challenge
tokens sent to the server then contains r real challenge tokens and f fake
ones. The two parameters r and f may vary from batch to batch, so the server
cannot determine how many real challenge tokens were sent. It is therefore
forced to execute all the queries in order to find the real challenge tokens.
The methods above address only read queries (select queries), not queries that
update or insert data (update/insert queries). An update is performed by
reading the entire data segment that contains the row to be updated (or that
will contain the inserted row), applying the change, recomputing the identity
hash, and writing the segment back to the server. However, the handling of
data updates is still only a first step [8].
2.5 The problem-specific approach
It is easy to see that building a protocol that guarantees the requirements of
query assurance in the general case is extremely difficult. A newer approach
that solves the problem fairly completely is therefore to solve it for specific
kinds of data and queries.
Radu Sion and Bogdan Carbunar [12] present a protocol for keyword-based
querying of encrypted data that can guarantee privacy as well as query
assurance.
Documents are stored as separate units (files), and each document has a fixed
number of keywords. The keywords of each document are enumerated before the
document is uploaded to the server. The problem is to retrieve documents by
one or more given keywords.
Each document is assigned a long, random, unique identifier that is unrelated
to its content. Because the set of keywords is fixed in advance, for each
keyword a set of identifiers of the documents containing that keyword is built,
called a KDS (keyword document set). The KDSs can be kept at the queriers
themselves or placed on the server; in the latter case they must be encrypted
and authenticated.
Each KDS can be encrypted with a different key, computed as
$Key_i = H(key \| k_i)$, where H() is a one-way hash function, key is the
shared key, and $k_i$ is the keyword corresponding to $KDS_i$. To prevent the
server from removing an entry of a KDS, each KDS is extended with a check value
computed as $H_{check} = H(d_4 \| H(d_3 \| H(d_2 \| H(d_1 \| 0))))$, assuming
KDS = {d4, d3, d2, d1}.
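The per-keyword key derivation and the chained check value can be sketched as below. SHA-256 and plain byte concatenation are assumptions made for this illustration, as are the helper names `kds_key` and `h_check`.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    """One-way hash over the concatenation of its arguments."""
    return hashlib.sha256(b"".join(parts)).digest()

def kds_key(shared_key: bytes, keyword: str) -> bytes:
    """Key_i = H(key || k_i): a distinct encryption key per keyword set."""
    return H(shared_key, keyword.encode())

def h_check(doc_ids) -> bytes:
    """Fold the identifiers d1, d2, ... into H(dn || ... H(d2 || H(d1 || 0)))."""
    acc = b"0"
    for d in doc_ids:
        acc = H(d, acc)
    return acc

kds = [b"d1", b"d2", b"d3", b"d4"]
check = h_check(kds)
# A server that silently drops an entry cannot reproduce the check value.
print(h_check([b"d1", b"d2", b"d4"]) == check)
```

Because every identifier is folded into the chain, removing or reordering any entry changes the final value that the querier compares against.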
One way to store the KDSs is as a matrix whose columns are the identifiers of
all documents and whose rows are all keywords; each cell holds 0 or 1 depending
on whether the document contains the keyword. This matrix, called C, is
transformed into C' as follows:
$C'_{i,j} = last\_bit(F(k_i, R_j, C_{i,j}))$
where F is a bitwise pseudo-random function and $R_j$ is a random number
produced by a random number generator G with a fixed random seed R.
A query = {k1, k2, ..., kq} is executed by asking the server for the rows that
correspond to the keywords. The querier regenerates the values $R_j$ in turn
and recovers the matrix as follows: if $last\_bit(F(k_i, R_j, 0)) = C'_{i,j}$
then $C_{i,j} = 0$, otherwise $C_{i,j} = 1$. Once C has been recovered, the
querier can determine exactly the list of identifiers of the documents sought,
and can then ask the server to return precisely those documents.
Because the querier knows exactly which document identifiers to fetch, the
completeness of the returned result can be checked in a natural way. To
guarantee privacy, PIR-like protocols can be used.
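The masking and recovery of one matrix row can be sketched as follows. The decoding rule only works if flipping the cell bit flips the last bit of F, so this sketch assumes F(k, R, c) is a hash of (k, R) XORed with c; that construction, the use of Python's non-cryptographic `random.Random` as the generator G, and all function names are assumptions for illustration, not the scheme of [12].

```python
import hashlib
import random

def F(keyword: str, r: int, bit: int) -> int:
    """Assumed PRF: hash of (keyword, r), XORed with the cell bit."""
    digest = hashlib.sha256(f"{keyword}|{r}".encode()).digest()
    return int.from_bytes(digest, "big") ^ bit

def last_bit(x: int) -> int:
    return x & 1

def column_randoms(seed: int, n: int):
    """R_1..R_n regenerated deterministically from the fixed seed R."""
    rng = random.Random(seed)  # not cryptographically secure; sketch only
    return [rng.getrandbits(64) for _ in range(n)]

def mask_row(keyword: str, row, seed: int):
    """Owner side: C'[i][j] = last_bit(F(k_i, R_j, C[i][j]))."""
    rs = column_randoms(seed, len(row))
    return [last_bit(F(keyword, r, c)) for r, c in zip(rs, row)]

def unmask_row(keyword: str, masked, seed: int):
    """Querier side: C[i][j] = 0 iff last_bit(F(k_i, R_j, 0)) == C'[i][j]."""
    rs = column_randoms(seed, len(masked))
    return [0 if last_bit(F(keyword, r, 0)) == m else 1
            for r, m in zip(rs, masked)]

row = [1, 0, 0, 1, 1]                 # which documents contain keyword "tv"
stored = mask_row("tv", row, seed=42)
print(unmask_row("tv", stored, seed=42) == row)
```

The server only ever sees the masked row; a querier holding the keyword and the seed recovers the original bits and hence the identifiers of the matching documents.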
2.6 Query assurance for tree-structured data
A further approach builds on the technique of Lin and Candan [2] for
guaranteeing privacy. [16] extends this method so that the three requirements
of query assurance are guaranteed; the main ideas are as follows.
- To guarantee correctness, every record has its own signature (RSA). When the
correctness of many records must be checked, the Condensed-RSA scheme can be
used.
- Completeness is guaranteed naturally, because the querier requests a
definite number of nodes. Comparing the number of nodes requested with the
number of nodes returned by the server is enough to satisfy this requirement.
In addition, to ensure that the returned nodes are exactly the ones requested,
the encrypted data of each node stores that node's own identifier (nodeID).
Since the set of requested node identifiers is known, comparing it with the
identifiers of the nodes returned by the server proves that they are indeed the
desired nodes.
- Freshness is satisfied by adding the following information:
o Each node additionally stores a timestamp indicating when the node was last updated.
o A parent node stores the timestamps of all its child nodes.
o The timestamp of the root node is published to all queriers.
Thus, as the traversal proceeds from the root down to the child nodes, the
querier can use the known timestamp to verify that the content of the root is
fresh. Using the child timestamps stored in the root, the querier can likewise
determine the freshness of those children. This propagates down to a leaf
node, whose freshness can thereby be proved.
Remarks
The approach of [16] guarantees privacy (user privacy and data privacy) by
exploiting the redundant data access and node swapping techniques of Lin and
Candan [2, 4], and can also guarantee query assurance. However, it is suitable
only for search-tree data structures such as B-Tree, B+Tree, R-Tree, and so
on, which are commonly used as indexes over other data.
2.7 Remarks
Apart from the problem-specific case presented in Section 2.5, the other
methods all essentially rely on digital signatures or on the one-way
(practically non-invertible) property of secure hash functions to prove the
correctness and completeness of query results.
The approaches of Maithili and G. Tsudik [10, 21] and of Prem Devanbu [14]
make it possible to prove precisely that the returned result is correct and
complete. At present, however, these protocols handle only simple read-only
queries without aggregate functions (such as SUM, AVERAGE, and so on). A
drawback of these methods is that they depend on the form of the query: the
query must be decomposed into separate parts so that the appropriate operations
can be applied.
The approach of Radu Sion [8] applies to all kinds of queries, including those
using aggregate functions, which [10, 14, 21] do not yet handle. Its main
advantage is that it does not need to parse the queries, which makes it easier
to deploy. However, several points about this approach still need
consideration:
- It applies only to batches of queries and does not handle the execution of
single queries, which are very common in practice. Possible remedies are:
(1) attach fake queries to turn a single query into a batch. This may overload
the server and reduce overall system performance, because too many fake queries
must be executed relative to real ones. (2) Collect single queries at a
trusted server and send them to the server as a batch, in keeping with the
spirit of the solution. This is hardly feasible, because the delay a query
incurs while waiting is unacceptable. (3) Combine (1) and (2).
- It does not fully prove that the returned result is complete. The
probability that the server does not execute, or only partially executes, the
last query is 33% [8]. This is a rather high probability, which somewhat
reduces the reliability of the solution.
All of the approaches above were developed for relational data (relational
databases). Applying them to XML databases therefore requires certain
modifications.
Chapter 3
XML DATA
XML is a semistructured, tree-structured data format. The units of information
in XML are nodes and attributes, which are distinguished by their names and
their scope (depth in the tree, parent node).
XML data is human-readable text. Before outsourcing, we therefore need to
determine a storage structure that fits the data structure of XML. This is
especially important, as it strongly affects the methods that will be applied
in query processing and query assurance.
3.1 Storage model
As in a traditional RDB, every XML document is characterized by a schema that
defines the parent-child relationships between nodes and the number of
attributes of each node. Naturally, this schema is a tree, called the schema
tree.
Each node of the XML document (XML element) corresponds to a t-node in the
logical tree, and each attribute of an XML element corresponds to an a-node in
the logical tree. Figure 3.6 shows an example of the logical data tree and the
structure tree derived from an XML document.
From the structure tree, the XML document can easily be converted to other
storage forms. This thesis presents two common methods for storing XML
documents: table-based and node-based.
[Figure 3.6: (A) logical data tree: Root with two Customer t-nodes (a-nodes Name = Bob and Name = Alice) and two Order t-nodes (a-nodes Code = 01, Amount = 5000 and Code = 02, Amount = 10000). (B) structure tree of the XML document: Root > Customer (a-node Name) > Order (a-nodes Code, Amount).]
Figure 3.6. Logical tree structure of an XML document.
In the logical data tree and the structure tree, each rounded rectangle represents a node (element) of the XML document. Each square-cornered rectangle represents an attribute of an element of the XML document.
3.1.1 Table-based storage
From the structure tree of the document, a relational schema can be derived in
the following steps.
- Label the structural t-nodes so that each node has a unique label value.
- Convert each structural t-node into a table whose name is the name of the
t-node combined with its label value. The child a-nodes of the t-node become
the columns of the table. Each table gets an additional column nodeid, the
identifier of the node in the data table. If the t-node has a parent, an
additional column pnodeid references the table generated from the parent
t-node.
Starting from the structure tree in Figure 3.6(B), we label the nodes:
Figure 3.7. The structure tree after labeling.
It is then converted to the following relational schema.
Root_01(nodeid)
Customer_02(nodeid, name, pnodeid)
Order_03(nodeid, code, amount, pnodeid)
In addition, each table is extended with columns such as timestamp and sign;
these values later help prove the results returned by queries.
Advantages
Once the XML document has been converted to a relational schema, earlier
results for relational schemas can be applied. We can use DSAC [10, 21] or the
EMB Tree [15] to guarantee query assurance.
Disadvantages
From the user's point of view, the outsourced database is an XML document, so
queries are normally expressed in an XML query language (XPath, XQuery, and so
on). A translation step from that query language to ordinary SQL is therefore
required.
Another concern is that the schema of the database may change dynamically at
any time. Changing the schema of an XML document is quite flexible, but it
forces a corresponding change to the structure of the RDB tables. This has an
adverse effect on data already stored: adding a column to a table may require
recomputing all digital signatures (if DSAC is used) and re-encrypting all the
data. This is not possible once the data has been outsourced.
Despite these drawbacks, when the schema of the XML database does not change
this method can still be applied, so as to benefit from results that have been
well studied for relational databases. In this thesis we take a different
approach, based on the second storage method: node-based.
3.1.2 Node-based storage
Another approach to storing an XML database is to store the t-nodes and
a-nodes of the logical data tree.
As in the previous method, all structural nodes (both t-nodes and a-nodes)
must first be labeled. The labeling method is the same as above, except that
a-nodes are labeled too. Storage in the relational database then uses two
tables, one for t-nodes and one for a-nodes, with the following layout:
t-node(nodeid, xtype, datatype, nameid, pnodeid, lmaid, value)
a-node(nodeid, xtype, datatype, nameid, pnodeid, sibid, value)
where (see footnote 2):
- NodeID: the identifier of the node.
- XType: distinguishes the kinds of objects.
- Datatype: identifies the type of the data.
- NameID: the identifier of the name of the node (t-node or a-node). The name
of a node is distinguished by the context in which it appears. Each name is
identified by an index that is unique across the whole database.
- PNodeID: the identifier of the parent t-node of the current t-node. Note:
for an a-node, pnodeid is the identifier of the parent t-node of the t-node
that contains the current a-node.
- LMAid: the identifier of the leftmost a-node.
- SibID: the identifier of the right sibling a-node.
Footnote 2: Besides the components above, additional information may have to be added to the t-node and a-node tables depending on the authentication algorithm used.
With this storage format, a schema change (adding or removing an attribute) is
just a simple insert/delete operation that affects only the current node. It
therefore incurs no additional cost while still meeting the stated security
requirements.
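The two record layouts and the effect of adding an attribute can be sketched with plain data classes. This is a simplified illustration: the xtype and datatype fields are omitted, the new a-node is linked in as the leftmost attribute (an assumption; the source does not say where it is linked), and `add_attribute` and `attributes_of` are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class TNode:
    nodeid: int
    nameid: int
    pnodeid: Optional[int]
    lmaid: Optional[int] = None    # identifier of the leftmost a-node
    value: Optional[str] = None

@dataclass
class ANode:
    nodeid: int
    nameid: int
    pnodeid: Optional[int]         # parent of the containing t-node
    sibid: Optional[int] = None    # identifier of the right sibling a-node
    value: Optional[str] = None

def add_attribute(t: TNode, a_nodes: Dict[int, ANode],
                  nodeid: int, nameid: int, value: str) -> ANode:
    """Schema change as a plain insert: only the owning t-node is touched."""
    new = ANode(nodeid, nameid, t.pnodeid, sibid=t.lmaid, value=value)
    a_nodes[nodeid] = new
    t.lmaid = nodeid
    return new

def attributes_of(t: TNode, a_nodes: Dict[int, ANode]) -> List[ANode]:
    """Walk the lmaid -> sibid chain to list a t-node's attributes."""
    out, cur = [], t.lmaid
    while cur is not None:
        out.append(a_nodes[cur])
        cur = a_nodes[cur].sibid
    return out

a_table: Dict[int, ANode] = {}
order = TNode(nodeid=3, nameid=3, pnodeid=2)
add_attribute(order, a_table, nodeid=10, nameid=5, value="01")
add_attribute(order, a_table, nodeid=11, nameid=6, value="5000")
print([a.value for a in attributes_of(order, a_table)])
```

No existing a-node record is rewritten when an attribute is added, which is the property the node-based layout relies on for cheap schema changes.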
1. Advantages
It reflects the tree nature of XML documents. It thus overcomes the drawback of
the table-based method: changing the structure of the XML document has little
effect on the content already stored and affects only the node being updated.
It can reuse earlier research results [2, 16] for guaranteeing the
requirements of query assurance.
2. Disadvantages
Because every t-node and a-node is stored as a separate record, the number of
records can become very large compared with table-based storage. This
increases the complexity of the database.
In addition, RDB indexing techniques are not very effective on tree-structured
data such as XML, while indexing methods dedicated to XML are still under
development.
3.1.3 Remarks
Of the two storage methods described above, the table-based method converts the
XML document into tables of a traditional RDB, so that existing query assurance
techniques [10, 11, 14, 15, 21] can be reused. It has one significant
drawback, however: when the structure of the XML document changes through the
addition of an entirely new node (or a new attribute), the structure of the
data tables changes as well. This requires a considerable amount of
computation to maintain the structures used for query assurance (re-signing
records, encrypting data, rebuilding signature chains or other complex index
structures, and so on).
Within the scope of this thesis, we adopt the node-based storage method and
propose an indexing structure dedicated to XML documents, into which we embed
information for proving correctness, completeness and freshness.
3.2 Indexing XML documents
Indexing is a very important concept in databases. It considerably speeds up
data retrieval compared with classical sequential search; in the ideal case,
indexed search is N/log2N times faster than sequential search.
In relational databases, indexes are used very effectively and widely in almost
all RDBMSs. Common index structures are hash tables, bitmaps and tree-based
index structures.
For tree-shaped semistructured databases such as XML databases, there is
currently much research on building suitable indexes [17, 18]. This thesis
does not go into the details of XML indexing methods and does not intend to
compare them; it only presents one index structure for XML documents into which
information can be embedded to serve the goal of query assurance.
Among indexing methods, tree-based indexes are widely used. A typical example
is the B+Tree, which has been applied very successfully for indexing in current
RDBMSs. By its nature, a B+Tree can index a large number of records at low
complexity (for example, a B+Tree with fanout 100 and height 4 can manage
100 x 100 x 100 = 1,000,000 records).
Chapter 4
QUERY ASSURANCE
4.1 Method
Query assurance aims to prove to the user that the query result returned by the
server is correct, complete and fresh. Correctness can be achieved fairly
easily through digital signatures. Proving the completeness of a query result
usually relies on the properties of each kind of query.
Consider two kinds of queries: range queries (a range of satisfying values)
and point queries (equality with a specific value). A point query is really a
degenerate range query whose two bounds coincide, so only range queries need to
be considered.
Consider a range query with bounds LB and UB. The result is the set of
records satisfying
$S = \{ R \mid R \ge LB, R \le UB \}$
If the records R are guaranteed to be sorted in ascending (or descending)
order, then to prove that the result is complete the server only needs to
return the two records at the two boundaries as well:
$S' = S \cup \{R_L \mid R_L = \max(R_i), R_i < LB, \forall i\} \cup \{R_U \mid R_U = \min(R_j), R_j > UB, \forall j\}$
If the server can prove that only the returned records lie between $R_L$ and
$R_U$, the result is proved to be complete.
The records therefore need to be sorted, and this order must be provable. A
simple way to achieve this is to sort R and then compute the following signed
hash value (here S denotes the signed digest rather than the result set):
$S = h(h(R_0) \mid h(R_1) \mid h(R_2) \mid \dots \mid h(R_N))_{SK}$
where h is a secure hash function such as SHA1 or MD5.
The hash value just computed is signed with an asymmetric algorithm (such as
RSA). Along with the result, the server returns the remaining h(Ri) values and
the value S. The client can recompute the hash of the records according to the
rule above and verify it against S using the public key of the signature
algorithm.
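The verification of a range result against the digest of the ordered record list can be sketched as below. For brevity the signature over the digest is omitted and the client is assumed to already hold an authentic copy of the owner's digest; SHA-256 and the function names are likewise assumptions for illustration.

```python
import hashlib
from bisect import bisect_left, bisect_right

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf(value: int) -> bytes:
    return h(str(value).encode())

def digest(leaf_hashes) -> bytes:
    """h(h(R0) | h(R1) | ... | h(RN)) over the sorted records."""
    return h(b"".join(leaf_hashes))

def answer(sorted_vals, lb, ub):
    """Server: boundary records plus results, and hashes of everything else."""
    lo = max(bisect_left(sorted_vals, lb) - 1, 0)
    hi = min(bisect_right(sorted_vals, ub) + 1, len(sorted_vals))
    window = sorted_vals[lo:hi]                 # R_L, results, R_U
    before = [leaf(v) for v in sorted_vals[:lo]]
    after = [leaf(v) for v in sorted_vals[hi:]]
    return before, window, after

def verify(owner_digest, lb, ub, before, window, after):
    """Client: return the verified result list, or None if the proof fails."""
    if digest(before + [leaf(v) for v in window] + after) != owner_digest:
        return None
    if before and not window[0] < lb:    # a lower boundary record is required
        return None
    if after and not window[-1] > ub:    # an upper boundary record is required
        return None
    return [v for v in window if lb <= v <= ub]

values = [1, 3, 5, 6, 9, 12]
owner_digest = digest([leaf(v) for v in values])
before, window, after = answer(values, 4, 7)
print(window, verify(owner_digest, 4, 7, before, window, after))
```

Note that this flat scheme ships one hash per non-returned record, so the proof grows linearly with the table; the tree-based structures discussed in this thesis reduce that cost.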
4.2 Nested B+ Tree
As described above, the method used to prove completeness essentially relies on
an ordered sequence of records and a signature over that sequence. This can be
achieved with an index structure suited to the storage scheme presented in
Section 3.1 of this thesis.
In an XML document, nodes are located by a path, that is, the route from the
root node to the current node. Queries on XML are usually specified by path.
An XML index must therefore contain, besides the value of a node, information
about the node's path.
Returning to the storage method described earlier, the nameid value is unique
for each structural node and can therefore be used in place of the path. Each
node thus needs to be indexed on the pair of attributes (nameid, value).
However, besides queries by value, the parent/child nature of tree-structured
XML data means that queries for the children of a known parent node are
frequent. To prove completeness for these queries, an additional index on the
triple (nameid, pnodeid, value) is needed so that records are sorted by parent
node.
We could build two separate index trees for these two sets of values. However,
this may complicate data updates and query proofs, and it wastes storage
(because the leading attribute, nameid, is duplicated).
[Figure 4.8: a NameTree keyed on (nameid); each of its leaves points to a ParentTree keyed on (pnodeid, value) and a ValueTree keyed on (value).]
Figure 4.8. Structure of the NB+Tree.
The combination of the three kinds of trees (NameTree, ParentTree and ValueTree) orders all attributes and elements of the XML document in two orders, (nameid, value) and (nameid, pnodeid, value), which supports both fast querying and query assurance on the XML document.
To meet this requirement, tree structures can be combined as follows. Build a
B+Tree whose comparison key is nameid, called the NameTree. Each leaf node of
the NameTree holds the roots of two B+Trees keyed on (pnodeid, value) and
(value) respectively. These two trees are called the ParentTree and the
ValueTree. Together, the three kinds of trees form a data structure called the
Nested B+Tree, which indexes an XML document on the two sets of values
(nameid, pnodeid, value) and (nameid, value).
The distribution of the data also has a considerable effect on the structure of
the tree. If the data is concentrated in one region, node splits may occur
frequently and the space utilization of the B+Tree becomes poor. To limit
this, the B+Tree introduces a redistribution operation on the data of the
affected nodes. In the NB+Tree, in addition to this property, keys are also
redistributed before a node is split or merged.
Another approach is to use a multi-dimensional index structure, such as the
R-tree family (R-Tree, R+Tree, R* Tree, X-Tree, and so on), to index the XML
document on the four attributes (nameid, prefixid, pnodeid, value). However,
these index structures do not guarantee an ascending order of the records (or
guarantee it for only one attribute) and are therefore unsuitable for proving
completeness of queries.
Nested Merkle B+ Tree (NMB+Tree)
To be able to prove query results, information similar to that of a Merkle Hash
Tree (MHT) is embedded into the NB+Tree as follows. Let H(node) be the hash of
an arbitrary node of the NB+Tree; H(node) is computed as follows:
- The node is a leaf of a ValueTree or ParentTree: $H(node) = h(A(node))$
- The node is a leaf of the NameTree: $H(node) = h(H(R(ParentTree)) \| H(R(ValueTree)))$
- The node is an internal node: $H(node) = h(H(Child_1(node)) \| \dots \| H(Child_s(node)))$, where s is the number of children of the node.
- The node is the root of the NameTree: $H(node) = h(\tau \| H(Child_1(node)) \| \dots \| H(Child_s(node)))$, where s is the number of children of the node.
Here h is a secure hash function that is collision-free and practically
non-invertible, such as SHA1 or MD5. A(node) is the t-node or a-node linked to
the leaf. R(tree) is the root node of tree. $Child_i(node)$ is the i-th child
of node. $\tau$ is a value unique in time (a timestamp), which is distributed
to all clients whenever a change occurs. The content of the NameTree root is
then signed with a digital signature algorithm.
Thus, by applying the idea of the MHT to the NB+Tree, we obtain the NMB+Tree
(Nested Merkle B+ Tree), which serves both as the index used for querying and
as the means of proving queries over the whole XML data.
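The nested hash computation can be sketched on a drastically simplified model: trees are nested Python lists whose leaves are record byte strings, the NameTree is reduced to a root directly above its leaves, SHA-256 is used as h, and the signature on the root is omitted. All of these simplifications and the function names are assumptions for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def hash_tree(node) -> bytes:
    """ValueTree/ParentTree hashing: leaf = h(record), internal = h(children)."""
    if isinstance(node, bytes):
        return h(node)
    return h(b"".join(hash_tree(child) for child in node))

def name_leaf_hash(parent_tree, value_tree) -> bytes:
    """NameTree leaf: hash over the roots of its two nested trees."""
    return h(hash_tree(parent_tree) + hash_tree(value_tree))

def name_root_hash(timestamp: bytes, leaf_hashes) -> bytes:
    """NameTree root: the timestamp is mixed in so freshness can be checked."""
    return h(timestamp + b"".join(leaf_hashes))

# One nameid entry with tiny nested trees (records are placeholder strings).
parent_tree = [[b"p1:TV", b"p1:radio"], [b"p2:TV"]]
value_tree = [[b"TV", b"TV"], [b"radio"]]
leaf_13 = name_leaf_hash(parent_tree, value_tree)
root_old = name_root_hash(b"2007-01-01", [leaf_13])
root_new = name_root_hash(b"2007-02-01", [leaf_13])
print(root_old != root_new)
```

Changing any record, or only the timestamp, changes the root hash, so a single signed root covers correctness, ordering and freshness at once.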
The next section describes how the NMB+Tree is used to execute queries and to
prove the returned results.
4.3 Selection
Selection is the most frequently used kind of query. In an RDB it is the
SELECT statement of SQL. For XML, several query languages exist, such as XPath
and XQuery. This thesis illustrates a few common forms of XPath queries.
Consider an XML document with the following structure:
Figure 4.9. Structure tree of an XML document.
The small numbers next to the nodes are the nameid values of those nodes. The
following subsections examine several common query forms.
4.3.1 Queries on one node with a single condition
This is the simplest form of query, for example
/Customer/Order/Item[@name="TV"]: find all items named TV. By the naming rule,
the name identifier of the attribute name of node Item is 13. To answer the
query, a search is therefore performed on the NMB+Tree with the search key
(nameid=13, value="TV"). The result is the set of a-nodes Name_13 of Item_8.
From these a-nodes the corresponding t-nodes Item_8 are easily determined,
after which the other a-nodes (such as price_14 and qty_15) can be retrieved.
Correctness and completeness
Besides the Name a-nodes that satisfy the condition, the server returns the two
boundary a-nodes of the result set and the hashes of the neighbouring nodes
needed to compute the hash of the parent node, together with the hash values of
the siblings of the parent node. These values allow the client to recompute
the hash value of the root node.
At the root, the client combines the computed hash value with the stored
timestamp and checks it against the signature of the root. Because the hash
functions used are one-way and non-invertible, this proves correctness.
Because the records are sorted in ascending order on the attributes
(nameid, value), and the two boundary records are included, the completeness
of the data can be proved.
Freshness
The hash of the root node includes a timestamp value. If the data owner
publishes this value to all clients, each client can determine the freshness of
the data.
Thus, with a single check, the client can verify all three requirements of
query assurance using only hash operations, whose cost is far lower than that
of the big-integer operations of digital signatures.
Proving that no record matches (empty proof)
Another aspect to consider is proving that the returned result is empty, that
is, that no record satisfies the search condition. To prove this, the server
returns two records that are adjacent in the ordered sequence, one with a value
greater than and one with a value smaller than the queried value.
4.3.2 Multi-node queries with a single condition
This form of query is somewhat more complex because it involves several nodes. It corresponds to a join query in a relational database (RDB).
Consider the query /Customer[@name="Marry"]/Order/Item, which returns all items bought by the customer named Marry. As above, a VO is built to prove the query (nameid=4, prefixid=1, value=Marry). This VO, however, only needs to contain the a-nodes Name_4 and the corresponding t-nodes Customer_1. The a-nodes Name_4 are used to prove that the set of t-nodes Customer_1 is complete.
For each t-node Customer_1, a VO is built to provide query assurance for the query (nameid=3, prefixid=1, pnodeid=), which retrieves the results for the t-nodes Order_3.
In the same way, query assurance can be provided for the Item nodes that are returned.
The query /Customer[@name="Marry"]/Order/Item[@name="TV"], which retrieves the items named TV bought by the customer named Marry, is handled similarly: the processing combines the query above with the processing of the first case.
4.3.3 Other forms
The two examples above show how the NMB+ tree is used to execute queries and to prove query results. To process more complex queries, we can decompose them into simple queries and build an execution plan made of simple steps. For example, the execution plans for the queries above are as follows.
/Customer/Order/Item[@name="TV"]
  STEP#1
    IndexMethod  : Vtree, nameID=13
    Condition    : equal to [TV]
    Result level : not included
    Retrieval    : node only
    StepValue    : PNODEID
  [For each matched item, perform]
  STEP#2
    IndexMethod  : DirectIDAccess, id=ParentStepValue
    Result level : 1
    Retrieval    : node and all its attributes

/Customer[@name="Marry"]/Order/Item
  STEP#1
    IndexMethod  : Vtree, nameID=4
    Condition    : equal to [Marry]
    Result level : not included
    Retrieval    : node only
    StepValue    : PNODEID
  [For each matched item, perform]
  STEP#2
    IndexMethod  : DirectIDAccess, value=ParentStepValue
    Result level : not included
    Retrieval    : node only
    StepValue    : ID
  [For each matched item, perform]
  STEP#3
    IndexMethod  : Ptree, nameID=3, pid=ParentStepValue
    Result level : not included
    Retrieval    : node only
    StepValue    : ID
  [For each matched item, perform]
  STEP#4
    IndexMethod  : Ptree, nameID=8, pid=ParentStepValue
    Result level : 1
    Retrieval    : node and all its attributes
4.3.4 Merging VOs
When executing a query, in most cases the server has to return more than one VO to the client. For example, for the query in Section 4.3.2, the number of VOs the server must return depends on the number of Customer records that satisfy the condition @name="Marry". This can overload the client when it verifies the VOs.
To limit this, the server should merge these VOs into a single VO and return that VO to the client. The client then needs only one verification to establish query assurance for all the data it receives.
Merging VOs is straightforward. Each original VO is determined by a run of consecutive records together with (at most) two boundary records. Several VOs give several runs. Therefore, instead of generating a co-path for each run, we can generate a single co-path for all the runs together.
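The saving from a shared co-path can be illustrated with a small counting sketch. It assumes a binary Merkle tree over the leaves rather than the NMB+ tree itself, and the function name `copath_size` is hypothetical.

```python
def copath_size(n_leaves, known):
    """Number of sibling hashes needed to recompute the root of a binary
    Merkle tree when the leaves at the indexes in `known` are available."""
    known = set(known)
    needed = 0
    width = n_leaves
    while width > 1:
        parents = set()
        for i in known:
            sibling = i ^ 1
            # A sibling hash must be supplied only if it exists and the
            # client cannot compute it from what it already holds.
            if sibling < width and sibling not in known:
                needed += 1
            parents.add(i // 2)
        known = parents
        width = (width + 1) // 2
    return needed

# Two runs of consecutive leaves in a 16-leaf tree.
run_a, run_b = [2, 3, 4], [9, 10]
separate = copath_size(16, run_a) + copath_size(16, run_b)
merged = copath_size(16, run_a + run_b)
```

In this example the two separate co-paths need 8 hashes in total, while the merged co-path needs 6, because the upper levels of the tree are shared.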
4.4 Data update operations
Updating outsourced data incurs some additional cost. If a copy of the database is kept at the data owner, updates are infrequent (the interval between two updates is long) and each batch of updates is large, the data owner can apply the updates locally and then outsource the data again.
Here we present a method that updates the data in place. It is intended for the case where the data owner does not keep a copy of the outsourced data, or where the cost of each outsourcing run is considerable.
4.4.1 Insertion
The data owner sends the data to be inserted (XML elements and attributes) to the server. The server adds the new records to the database and updates the index structure. It then computes and updates the hash values of the affected leaf nodes and propagates the changes up to the root node of the NMB+ tree. The server returns the new hash value of the root node to the data owner. The data owner generates a new timestamp, combines it with the received hash value, and signs the result. The data owner then sends the new timestamp together with the new signature back to the server, and the server stores the received timestamp and signature in the database.
This multi-phase update protocol is illustrated in the following figure.
Figure 4.10. Steps of the insert operation
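The message flow of the insertion protocol can be sketched as below. This is an illustration under stated assumptions: `root_hash` is a stand-in that hashes the sorted records instead of maintaining the NMB+ tree, HMAC-SHA1 stands in for the owner's digital signature (a real deployment would use a public-key signature so that clients can verify without the secret key), and the class and method names are hypothetical.

```python
import hashlib
import hmac

def root_hash(records):
    """Stand-in for the NMB+ root hash: a hash over the sorted records."""
    digest = hashlib.sha1()
    for record in sorted(records):
        digest.update(repr(record).encode())
    return digest.digest()

class Server:
    def __init__(self):
        self.records = []
        self.timestamp = None
        self.signature = None

    def insert(self, new_records):
        # Store the records, update the index, return the new root hash.
        self.records.extend(new_records)
        return root_hash(self.records)

    def store_signature(self, timestamp, signature):
        # Final phase: persist the owner's timestamp and signature.
        self.timestamp, self.signature = timestamp, signature

class DataOwner:
    def __init__(self, key: bytes):
        self.key = key

    def sign(self, root: bytes, timestamp: int) -> bytes:
        # Assumption: HMAC replaces the digital signature in this sketch.
        message = root + str(timestamp).encode()
        return hmac.new(self.key, message, hashlib.sha1).digest()

    def insert(self, server, new_records, timestamp):
        root = server.insert(new_records)          # phases 1-3
        signature = self.sign(root, timestamp)     # phase 4
        server.store_signature(timestamp, signature)  # phase 5

owner = DataOwner(b"owner-secret")
server = Server()
owner.insert(server, [(13, "TV", 102), (14, "300", 102)], timestamp=1000)
```

Deletion and modification would follow the same flow, with only the server-side change to the records being different.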
4.4.2 Deletion and modification
Deletions and modifications follow the same steps as insertion. The data owner sends the request to the server. The server applies the change to the database and recomputes the hash values of the affected nodes in the NMB+ tree. The new hash value of the root node is returned to the data owner for signing. The signature and the new timestamp are then sent back to the server, which stores them in the database.
Chapter 5
ANALYSIS
We have presented a solution that stores data in, and uses, the NMB+ index tree to prove query results over outsourced XML data. In this chapter we discuss the security aspects and the costs of the solution.
All data, including the nodes of the index tree, is encrypted with a symmetric encryption algorithm (such as Rijndael) before it is stored. This protects the confidentiality of the data while preserving query performance. The method therefore satisfies the data confidentiality requirement.
The solution does not address privacy. However, because the data is tree-structured, the results of Lin and Candan [2] can be applied directly to achieve privacy.
For query assurance, we have presented how to handle range queries and point queries with a single condition. Range queries with several combined conditions can be handled in the same way as set operations, namely intersection and union, as presented in [10, 14, 21]:
- Union: executed simply as two separate queries. The VOs generated by the two queries are merged as described above.
- Intersection: in a simple solution, given the result sets of the two sub-queries (one per condition), the server picks the smaller result set. For each result in this set that does not belong to the other set, the server returns an empty proof. All resulting VOs are merged so that the client verifies them in a single pass.
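The server-side bookkeeping for the intersection case can be sketched as follows. The function name and the use of plain Python sets of record identifiers are assumptions made for illustration. Building the actual empty proofs and merging the VOs are not shown.

```python
def plan_intersection(result_a, result_b):
    """Split the smaller result set into the items that belong to the
    intersection and the items for which the server must attach an
    empty proof against the other condition."""
    small, large = sorted((set(result_a), set(result_b)), key=len)
    kept = sorted(small & large)
    need_empty_proof = sorted(small - large)
    return kept, need_empty_proof

# Hypothetical record identifiers matched by each of the two conditions.
kept, need_empty_proof = plan_intersection({1, 2, 3, 4, 5}, {2, 5, 9})
```

Choosing the smaller set keeps the number of empty proofs bounded by the size of that set.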
Another point of interest in query assurance is proving join operations. In tree-structured data such as XML, joins apply mainly to the relationship from a parent node to its child nodes, which is the most common relationship in XML databases. The NMB+ tree uses the PTree to prove queries for this kind of join at minimal cost, both for building the VO and for verification at the client.
The solution does not apply directly to queries that use aggregate functions. To the best of our knowledge, only the solution of Radu Sion [8] currently handles this case, and it has the limitations discussed earlier. Such queries can be proved in the same way as range queries, using the filter condition of the original query as the range condition. The aggregate result can be computed at the server and spot-checked at random by the client once query assurance has been established for the records that satisfy the condition. Alternatively, it can be computed entirely at the client.
Next, we analyse the cost of the solution. Because there is currently no prior work on query assurance for XML databases, the cost analysis is based on theoretical maximum and minimum costs. We also consider only the costs incurred in order to provide query assurance.
First, we introduce the notation used:
n        Total number of items (elements and attributes).
s        Total number of items returned by a query.
f        Fanout parameter of the NMB+ tree.
h        Height of the tree.
L        Number of leaf nodes of the tree.
N        Total number of nodes of the tree.
|sign|   Size of a hash value (20 bytes for SHA-1).
Storage cost at the server: the storage cost incurred at the server is the cost of storing the NMB+ tree. The NMB+ tree combines a NameTree with the ValueTrees and ParentTrees attached to the leaves of the NameTree. The size of the NameTree depends on the number of nodes in the schema tree of the XML document. This number is usually very small compared with the number of nodes in the XML data tree, and it changes little over the lifetime of the data. The storage cost is therefore dominated by the cost of storing the ParentTrees and ValueTrees.
The number of ValueTrees and ParentTrees varies with the number of elements in the schema tree, and the height of each tree varies with the number of data items that correspond to each schema element. To simplify the evaluation, we therefore assume that the schema tree of the XML document has a single element, so that there is exactly one ValueTree and one ParentTree.
It is easy to see that the total number of data items at the leaves of the ValueTree and of the ParentTree is the same. Assume that the data items are pairwise distinct in (nameid, value), that is, each slot of any leaf node links to exactly one data item. Then:
$$L_{min}^{VTree} = \left\lceil \frac{n}{f} \right\rceil, \qquad L_{max}^{VTree} = \left\lceil \frac{2n}{f} \right\rceil \tag{5.1}$$
The minimum number of leaf nodes is reached when all leaf nodes are full, and the maximum when every leaf node holds only half as many items. From L_min and L_max, the minimum and maximum height and number of nodes of the tree can be determined as follows.
$$h_{min}^{VTree} = \left\lceil \log_f L_{min}^{VTree} \right\rceil + 1 \tag{5.2}$$
$$h_{max}^{VTree} = \left\lceil \log_{f/2} L_{max}^{VTree} \right\rceil + 1 \tag{5.3}$$
$$N_{min}^{VTree} = L_{min}\,\frac{1 - (1/f)^{h_{min}-1}}{1 - 1/f} + 1 \tag{5.4}$$
$$N_{max}^{VTree} = L_{max}\,\frac{1 - (2/f)^{h_{max}-1}}{1 - 2/f} + 1 \tag{5.5}$$
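Formulas (5.2), (5.3), (5.4) and (5.5) can be derived as follows. Suppose the B+ tree has L leaf nodes. The number of nodes at each depth is then:

Depth      Minimum number of nodes       Maximum number of nodes
h          L_min                         L_max
h-1        L_min / f                     2 L_max / f
h-2        L_min / f^2                   2^2 L_max / f^2
...
h-(h-1)    L_min / f^(h-1) = 1           2^(h-1) L_max / f^(h-1) = 1

Solving the last row for h gives
$$h_{min}^{VTree} = \left\lceil \log_f L_{min}^{VTree} \right\rceil + 1, \qquad h_{max}^{VTree} = \left\lceil \log_{f/2} L_{max}^{VTree} \right\rceil + 1$$
and summing the rows gives
$$N_{min} = L_{min} + \frac{L_{min}}{f} + \frac{L_{min}}{f^2} + \dots + 1 = L_{min}\,\frac{1 - (1/f)^{h_{min}-1}}{1 - 1/f} + 1$$
$$N_{max} = L_{max} + \frac{2L_{max}}{f} + \frac{2^2 L_{max}}{f^2} + \dots + 1 = L_{max}\,\frac{1 - (2/f)^{h_{max}-1}}{1 - 2/f} + 1$$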
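The bounds (5.1) to (5.5) can be evaluated numerically with a short sketch. Rounding the logarithms up and assuming an even fanout f are assumptions of this sketch, and the function names are hypothetical.

```python
from math import ceil

def ceil_log(x, base):
    """Smallest integer k such that base**k >= x (avoids floating-point logs)."""
    k, power = 0, 1
    while power < x:
        power *= base
        k += 1
    return k

def tree_bounds(n, f):
    """Leaf count, height and node count for full leaves (min) and for
    half-full nodes (max), following (5.1)-(5.5). Assumes f is even."""
    l_min, l_max = ceil(n / f), ceil(2 * n / f)
    h_min = ceil_log(l_min, f) + 1
    h_max = ceil_log(l_max, f // 2) + 1
    n_min = l_min * (1 - (1 / f) ** (h_min - 1)) / (1 - 1 / f) + 1
    n_max = l_max * (1 - (2 / f) ** (h_max - 1)) / (1 - 2 / f) + 1
    return {"L_min": l_min, "L_max": l_max, "h_min": h_min,
            "h_max": h_max, "N_min": n_min, "N_max": n_max}

bounds = tree_bounds(5415, 10)
```

For n = 5,415 and f = 10 this gives L_min = 542, L_max = 1,083 and h_min = 4.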
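The resulting storage cost can therefore be computed as follows:
$$\begin{aligned} C_{storage}^{min} &= N^{NTree} + N_{min}^{VTree} + N_{min}^{PTree} = N^{NTree} + 2N_{min}^{VTree} \\ C_{storage}^{max} &= N^{NTree} + N_{max}^{VTree} + N_{max}^{PTree} = N^{NTree} + 2N_{max}^{VTree} \end{aligned} \tag{5.6}$$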
In practice, many items share the same comparison key, that is, the same tuple (nameid, value) in the ValueTree or (nameid, pnodeid, value) in the ParentTree. Such items occupy the same slot in a leaf node of the index tree. The total number of slots used in the leaf nodes is therefore always smaller than the total number of items to be indexed.
Let n' and n'' denote the number of items that are distinct in (nameid, value) and in (nameid, pnodeid, value), respectively. It is easy to see that n' < n'' < n. To compute the storage cost, we therefore determine n' and n'' and substitute them into (5.4) and (5.5) to obtain the cost of the ValueTree and the ParentTree.
VO size: in this analysis we consider only the data that must be added to the VO in order to prove the query, so the VO size means the size of this additional information. In addition to the query result, the server returns the two boundary key values (value for the ValueTree, and pnodeid and value for the ParentTree) with the hashes of the records that correspond to these two keys, plus the co-path used to compute the hash of the root node. As described earlier, an XPath query can be decomposed into several range queries, so there are several VOs, one per range. The size of the returned VO is the sum of the sizes of these partial VOs.
Consider the VO size for one query:

Depth    Number of nodes                    Number of elements added to the co-path
h        $L_h = \lceil (s+2)/f \rceil$      $C_{VO}^{h} = f \cdot L_h - s + 2$
h-1      $L_{h-1} = L_h / f$                $C_{VO}^{h-1} = f \cdot L_{h-1} - L_h$
h-i      $L_{h-i} = L_{h-i+1} / f$          $C_{VO}^{h-i} = f \cdot L_{h-i} - L_{h-i+1}$

VO size:
$$C_{VO} = |sign| \cdot \sum_{i=1}^{h_{NMBTree}} C_{VO}^{i} \tag{5.7}$$
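Formula (5.7) can be evaluated level by level as in the following sketch. The leaf-level term is taken as written in the table (f·L_h - s + 2). Rounding the node counts up at every level is an assumption of this sketch, as is the function name `vo_size`.

```python
from math import ceil

def vo_size(s, f, h, hash_bytes=20):
    """Extra bytes carried by a VO for a range returning s records in a
    tree of height h and fanout f, following (5.7)."""
    nodes = ceil((s + 2) / f)        # L_h: leaves holding the result + 2 boundaries
    total = f * nodes - s + 2        # C_VO^h as given in the table
    for _ in range(h - 1):           # levels h-1 down to 1
        children = nodes
        nodes = ceil(children / f)   # assumption: node counts are rounded up
        total += f * nodes - children
    return hash_bytes * total

# Example: 18 records returned, fanout 10, height 3, SHA-1 hashes.
size = vo_size(s=18, f=10, h=3)
```

With these example values the co-path carries 4 + 8 + 9 = 21 hash values, that is, 420 bytes.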
CPU cost: the CPU cost (CPU time) is the total processing time from the moment the server receives the query request until the query result has been fully decrypted at the client, excluding the time spent transmitting data over the network.
The processing of a query can be divided into the following phases:
Figure 5.11. Query execution steps (server side and client side).
- Parse: analyse the components of the query and eliminate wildcard characters where necessary. A single original query may be decomposed into several sub-queries.
- Plan: build an execution strategy for the queries obtained from parsing.
- Fetch data: run the query against the index tree and read the records that satisfy the condition from the database.
- Build VO: add the information needed to build the VOs that prove the query, including the time to generate the co-paths for the query result.
- Verify VO: check the query result using the VO information received.
- Generate XML: restore the data to XML form. Because the original XML data was converted into t-nodes and a-nodes when it was stored in the database, the returned result must be converted back to the original XML form.
At present, because of time constraints, the evaluation program does not implement the first two steps, Parse and Plan. The queries to be executed are supplied as execution plans, so the remaining work starts at the Fetch data step.
Chapter 6
EXPERIMENTS
For the experimental evaluation, we implemented the solution on the .NET Framework 2.0, using the Rijndael and SHA1 implementations of the cryptography library provided by .NET 2.0. MS SQL Server 2005 Express Edition is used to store the database.
The program was tested on a PC with a 2.8 GHz P4 processor and 512 MB of memory. The sample XML document is Mondial [19], with 69,846 items (22,423 elements and 47,423 attributes). The schema of this document is given in the appendix. The fanout of the NMB+ tree is 10.
The following metrics are measured:
- Storage cost (the actual total number of nodes of the NMB+ tree, compared with the theoretical number of nodes derived in Chapter 5).
- VO size (the number of items returned in addition to the result in order to prove the query).
- Query execution time, covering the phases fetch data, build VO, verify VO and generate XML, as well as the total time from the moment the query starts executing until the resulting XML data has been fully decrypted. These times are compared against the number of records returned relative to the total number of records in the database, in order to show how the execution time correlates with the other parameters (the expectation is a linear relationship with the result size and a logarithmic relationship with the database size).
In addition, this time is compared with the execution time of the query without security, in order to measure the overhead that the solution adds for security.
The measurements are based on the query /mondial/country/city[population > 500000]. The execution plan for this query is given in the appendix.
Storage cost
Here we ignore the storage cost of the XML data itself and focus on the additional cost of storing the index tree structure. This cost is evaluated through the number of tree nodes needed to index the required number of items.
The cost is evaluated from formulas (5.4) and (5.5) for the minimum and maximum sizes. From the database size (the number of items), we determine the number of items that are distinct in the attribute tuples (nameid, value) and (nameid, pnodeid, value). These give the number of slots to be stored in the leaf nodes of the index tree, which is then used in (5.4) and (5.5). The results are shown in the table below.
Database size          10K     20K     30K     40K     50K     60K     70K
ValueTree
  Distinct items     5,415  10,743  16,128  20,907  22,279  22,499  24,913
  Min leaves           542   1,075   1,613   2,091   2,228   2,250   2,492
  Min height             4       5       5       5       5       5       5
  Min nodes            603   1,196   1,794   2,325   2,477   2,501   2,770
  Max leaves         1,083   2,149   3,226   4,182   4,456   4,500   4,983
  Max height             5       6       6       6       6       6       6
  Max nodes            679   1,345   2,018   2,615   2,786   2,814   3,116
ParentTree
  Distinct items     9,126  18,108  27,181  36,376  43,612  50,379  57,497
  Min leaves           913   1,811   2,719   3,638   4,362   5,038   5,750
  Min height             4       5       5       5       5       5       5
  Min nodes          1,015   2,014   3,022   4,043   4,848   5,599   6,390
  Max leaves         1,826   3,622   5,437   7,276   8,723  10,076  11,500
  Max height             6       6       6       7       7       7       7
  Max nodes          1,143   2,265   3,400   4,549   5,454   6,299   7,189
Table 6.2. Minimum and maximum sizes of the ValueTree and ParentTree
Table 6.3. Storage cost
Data size              10K     20K     30K     40K     50K     60K     70K
Min nodes            1,618   3,210   4,816   6,368   7,325   8,100   9,160
Ma