2007.MScThesis


  • Ho Chi Minh City, 02/2007

    VIETNAM NATIONAL UNIVERSITY, HO CHI MINH CITY
    UNIVERSITY OF TECHNOLOGY

    NGUYEN VIET HUNG

    SECURITY ISSUES IN QUERYING DYNAMIC OUTSOURCED XML DATABASES

    MAJOR: INFORMATION TECHNOLOGY    MAJOR CODE: 60.48.01

    MASTER'S THESIS

  • Thesis title: Security Issues in Querying Dynamic Outsourced XML Databases

    Student: Nguyen Viet Hung    Advisor: Dr. Dang Tran Khanh

    THIS WORK WAS COMPLETED AT THE UNIVERSITY OF TECHNOLOGY, VIETNAM NATIONAL UNIVERSITY, HO CHI MINH CITY

    Scientific advisor: Dr. DANG TRAN KHANH

    Reviewer 1: Dr. NGUYEN DUC CUONG

    Reviewer 2: Dr. TRAN VAN HOAI

    The master's thesis was defended before the MASTER'S THESIS EXAMINATION COMMITTEE of the UNIVERSITY OF TECHNOLOGY on 3 February 2007.

  • UNIVERSITY OF TECHNOLOGY, OFFICE OF GRADUATE STUDIES
    SOCIALIST REPUBLIC OF VIETNAM: Independence - Freedom - Happiness

    Ho Chi Minh City, day . . . . month . . . . year 200. .

    MASTER'S THESIS ASSIGNMENT

    Student's full name: Nguyen Viet Hung    Gender: Male
    Date of birth: 14 January 1981    Place of birth: Kien Giang
    Major: Information Technology    Student ID: 00703170

    I- THESIS TITLE: Security issues in querying dynamic outsourced XML databases.

    II- TASKS AND CONTENT:
    - Survey the security issues related to outsourced databases.
    - Study the existing research on the query assurance aspect.
    - Propose a solution for verifying query assurance in outsourced XML databases.
    - Build a program that implements the solution, then measure and evaluate the proposed solution.

    III- DATE OF ASSIGNMENT: .....................................................................................
    IV- DATE OF COMPLETION: ......................................................................
    V- ADVISOR: Dr. Dang Tran Khanh.

    ADVISOR    HEAD OF THE DEPARTMENT MANAGING THE MAJOR
    (Academic title, degree, full name and signature)

    The content and outline of this master's thesis have been approved by the Specialized Council.
    Date: day . . . . month . . . . 2006
    HEAD OF THE OFFICE OF GRADUATE STUDIES    DEAN OF THE FACULTY MANAGING THE MAJOR


    ACKNOWLEDGEMENT

    I would like to express my gratitude:

    To my mom and dad, who have brought me up and done everything for me;

    To my advisor, Dr. Dang Tran Khanh, who has advised me wholeheartedly;

    To my friends, who are always by my side, and especially to my colleagues, who were willing to help me complete some parts of this work.


    ABSTRACT

    With the impressive improvement of network technologies, database outsourcing is emerging as an important trend alongside application-as-a-service. In this model, data owners ship their data to external service providers. Service providers carry out the data management tasks and offer their clients a mechanism to manipulate the outsourced database. Since a service provider is not always fully trusted, the security and privacy of outsourced data are important issues. These problems are referred to as data confidentiality, user privacy, data privacy and query assurance. Among them, query assurance plays a crucial role in the success of the database outsourcing model. To the best of our knowledge, however, query assurance, especially for outsourced XML databases, has not been adequately addressed in any previous work.

    In this thesis, we propose a novel index structure, the Nested Merkle B+ Tree, which combines the advantages of the B+ tree and the Merkle Hash Tree to address all three issues of query assurance, namely correctness, completeness and freshness, in outsourced XML databases. Experimental results with a real dataset demonstrate the efficiency of our proposed solution.


    SUMMARY

    Rapid advances in network technology have given rise to many remote services, most notably application as a service. This service gives everyone legitimate access to the latest software at the lowest cost. Recently, a new trend has emerged that reduces the cost of data management through a service called database outsourcing. With this service, organizations store their information and data on the servers of service providers. The service providers take over the maintenance of the servers, the DBMS software and the customers' databases. In addition, they provide mechanisms that allow the organizations to operate on their own databases. However, information is an extremely valuable asset, so organizations cannot fully trust the service providers to keep their databases safe. This gives rise to security requirements for outsourced databases. These issues can be summarized in four security requirements: data confidentiality, data privacy, user privacy and query assurance.

    Besides an overview of the results achieved in the field of data outsourcing, this thesis introduces a new index structure for XML data. Based on this structure, the thesis presents a query assurance method for outsourced XML databases, together with experimental results from an implementation of the method.


    TABLE OF CONTENTS

    ACKNOWLEDGEMENT
    ABSTRACT
    Chapter 1 INTRODUCTION
        1.1 Data Confidentiality
        1.2 User Privacy and Data Privacy
        1.3 Query Assurance
        1.4 Remarks
    Chapter 2 RELATED WORK
        2.1 Concepts
        2.2 The digital signature approach
        2.3 The special data structure approach
        2.4 The challenge-response approach
        2.5 Approaches based on problem-specific characteristics
        2.6 Query assurance for tree-structured data
        2.7 Remarks
    Chapter 3 XML DATA
        3.1 Storage model
        3.2 Indexing XML documents
    Chapter 4 QUERY ASSURANCE
        4.1 Method
        4.2 Nested B+ Tree
        4.3 Select operation
        4.4 Data update operations
    Chapter 5 ANALYSIS
    Chapter 6 EXPERIMENTS
    Chapter 7 CONCLUSION
    Chapter 8 APPENDIX
        8.1 XML storage structure
        8.2 Labeling algorithm
        8.3 Experimental program
        8.4 Schema of the mondial.xml document
        8.5 Query execution plan
        8.6 Summary of related work
        8.7 Related paper


    Chapter 1

    INTRODUCTION

    Information is a very important resource in every organization, and managing and processing it effectively has long attracted wide attention. With the advent of the electronic computer and the personal computer (PC), computer science ushered in a new era, the information era, which has strongly affected every area of life.

    Data is stored in databases that are usually located inside the organization (in-house databases). This requires each organization to invest in managing its database system, including hardware (machines, networks), software (the DBMS, specific application programs, and so on) and personnel (network administrators, database administrators, and so on). As society in general and organizations in particular develop, storage and processing needs keep growing and become more complex. These requirements raise the total management cost. Although hardware prices have dropped considerably, software license fees and the cost of the highly skilled administrators needed to manage increasingly complex information systems are a real concern in an organization's total cost of ownership. This is especially important for small and medium-sized organizations and non-profit organizations.

    In recent years, remarkable progress in network and communication technology has produced high-speed, high-bandwidth networks and given birth to the concept of application as a service. Users only need to pay the service provider a small fee to use new software, without having to worry about license fees or the cost of installing and maintaining the system.


    Alongside it, another service has gradually taken shape: database as a service. It offers users a place to store and access data at low cost, without having to purchase equipment or maintain a dedicated team. This significantly reduces the cost of information management for organizations.

    Figure 1.1. The Database as a Service model.

    In the database as a service model, the data owner (DO) places its database at a service provider (SP) so that customers (clients or queriers, C or Q) can perform operations such as select, insert and update on it. The model is also called outsourced database services (ODBS).

    Information is an important asset of an organization. Placing the database that holds it at an untrusted location outside the organization (the service provider) raises security issues, and these issues determine whether ODBS is feasible at all. Outsourced databases must be kept safe from access by unauthorized organizations and individuals, including the service provider. The service provider thus becomes the most dangerous party with respect to data security, because an intrusion from outside can, at most, gain the same level of system access that the service provider already has. Research therefore concentrates mainly on preventing intrusion by the service providers themselves.

    Fundamentally, the database security problem at the SPs can be divided into four areas [1]:


    Data confidentiality: The data owner (DO) does not want unauthorized parties, including the SPs, to be able to access its database.

    User privacy: Information is a commodity and may therefore be sold to other companies. Client companies do not want to reveal which information they retrieve, not even to the DO and the SP.

    Data privacy: The DO does not want its clients to be able to retrieve more information than they are permitted to.

    Query assurance: The client must be assured that the data it receives is correct, complete and the most recent from the original database provided by the DO, without any unwanted modification.

    Table 1.1. Security issues in ODBS.

    While meeting these security requirements, we must also pay attention to query performance and to the extensibility of the database (scalability, usability).

    To ensure data confidentiality, data is encrypted before it is outsourced. However, this makes query processing more complex, because queries must run over encrypted data while the other security requirements still have to be met.


    Figure 1.2. The ODBS model.

    In the ODBS model, the data owner places its database on external servers (the SP) and performs storage queries over a secure network connection. Clients pay the data owner for the right to access the data, and access the data directly from the SP, also over a secure connection.

    In practice, it is not always necessary to satisfy all of the security requirements above. Depending on the situation, some requirements can be dropped to reduce complexity and improve system performance. Returning to the ODBS model, there are four security models [1]:

    - UP-DP model (user privacy, data privacy): the DO is also the service provider (SP). The DO sells information from its database to other customers. This is the traditional in-house database model, so only user privacy and data privacy are of concern.

    - UP-nDP model (user privacy, no data privacy): similar to the model above, except that the data sold is public and does not need to be protected. Only what users retrieve from the database needs to be hidden.

    - DC-UP model (data confidentiality, user privacy): the DO is also the only client of the system. This is a fairly common model: a company hires a service provider to store its internal data and accesses that database itself. Therefore only data confidentiality and user privacy are considered.

    - DC-UP-DP model: the most complete and most complex model. The DO hires a service provider to store its database and at the same time sells information to other customers. The DO needs data confidentiality and data privacy, while the users need user privacy.

    In all of these models, query assurance is always an issue that must be considered.

    The following sections review the research and results related to the security issues listed in Table 1.1.

    1.1 Data Confidentiality

    Data confidentiality requires that the database cannot be accessed illegitimately, not even by the SPs. To achieve this, the database is usually encrypted before it is outsourced. However, encryption makes querying more complex and strongly affects database performance. It is therefore essential to choose an encryption mechanism that balances security needs against performance requirements. At present, symmetric private key encryption is commonly used, for example the Rijndael, DES and TripleDES algorithms.

    Executing queries over encrypted data

    Hacıgümüş [5] proposes a solution for executing a query tree over encrypted data. The main idea is to split the query into two parts: one part is executed at the server and the rest is executed at the client.

    Kenny C.K. Fong [6] points out five serious security flaws in this method:

    1. It is not semantically secure. Semantic security fails if one can find two messages m0 and m1 such that, given a ciphertext, one can guess whether it encrypts m0 or m1 with probability greater than 1/2.

    2. If the value domain of a data field is small and discrete, the hash function used by the algorithm is not secure.

    3. The algorithm does not guarantee the authenticity of the returned query results.

    4. The query is not perfectly hidden. The server can guess the type of query that the user executes.

    5. Encryption is performed per record, so decryption must also be performed per record. As a result, users may learn more information than they are allowed to (data privacy is not satisfied).

    Searching encrypted XML data

    R. Brinkman [9] introduces a way to search for matching tags in an encrypted XML document. It is based on the algorithm Linear Search Strategy for Full Text Documents (1) and is called Tree Search Strategy for XML Documents (2).

    Algorithm (1) has three phases: storage, search and retrieval. In the storage phase, the data is divided into small fixed-size blocks, which are encrypted before being stored on the server. The DO must record some information about the encryption in order to decrypt later, so this algorithm only suits the DC-UP model. In the search phase, the search string is encrypted and sent to the server, which matches it against the data blocks to locate the encrypted result. In the retrieval phase, the encrypted result is decrypted using the encryption information recorded during the storage phase.

    Algorithm (2) is built on algorithm (1), but the data is an XML document instead of an unstructured text file. The blocks are no longer of equal size; their size depends on the size of each node (each node is one block). Algorithm (2) only answers queries that match tag names in the XML document; it does not process the data content inside a node.

    1.2 User Privacy and Data Privacy

    Database users require the system to guarantee user privacy and data privacy for both their queries and the returned results. The SP, and even the DO, must not learn this information.

    On the other hand, users may only query what they are permitted to. The returned result is limited to the information they requested, and users must not see data outside their authorization.

    PIR-like protocols

    To achieve user privacy, Chor introduced the PIR (private information retrieval) protocol. In theory, PIR allows a user to hide both the query and the returned result. However, the database must be replicated to several sites. Otherwise the cost is very high (the client may need to fetch the entire database). Even with replication, the cost is still high enough that PIR cannot be applied in practice.
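    The role of replication can be illustrated with a minimal sketch of the classic two-replica XOR construction. It assumes two non-colluding replicas that hold identical copies of equal-length records; the function names are hypothetical and are not taken from the cited work.

```python
import secrets

def make_queries(n: int, index: int) -> tuple[list[int], list[int]]:
    """Client: build two random-looking bit vectors that differ only at `index`."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2

def answer(db: list[bytes], query: list[int]) -> bytes:
    """Replica: XOR of all records whose query bit is 1."""
    acc = bytes(len(db[0]))
    for record, bit in zip(db, query):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, record))
    return acc

def recover(a1: bytes, a2: bytes) -> bytes:
    """Client: every record selected by both queries cancels out."""
    return bytes(x ^ y for x, y in zip(a1, a2))

if __name__ == "__main__":
    db = [b"rec0", b"rec1", b"rec2", b"rec3"]   # equal-length records (assumption)
    q1, q2 = make_queries(len(db), 2)
    print(recover(answer(db, q1), answer(db, q2)))
```

    Each replica sees only a uniformly random bit vector, so neither learns the requested index on its own. The sketch also shows the cost issue: every replica has to touch the whole database for a single lookup.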

    PIR only supports read-only access. Ostrovsky and Shoup extended PIR to support data updates while preserving user privacy, yielding the PIS (private information storage) protocol.

    To make PIR/PIS usable in practice, Asonov refined PIR into the RIR (repudiative information retrieval) protocol. RIR relaxes some security constraints to reduce the I/O cost while still guaranteeing user privacy. Similarly, the RIS protocol refines PIS with a lower cost and better feasibility.

    All of these protocols are built on PIR and are therefore called PIR-like protocols. However, they only support user privacy and do not guarantee data privacy. Gertner developed a protocol, built on top of any PIR-like protocol, that satisfies both data privacy and user privacy. It is called SPIR (symmetrically private information retrieval).

    Tree-structured data

    Lin and Candan [2] propose an algorithm that allows users to hide both the data and the queries on tree-structured data. They introduce two techniques, redundancy access and node swapping.

    - Redundancy access: when a user accesses a data node, the system returns m nodes, of which m-1 are random, so that the server cannot tell which node the user actually accessed. However, if a node is accessed frequently, such as the root node, the server can intersect the redundancy sets to identify it.

    - Node swapping: after a node has been accessed, it is moved to a different location. To make this possible, one of the m-1 random nodes is an empty node; the node that was read is swapped with this empty node and the change is written back to the database. This technique solves the problem that redundancy access suffers from.
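    The two techniques can be sketched together as follows. This is a simplified illustration under stated assumptions, not the algorithm of [2]: nodes are kept in a plain list instead of an encrypted tree, and the client-side `locator` map that remembers where each node currently lives is a hypothetical helper.

```python
import random

def oblivious_read(slots, locator, key, m, rng=random):
    """Fetch node `key` inside a redundancy set of size m, then swap it into an empty slot.

    slots   -- server-side list of nodes; None marks an empty slot
    locator -- client-side map from node key to its current slot (hypothetical helper)
    Returns (node, redundancy_set); the set is all the server gets to observe.
    """
    target = locator[key]
    empty = rng.choice([i for i, node in enumerate(slots) if node is None])
    others = [i for i in range(len(slots)) if i not in (target, empty)]
    # The target is hidden among m-1 other slots, one of which is empty.
    redundancy_set = [target, empty] + rng.sample(others, m - 2)
    rng.shuffle(redundancy_set)

    node = slots[target]
    # Node swapping: the node moves to the empty slot, its old slot becomes empty.
    slots[empty], slots[target] = node, None
    locator[key] = empty
    return node, sorted(redundancy_set)

if __name__ == "__main__":
    slots = ["root", "n1", "n2", "n3", "n4", None, None]
    locator = {name: i for i, name in enumerate(slots) if name is not None}
    for _ in range(3):
        node, seen = oblivious_read(slots, locator, "root", m=4)
        print(node, seen)
```

    Because the root changes slots after every read, repeated reads no longer share one fixed slot, which is what defeats the intersection attack on plain redundancy access.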

    [2] also identifies five problems that need to be solved, together with their solutions:

    1. Managing the list of empty nodes. A special node, the snode, manages the eheads and etails from which the list of empty nodes is obtained.

    2. The method for randomly choosing the nodes of a redundancy set.

    3. Preserving the integrity of the parent-child relationship of a swapped node. Lin and Candan propose two solutions: (1) determine, while at the parent node, the empty node that the child will be swapped into, and update the link in the parent before reading the child; (2) remember the path of nodes from the root to the target node and then perform the swaps bottom-up. [2] also shows that solution (1) is feasible while solution (2) is not.

    4. Problems caused by concurrent data access. In [2], the authors propose locking nodes while they are accessed, and then make adjustments to ensure that the algorithm cannot deadlock.

    5. How to choose reasonable values for the security parameters (m, the size of the redundancy sets, and s, the size of a node).

    Lin and Candan build the oblivious traversal algorithm and prove in [2] that, if the access frequencies of the nodes are uniformly distributed, the algorithm achieves user privacy.

    For the case where node access frequencies are not uniform (which is common in practice), Lin and Candan describe several solutions in [4]: dummy node access, replicating frequently accessed nodes, and the clique approach. However, [4] also notes their weakness: the number of dummy node accesses, the number of replicated nodes, or the size of the redundancy set has to be fairly large.

    From there, [4] builds a solution that achieves privacy under non-uniform access frequencies while avoiding the limitations above, called clustering node accesses into uniform chains.

    Two extreme protocols

    The solution of Lin and Candan in [2] only suits the DC-UP model. It also has several limitations [3, 22]:

    - It does not clearly specify how the list of empty nodes is updated, nor how empty nodes are reused.

    - It does not directly support insert and delete operations, especially when nodes become over-full or under-full.

    - The redundancy set contains only one empty node, so over-full and under-full cases cannot be handled.

    [1] proposes two extreme protocols for the DC-UP and DC-UP-DP models.

    DC-UP model

    DC + UP = Encryption + PIR protocol

    When the database needs to be updated, the PIR protocol is replaced with the PIS protocol. To make the scheme practical, the PIR/PIS protocols are replaced with the RIR/RIS protocols.

    DC-UP-DP model

    [1] proposes using K as a trusted third party that acts as a bridge between the clients and the service provider. The DC-UP-DP model then reduces to the DC-UP model solved earlier. However, because information is an extremely sensitive matter, finding such an organization in practice is very difficult.

    Oblivious operations on dynamic outsourced search trees

    As presented above, the algorithm of Lin and Candan in [2] does not support insert/delete operations and has several other limitations [3, 22].

    [3, 22] propose ways to overcome these limitations and also give algorithms that support insert and delete.

    However, the oblivious insert algorithm only supports the B+-tree and not other special tree structures such as SH-trees, UB-trees, R+-trees and rd-trees [3, 22]. The oblivious delete algorithm can be extended to support these special trees, but it cannot be applied when the tree structure tolerates under-full nodes. A modification of the algorithm handles under-full nodes well, but it applies only to the B+-tree.

    1.3 Query Assurance

    Query assurance guarantees that the query result returned by the server is correct (correctness), complete (completeness) and up to date (freshness).

    - Correctness: the returned results are taken exactly from the database, or derived from it (average, sum, and so on), without being altered.

    - Completeness: the returned query result is complete, with nothing omitted for any reason (the server did not fully execute the query, did not return the whole result set, or data was lost in transit).

    - Freshness: the result returned by the server reflects the most recent data updated by the DOs. Freshness matters when the outsourced data can change.

    Einar Mykletun [7] proposes a solution that guarantees correctness for read-only queries without aggregation (such as SUM or AVERAGE). Each record is stored together with its digital signature, and results are returned with their signatures. The client checks the data against the accompanying signatures to confirm correctness. However, the number of returned records can be large, so verifying a large number of per-record signatures wastes time and places a heavy burden on the client. To solve this, [7] proposes the Condensed-RSA scheme: instead of checking the signature of each record individually, the client verifies all records at once against a condensed signature returned by the server.
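    A minimal sketch of the condensed-signature idea, using textbook RSA: the condensed signature is the product of the individual signatures modulo N, and the client compares its e-th power with the product of the record hashes. The key below is a toy key built from two Mersenne primes and is an assumption for illustration only; it is not secure, and the function names are hypothetical.

```python
import hashlib
from math import prod

# Toy RSA key (insecure, illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, known to the data owner

def h(record: bytes) -> int:
    """Hash a record into the RSA domain."""
    return int.from_bytes(hashlib.sha256(record).digest(), "big") % N

def sign(record: bytes) -> int:
    """Data owner: one RSA signature per record."""
    return pow(h(record), D, N)

def condense(signatures: list[int]) -> int:
    """Server: multiply the signatures of the returned records."""
    return prod(signatures) % N

def verify(records: list[bytes], condensed: int) -> bool:
    """Client: a single modular exponentiation covers the whole result set."""
    return pow(condensed, E, N) == prod(h(r) for r in records) % N

if __name__ == "__main__":
    records = [b"1|alice", b"2|bob", b"3|carol"]
    sigs = [sign(r) for r in records]
    print(verify(records, condense(sigs)))
```

    The client performs one exponentiation regardless of the number of records, instead of one per record.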

    [7, 14] also describe another way to achieve correctness: the Merkle Hash Tree (MHT). An MHT is a tree whose leaves are the hashes of the corresponding records in the database, and whose root is signed with a standard digital signature. If the two records that bound the result on either side are returned as well, the result can also be proven complete.
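    The following sketch shows how a client can check one record against the root of a Merkle Hash Tree. It assumes SHA-256 as the hash function and that the client has already obtained the signed root from the data owner; verification of the root signature is omitted, and duplicating the last hash on odd-sized levels is one common convention rather than something prescribed by the cited work.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(records: list[bytes]) -> list[list[bytes]]:
    """Return all tree levels, leaves first; the last level holds the root."""
    level = [H(r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                       # pad odd levels with the last hash
            level = level + [level[-1]]
            levels[-1] = level
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Server: sibling hashes on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))   # (hash, sibling is on the left)
        index //= 2
    return path

def root_from(record: bytes, path: list[tuple[bytes, bool]]) -> bytes:
    """Client: recompute the root from one record and its sibling path."""
    node = H(record)
    for sibling, sibling_is_left in path:
        node = H(sibling + node) if sibling_is_left else H(node + sibling)
    return node

if __name__ == "__main__":
    records = [b"r0", b"r1", b"r2", b"r3", b"r4"]
    levels = build_levels(records)
    root = levels[-1][0]                         # signed by the data owner in practice
    print(root_from(records[3], proof(levels, 3)) == root)
```

    The proof contains one hash per tree level, so the verification cost grows logarithmically with the number of records.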

    The MHT requires a dedicated data structure to be stored alongside the data to support query assurance. Each such structure usually covers only one attribute, so a database with several searchable attributes needs several structures, which can increase the storage overhead at the server. Maithili Narasimha [10, 21] proposes a new approach based on chains of digital signatures. The signature of a record also covers the content of the record immediately preceding it (in the order given by a chosen attribute), so that the records form a continuous chain. With the result, the server also returns the two boundary records, which makes it possible to guarantee both correctness and completeness. The approach of [10, 21] requires little additional storage on the server: each record only needs one extra signature. A signature is typically 128 bytes for RSA (64 bytes for BGLS).
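    A minimal sketch of signature chaining for a range query over one sorted attribute. Several simplifications are assumptions for illustration: records are reduced to their integer keys, two sentinel keys close the chain at both ends, and signatures use a toy textbook RSA key that is not secure. It is not the exact construction of [10, 21].

```python
import hashlib

# Toy RSA key (insecure, illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))

LOW, HIGH = float("-inf"), float("inf")          # sentinel boundary keys (assumption)

def digest(prev, cur) -> int:
    """Hash a key together with its immediate predecessor."""
    data = f"{prev}|{cur}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign_chain(keys):
    """Data owner: the signature of each key also covers the key before it."""
    chain = [LOW] + sorted(keys) + [HIGH]
    sigs = {cur: pow(digest(prev, cur), D, N) for prev, cur in zip(chain, chain[1:])}
    return chain, sigs

def answer_range(chain, sigs, lo, hi):
    """Server: matching keys plus one boundary key on each side, with signatures."""
    start = next(i for i, k in enumerate(chain) if k >= lo)
    end = next(i for i, k in enumerate(chain) if k > hi)
    run = chain[start - 1:end + 1]
    return run, [sigs[k] for k in run[1:]]

def verify_range(run, run_sigs, lo, hi) -> bool:
    """Client: boundaries must enclose the range and every link must verify."""
    if len(run) < 2 or len(run_sigs) != len(run) - 1:
        return False
    if not (run[0] < lo and run[-1] > hi):
        return False
    if any(not (lo <= k <= hi) for k in run[1:-1]):
        return False
    return all(pow(sig, E, N) == digest(prev, cur)
               for prev, cur, sig in zip(run, run[1:], run_sigs))

if __name__ == "__main__":
    chain, sigs = sign_chain([31, 5, 47, 12, 20])
    run, run_sigs = answer_range(chain, sigs, 10, 35)
    print(run, verify_range(run, run_sigs, 10, 35))
```

    If the server drops a record from the middle of the result, the signature of the next record no longer matches its apparent predecessor, so the omission is detected.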

    However, the cost of building, generating and verifying signatures can be considerable: signature operations are typically 100 to 1,000 times slower than hashing. [15] proposes a solution based on the Embedded Merkle B-tree (EMB) that guarantees correctness, completeness and freshness. Its query assurance relies mainly on hash operations, which reduces both the time spent recomputing signatures when the database changes and the time spent verifying returned results. [15] is also the first solution that addresses all aspects of query assurance.

    Radu Sion [8] presents a new approach that guarantees completeness for the results of a set of queries to be executed (a batch of queries). The approach builds a protocol by extending the ringer protocol. Challenge tokens are sent along with the batch, randomly interleaved with the queries to be executed. The client knows the results of these challenges in advance and compares them with the results returned by the server. If they match, the results returned by the server are taken to be complete.
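    The interleaving idea described above can be sketched as follows. This is a simplified reading of the description in the text, not the actual ringer-based protocol of [8]: challenges are modelled as ordinary queries whose answers the client already knows, and all names are hypothetical.

```python
import random

def build_batch(queries, challenges, rng=random):
    """Client: mix real queries with challenges whose answers it already knows.

    queries    -- list of hashable query descriptions
    challenges -- list of (query, expected_result) pairs; expected_result is not None
    """
    batch = [(q, None) for q in queries] + list(challenges)
    rng.shuffle(batch)
    return batch

def check_batch(batch, results):
    """Client: accept the real answers only if every challenge answer matches."""
    answers = {}
    for (query, expected), got in zip(batch, results):
        if expected is not None:
            if got != expected:
                return None              # the server skipped or truncated work
        else:
            answers[query] = got
    return answers

if __name__ == "__main__":
    table = [3, 8, 15, 16, 23, 42]

    def honest(q):
        return [v for v in table if q[0] <= v <= q[1]]

    batch = build_batch([(0, 10), (10, 30)], [((15, 16), [15, 16])])
    # The server only ever sees the queries, never the expected answers.
    print(check_batch(batch, [honest(q) for q, _ in batch]))
```

    Since the server cannot tell challenges from real queries, it has to process the whole batch faithfully to be sure of passing the check. As with any spot check, the assurance is probabilistic and grows with the number of challenges.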

    1.4 Remarks

    The sections above gave an overview of current research and results in ODBS. These results are summarized in tree form in the appendix. From them we can draw the following observations:

    - Data confidentiality can be achieved easily by encrypting the data before it is outsourced.

    - User privacy and data privacy over encrypted data have been studied extensively for different kinds of data (XML, RDB), with very promising results that can be applied in practice.

    - Query assurance on outsourced databases (ODB): although there is now a good deal of research on query assurance, the results remain limited compared with those in the other areas. Current research can provide correctness, completeness and freshness for query execution. However, almost none of the existing approaches directly address query assurance for XML databases. Because of their particular nature, XML databases require specific adjustments before query assurance can be guaranteed.

    From the survey above, we identify the following directions for further research:

    1. Efforts to build protocols that hide users while they retrieve information (their identity, what they query, what they receive) run counter to the principle of non-repudiation in systems. Especially for confidential, highly sensitive information systems, this may give computer criminals a basis for malicious actions. Should we build a protocol that guarantees privacy but can still, when necessary, prove who retrieved which information? [1]

    2. Most current research focuses only on the DC-UP model. Although [1] presents an extreme protocol that supports the DC-UP-DP model, the protocol requires a trusted third-party server K to convert the DC-UP-DP model back into DC-UP. Eliminating K is also a problem worth studying. [1]

    3. The algorithms that support oblivious operations (insert/delete) are still incomplete in some respects. The insert algorithm does not support special trees such as the SH-tree, UB-tree, R+-tree and kd-tree. The original delete algorithm can support these trees, but if the tree structure allows under-full nodes, its modified version supports only the B+-tree [3, 22]. Another concern is that these algorithms are proven to guarantee privacy only when node access frequencies are uniformly distributed, whereas in the real world the frequencies are not uniform and differ considerably between nodes [4].

    4. The strategy of Hacıgümüş for executing SQL over encrypted data still has many security flaws [6]. Studying and fixing them remains worthwhile.

    5. Query assurance for results returned by the server has so far only reached the level of simple queries [7, 8] and only supports the DC-UP model. Supporting data update queries and more complex queries requires considerably more effort [8].

    Among these directions: (1) has been researched and developed extensively and applies well in practice. (2) is a very hard problem; at present most protocols avoid this model because of its complexity. Given the limited time and our still insufficient knowledge of security and ODB, and since this problem is largely unrelated to the XML databases named in the thesis topic, this thesis does not address it. (3) and (4) are only very small aspects and have largely been solved. For (5), as presented, there has been much research but the results are still quite limited, and no study has yet addressed query assurance for XML databases.

    Therefore, within its scope, this thesis presents an approach to the query assurance problem in XML databases. The next chapter presents the related results in the field of query assurance in more detail.


    Chng 2

    CC NGHIN CU LIN QUAN

    2.1 Concepts

    As discussed, query assurance is a security requirement that matters in almost every ODBS model. It can be defined by three properties that must hold: correctness, completeness and freshness.

    Fully solving query assurance is still a hard problem that needs further work. Several approaches have emerged so far:

    - Use digital signatures to certify that each row returned by the server is authentic and has not been altered by the server or in transit [7]. On its own this guarantees only correctness, not the other two properties. Recent work extends the use of digital signatures to cover completeness as well [10, 21], but the third property, freshness, is still not addressed.

    - Use specialised data structures to solve correctness and completeness. The Merkle Hash Tree (MHT) is a typical structure of this kind [11, 14].

    - Apply results from distributed computing [13]. One such result uses a challenge-response model to guarantee completeness when processing a batch of queries. Its advantage is that it handles queries of any form [8]. As noted, however, it applies only to batches of queries and is therefore unsuitable for individual queries.

    - Exploit the specifics of the data and of the queries to build a dedicated protocol that meets the query assurance requirements. Keyword-based search over encrypted data is one example [12].

    The following sections describe these approaches in more detail.

    2.2 Digital signature approach

    Using digital signatures to prove the correctness of data is a solution in use today [7]. In the unified client model, where the single client is also the data owner (DO), the digital signature can be replaced by a one-way hash function that is computationally infeasible to invert [8].

    Authentication can be performed at several levels of granularity: a whole table (the entire relation), a column (an attribute of the relation) or a row (a record). Table-level authentication requires the whole table to be returned before anything can be verified, which is impractical because most queries return only part of the table (a few rows). The same holds for column-level authentication. Row-level authentication is therefore the best choice¹, and each row must store its own signature alongside the relation's data.

    ¹ Another option is field-level authentication, but it overloads both storage and computation, because verifying a digital signature is itself fairly expensive.

    The design of an authentication protocol has to consider the following factors [7]:

    - Client computation: the cost for the client to verify the correctness of a row.

    - Bandwidth to the client.

    - Server computation: retrieving and returning the verification data along with the query result.

    - Data owner computation: the cost of computing the verification data before it is stored in the database.

    - Storage space required on the server.

    The first three factors deserve the most attention [7]. For a dynamic outsourced database, however, the fourth factor must also be considered, since it is incurred on every update. The fifth factor matters little, because high-capacity storage keeps getting cheaper.

    The most widely used signature scheme is RSA with 1024-bit signatures (RSA-1024 was estimated at the time to remain secure for several decades). When a query returns many rows, however, sending one signature per row wastes bandwidth as well as the computation time needed to authenticate the data. One remedy is Condensed-RSA, an aggregate signature scheme. Given $t$ messages $\{m_1, \ldots, m_t\}$ with corresponding signatures $\{\sigma_1, \ldots, \sigma_t\}$, the Condensed-RSA signature is

    $$\sigma_{1,t} = \prod_{i=1}^{t} \sigma_i \pmod{n}$$

    Verifying $\sigma_{1,t}$ is equivalent to verifying the $t$ individual signatures $\sigma_i$. A further advantage is that a Condensed-RSA signature has the same size as a standard RSA signature. So instead of returning the signature of every row, the server computes the Condensed-RSA signature and returns it to the client, which can then authenticate the data.
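The aggregation step can be illustrated with a minimal sketch. It assumes textbook RSA without padding, a toy modulus built from two small Mersenne primes, and SHA-256 as the message digest; the names `sign`, `condense` and `verify_condensed` are hypothetical and chosen only for this illustration, not taken from [7].

```python
import hashlib

# Toy key (assumption): far too small for real use, chosen so the sketch runs instantly.
P, Q = 2**61 - 1, 2**31 - 1          # two Mersenne primes
N = P * Q                            # RSA modulus n
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

def digest(message: bytes) -> int:
    """Hash a row to an integer modulo n."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Per-row signature sigma_i = h(m_i)^d mod n, computed by the data owner."""
    return pow(digest(message), D, N)

def condense(signatures) -> int:
    """Server side: sigma_{1,t} = product of all sigma_i, modulo n."""
    aggregate = 1
    for sigma in signatures:
        aggregate = aggregate * sigma % N
    return aggregate

def verify_condensed(messages, aggregate: int) -> bool:
    """Client side: sigma_{1,t}^e must equal the product of the row digests."""
    expected = 1
    for message in messages:
        expected = expected * digest(message) % N
    return pow(aggregate, E, N) == expected

rows = [b"row-1", b"row-2", b"row-3"]
condensed = condense(sign(row) for row in rows)
print(verify_condensed(rows, condensed))
```

The client performs one modular exponentiation regardless of the number of rows, which is where the bandwidth and verification savings come from.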

    Maithili and G. Tsudik [10, 21] propose an approach, called Digital Signature Aggregation and Chaining (DSAC), that provides security and efficiency for basic queries without requiring any additional complex data structure. For correctness, [10, 21] reuse the approach of [7], so the rest of this section covers only how completeness is achieved.

    Completeness

    Completeness is achieved by building a secure link between the signatures of the records, called a signature chain. The chain is obtained by changing how each record's signature is computed:

    $$\mathrm{Sign}(r) = h\big(h(r) \,\|\, h(\mathrm{IPR}_1(r)) \,\|\, \cdots \,\|\, h(\mathrm{IPR}_l(r))\big)_{SK}$$

    where $h()$ is a cryptographic hash function (such as SHA), $\mathrm{IPR}_i(r)$ is the immediate predecessor record of $r$ along dimension $i$, $l$ is the number of searchable dimensions, and $SK$ is the data owner's private key. The immediate predecessors of each record are determined by sorting the relation R along each searchable dimension, as in the following figure:

    Figure 2.3. Relation R sorted along the query dimensions.

    The immediate predecessors of R5 are R6, R2 and R7 respectively, so the signature of R5 is $\mathrm{Sign}(R_5) = h\big(h(R_5) \,\|\, h(R_6) \,\|\, h(R_2) \,\|\, h(R_7)\big)_{SK}$.

    Completeness is proven, in principle, as in AuthDS. To prove the result of a query, the server returns the signature chains of the two boundary records of the result together with those of the two records adjacent to the boundaries. From these the client can establish that the result is complete.
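A one-dimensional signature chain can be sketched as follows. The sketch makes two simplifying assumptions that are not part of [10, 21]: an HMAC stands in for the data owner's private-key signature (so the verifier shares the key, as in the unified client model), and a hypothetical sentinel `FIRST` acts as the predecessor of the smallest record.

```python
import hashlib
import hmac

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

FIRST = b"<first>"  # hypothetical sentinel: predecessor of the smallest record

def sign(owner_key: bytes, record: bytes, predecessor: bytes) -> bytes:
    """Sign(r) = h(h(r) || h(IPR(r))), 'signed' here with an HMAC stand-in."""
    return hmac.new(owner_key, h(h(record) + h(predecessor)), hashlib.sha256).digest()

def build_chain(owner_key: bytes, records):
    """Owner side: sort the records and sign each one together with its predecessor."""
    ordered = sorted(records)
    predecessors = [FIRST] + ordered[:-1]
    return [(r, sign(owner_key, r, p)) for r, p in zip(ordered, predecessors)]

def verify_run(owner_key: bytes, predecessor: bytes, run) -> bool:
    """Client side: check that `run` is a gap-free stretch of the chain after `predecessor`."""
    for record, signature in run:
        if not hmac.compare_digest(signature, sign(owner_key, record, predecessor)):
            return False
        predecessor = record
    return True

key = b"owner-key"
chain = build_chain(key, [b"r30", b"r10", b"r40", b"r20"])
print(verify_run(key, b"r10", chain[1:3]))            # contiguous run r20, r30
print(verify_run(key, b"r10", [chain[1], chain[3]]))  # r30 omitted
```

Dropping a record from the middle of the answer breaks the link of its successor, which is what lets the client detect an incomplete result.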

    2.3 Approach based on special data structures

    Another way to meet the requirements of query assurance is to use special data structures that store information supporting proofs of correctness and completeness.

    The idea of this approach is illustrated as follows [14]:

    Figure 2.4. Query verification model.

    The summary signature is computed recursively, bottom-up, by hashing over the whole index tree (B-tree) covering all records of a relation, and the resulting value is signed with sk0. A publisher executes the user's queries and returns each result together with another data structure, the verification object, which proves that the result is correct and complete.

    This approach has the following properties [14]:

    - The user only needs to trust the owner's key pk0. The owner recomputes the summary signature only when the database is updated, so the private key sk0 can be kept offline, out of reach of network attacks. The key can also be held in hardware.

    - The user does not need to trust the publishers, so if a publisher is compromised, the only consequence is the loss of the service that this publisher provides.

    - The size of the verification object is linear in the size of the query result and logarithmic in the size of the database.

    - The verification object guarantees that the answer is correct and complete.

    - The cost of computing the summary signature and the verification object (VO), and of checking the VO, is acceptable.

    A typical structure in this approach is the Merkle Hash Tree (MHT) [11]. An MHT is built over the sorted set of values x1, x2, ..., xn of one attribute of a relation. Each leaf is associated with a value xi and stores h(xi), where h() is a one-way hash function such as MD5 or SHA-1. Each internal node stores the hash of the concatenation of its children's values: if v has two children v1 and v2, the value of v is h(v1||v2). Finally, the value at the root is authenticated with a digital signature.

    Correctness

    To prove that a query result is correct, the server returns a VO containing the co-path of the returned node. The co-path of a node is the set of other nodes from which the value of the root can be recomputed. Because the root is signed, the client compares the recomputed value with the signed root, and the server thereby proves that its answer is correct. In the MHT below, when the query result is node 5, the server also returns h1 and h34, from which the root is easily computed, as the figure shows.

    root = h(h12||h34)

    h12 = h(h1||h2)    h34 = h(h3||h4)

    h1 = h(3)    h2 = h(5)    h3 = h(6)    h4 = h(9)

    Figure 2.5. Binary Merkle Hash Tree.

    With the result {5}, the server also returns {h1, h34, sign(root)}. The client computes h12 = h(h1||h(5)) and root = h(h12||h34), checks this root against the signature of the root, and is thereby assured that the result is correct.
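The co-path check for the example tree with leaves 3, 5, 6 and 9 can be sketched as below. The sketch assumes SHA-256 as h(), a number of leaves that is a power of two, and a co-path encoded as (side, hash) pairs; the function names are hypothetical, and the signature over the root is left out.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves) -> bytes:
    """Owner side: hash the leaves, then hash pairs upward until one node remains."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def root_from_copath(value: bytes, copath) -> bytes:
    """Client side: rebuild the root from a returned value and its co-path.

    Each co-path entry is (side, sibling_hash); side "L" means the sibling
    is the left child at that level, "R" means it is the right child.
    """
    node = h(value)
    for side, sibling in copath:
        node = h(sibling + node) if side == "L" else h(node + sibling)
    return node

leaves = [b"3", b"5", b"6", b"9"]
signed_root = merkle_root(leaves)   # the owner would sign this value

h1 = h(b"3")
h34 = h(h(b"6") + h(b"9"))
copath_of_5 = [("L", h1), ("R", h34)]
print(root_from_copath(b"5", copath_of_5) == signed_root)
```

The verification object for one value contains one hash per tree level, which is why its size grows logarithmically with the number of leaves.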

    Completeness

    Consider first the case where the server answers that no record in the database satisfies the query. The server must then prove this claim with what are called empty proofs. It can do so by returning the co-paths of two adjacent nodes whose values enclose the queried range.

    Completeness of a non-empty answer is obtained by attaching empty proofs for the leaves adjacent to the two boundary nodes of the result.

    A major limitation of AuthDS is that a complex data structure must be maintained alongside the actual data. The structure has to be fully computed before it is uploaded to the server, and every update incurs a significant cost to recompute the values in it [8, 10, 21]. Moreover, guaranteeing correctness and completeness for range queries requires one structure per attribute and per sort order [10, 21].

    2.4 Challenge-response approach

    Radu Sion [8] proposes a protocol that guarantees correctness and completeness for queries of any form, by extending the ringer protocol from distributed computation.

    The ringer protocol was introduced to prevent cheating in the computation of sub-problems. It has several variants, such as basic ringer, bogus ringer and hybrid ringer (magic ringer). Consider the problem of recovering the original text from a text encrypted with DES. Each workstation enumerates candidate strings, applies DES to each one, and if the output matches the encrypted string, the candidate is the original text. The basic ringer idea in [13] can be summarised as follows:

    - The supervisor picks a random value xi in the domain Di assigned to workstation i and computes yi = DES(xi). It then sends workstation i the values yi and y, where y is the encrypted string to be broken.

    - Workstation i receives the domain Di and performs the computation. If the computation is complete, the workstation is certain to find xi, and possibly x (where y = DES(x)). It returns xi (and x, if found) to the supervisor, which then knows that workstation i really did all of its work.
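The basic ringer exchange can be sketched as follows. To keep the example self-contained, SHA-256 stands in for DES as the function the workstations evaluate (an assumption made only for this illustration), the domain is a range of integers, and the `fraction` parameter is a hypothetical knob that models a workstation scanning only part of its domain.

```python
import hashlib

def f(x: int) -> bytes:
    """Stand-in for DES(x): any function that is hard to invert serves the sketch."""
    return hashlib.sha256(str(x).encode()).digest()

def make_task(domain, ringer_x: int, target_y: bytes):
    """Supervisor side: hide the planted image y_i next to the real target y."""
    return {"domain": domain, "targets": {f(ringer_x), target_y}}

def worker(task, fraction: float = 1.0):
    """Workstation side: scan (part of) the domain and report every preimage found."""
    items = list(task["domain"])
    scanned = items[: int(len(items) * fraction)]
    return [x for x in scanned if f(x) in task["targets"]]

def supervisor_accepts(ringer_x: int, answer) -> bool:
    """The work counts as complete only if the planted value x_i comes back."""
    return ringer_x in answer

task = make_task(range(1000), ringer_x=700, target_y=f(1234567))
print(supervisor_accepts(700, worker(task)))                 # full scan
print(supervisor_accepts(700, worker(task, fraction=0.5)))   # half scan
```

Because the workstation cannot tell which target is the planted one, skipping part of the domain risks missing it.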

    To implement this, [8] organises the data as follows. Let S be the outsourced data. S is split into segments Si, and each Si is identified by a hash value that guarantees the data is accurate and unaltered. This value, the identity hash, is used to authenticate identity queries, which are queries that return all the data in Si.

    Queries are executed as follows:

    - Into the batch of queries Q = {Q1, Q2, Q3, ..., Qa} to be executed, the querier inserts a query Qx at an arbitrary position; the querier knows the result of Qx in advance. The querier also computes a challenge token $\{H(\varepsilon \,\|\, \rho(Q_x)),\ \varepsilon\}$, where $H()$ is a non-invertible one-way hash function, $\varepsilon$ is a value unique in time (a timestamp) that makes the challenge token unique, and $\rho(Q_x)$ is the result of Qx, known in advance to the querier.

    - The server's task is to execute the queries and determine x by applying H() to the results. It returns x together with the query results.
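The token exchange can be sketched as follows, under stated assumptions: query results are modelled as byte strings, SHA-256 plays the role of H(), and a random nonce is used where the text describes a time-unique value. The function names are hypothetical.

```python
import hashlib
import os

def H(nonce: bytes, result: bytes) -> bytes:
    """One-way hash over the nonce concatenated with a query result."""
    return hashlib.sha256(nonce + result).digest()

def make_challenge(known_result: bytes):
    """Querier side: build the token {H(nonce || result(Qx)), nonce}."""
    nonce = os.urandom(16)
    return H(nonce, known_result), nonce

def server_find_index(results, token):
    """Server side: hash each computed result until one matches the token."""
    digest, nonce = token
    for index, result in enumerate(results):
        if H(nonce, result) == digest:
            return index
    return None  # the matching query was not executed

batch_results = [b"result-0", b"result-1", b"result-2"]
token = make_challenge(b"result-1")        # the querier knows this result in advance
print(server_find_index(batch_results, token))
```

A server that skips the query behind the token cannot produce the index, so a missing or wrong x exposes the incomplete execution.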

    With a single challenge token, the server could stop executing the remaining queries (or execute them incompletely) once it has found the token. [8] presents several ways around this.

    The first is to use several challenge tokens instead of one. This does not solve the underlying problem, however: the server can still stop executing the remaining queries once it has identified all the tokens. Generating many challenge tokens is also a real computational burden for thin clients (queriers) such as mobile devices. Another proposal is to use fake tokens. A fake token is a bogus challenge token that the querier generates at random, with no regard to any result the server returns. The set of tokens sent to the server then contains r real challenge tokens and f fake ones, and both r and f may vary from batch to batch. The server therefore cannot tell how many real challenge tokens it was sent, and is forced to execute every query in full in order to find them.

    These methods address only read queries (select queries), not queries that update or insert data. An update is performed by reading the entire segment that contains the row to be updated (or that will contain the inserted row), applying the change, recomputing the identity hash and writing the segment back to the server. The handling of updates is still only a first step [8].

    2.5 Problem-specific approach

    Building a protocol that meets the query assurance requirements in the general case is clearly extremely hard. One way to solve the problem fairly completely is therefore to solve it for specific kinds of data and queries.

    Radu Sion and Bogdan Carbunar [12] present a protocol for keyword-based queries over encrypted data that provides both privacy and query assurance.

    Documents are stored as separate units (files), and each document has a fixed number of keywords. The keywords of each document are listed before it is uploaded to the server. The task is to retrieve documents by one or more given keywords.

    Each document is assigned a random, unique identifier di that is unrelated to its content. Since the set of keywords is fixed in advance, for each keyword we build the set of identifiers of the documents containing it, called a KDS (keyword document set). The KDSs can be kept at the queriers themselves, or on the server, in which case they must be encrypted and authenticated. Each KDS can be encrypted under a different key, computed as Keyi = H(key || ki), where H() is a one-way hash function, key is a shared key, and ki is the keyword corresponding to KDSi. To prevent the server from dropping an entry of a KDS, each KDS is extended with a check value computed as Hcheck = H(d4||H(d3||H(d2||H(d1||0)))), assuming KDS = {d4, d3, d2, d1}.
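The per-keyword key and the nested check value can be written out as a short sketch. It assumes SHA-256 as H(), byte-string identifiers, and the byte `b"0"` as the initial value of the chain; the function names are hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def kds_key(shared_key: bytes, keyword: bytes) -> bytes:
    """Key_i = H(key || k_i): a distinct encryption key for each keyword's KDS."""
    return H(shared_key + keyword)

def kds_check(doc_ids) -> bytes:
    """Hcheck = H(d_n || ... H(d2 || H(d1 || 0))), folding in d1 first."""
    acc = b"0"
    for doc_id in doc_ids:
        acc = H(doc_id + acc)
    return acc

d1, d2, d3, d4 = b"doc-17", b"doc-42", b"doc-58", b"doc-91"
stored_check = kds_check([d1, d2, d3, d4])

# A KDS with one entry removed no longer matches the stored check value.
print(kds_check([d1, d2, d4]) == stored_check)
```

Since every identifier feeds into the next hash, removing or reordering any entry changes the final value.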

    One way to store the KDSs is as a matrix whose columns are the identifiers of all documents and whose rows are all keywords; each cell is 0 or 1 according to whether the document contains the keyword. This matrix, C, is transformed into C' as follows:

    $$C'_{i,j} = \mathrm{last\_bit}\big(F(k_i, R_j, C_{i,j})\big)$$

    where F is a bitwise pseudo-random function and Rj is a random number produced by a random number generator G from a fixed random seed R.

    A query {k1, k2, ..., kq} is executed by asking the server for the rows corresponding to the keywords. The querier regenerates the values Rj and recovers C as follows: if last_bit(F(ki, Rj, 0)) = C'ij then Cij = 0, otherwise Cij = 1. Having recovered C, the querier knows exactly which document identifiers it needs, and can ask the server for precisely those documents.
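The decoding rule works only if flipping the stored bit always flips the last bit of F. The sketch below assumes one instantiation with that property: the last bit of F(k, R, c) is a pseudo-random mask bit, derived from an HMAC of R under the keyword, XORed with c. This choice of F, the use of Python's `random.Random` as the generator G, and all function names are assumptions of the sketch, not the construction of [12].

```python
import hashlib
import hmac
import random

def mask_bit(keyword: bytes, r_j: int) -> int:
    """Pseudo-random bit derived from the keyword k_i and the column value R_j."""
    mac = hmac.new(keyword, r_j.to_bytes(8, "big"), hashlib.sha256).digest()
    return mac[-1] & 1

def last_bit_F(keyword: bytes, r_j: int, c: int) -> int:
    """Assumed F: its last bit is the mask bit XOR the cell value c."""
    return mask_bit(keyword, r_j) ^ c

def column_randoms(seed: int, n_docs: int):
    """G: regenerate R_1..R_n from the fixed seed R (not a secure generator)."""
    generator = random.Random(seed)
    return [generator.getrandbits(64) for _ in range(n_docs)]

def encode_row(keyword: bytes, row, seed: int):
    """Owner side: C'[i][j] = last_bit(F(k_i, R_j, C[i][j]))."""
    return [last_bit_F(keyword, r, c) for r, c in zip(column_randoms(seed, len(row)), row)]

def decode_row(keyword: bytes, encoded, seed: int):
    """Querier side: C[i][j] = 0 if last_bit(F(k_i, R_j, 0)) == C'[i][j], else 1."""
    randoms = column_randoms(seed, len(encoded))
    return [0 if last_bit_F(keyword, r, 0) == e else 1 for r, e in zip(randoms, encoded)]

row = [1, 0, 0, 1, 1]                      # which documents contain the keyword
stored = encode_row(b"xml", row, seed=42)  # what the server holds
print(decode_row(b"xml", stored, seed=42) == row)
```

Without the keyword and the seed, the stored row is a sequence of masked bits that does not reveal which documents match.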

    Because the querier knows exactly which document identifiers to expect, it can naturally check the completeness of the result. To preserve privacy, PIR-like protocols can be used.

    2.6 Query assurance for tree-structured data

    A further approach builds on the technique of Lin and Candan [2] for preserving privacy. [16] extends it to provide all three query assurance properties, essentially as follows.

    - For correctness, every record carries its own (RSA) signature. When the correctness of many records has to be checked, Condensed-RSA can be used.

    - Completeness follows naturally, because the querier requests a known number of nodes and can compare it with the number of nodes the server returns. In addition, to make sure the returned nodes are the requested ones, the encrypted data of each node stores the node's own identifier (nodeID). Since the set of requested identifiers is known, comparing it with the identifiers of the returned nodes proves that these are indeed the desired nodes.

    - Freshness is achieved by adding the following information:

    o Each node stores a timestamp giving the time of its last update.

    o A parent node stores the timestamps of all its children.

    o The timestamp of the root is made known to all queriers.

    As a traversal proceeds from the root down to the children, the querier can use the root timestamp it already knows to confirm that the content of the root is fresh. From the children's timestamps stored at the root, it can then confirm the freshness of those children, and so on down to a leaf, whose freshness is thus proven.
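The top-down freshness check can be sketched as below. The `Node` layout and the function `is_fresh` are hypothetical, timestamps are plain integers, and the sketch assumes the nodes have already been decrypted and individually authenticated.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    timestamp: int                                         # time of this node's last update
    child_timestamps: dict = field(default_factory=dict)   # child id -> expected timestamp

def is_fresh(published_root_ts: int, path) -> bool:
    """Walk a root-to-leaf path, checking each node against the timestamp
    that its parent (or the published root value) says it should carry."""
    expected = published_root_ts
    for position, node in enumerate(path):
        if node.timestamp != expected:
            return False                    # stale or replayed node
        if position + 1 < len(path):
            child = path[position + 1]
            if child.node_id not in node.child_timestamps:
                return False                # not a recorded child of this node
            expected = node.child_timestamps[child.node_id]
    return True

leaf = Node("n3", 120)
inner = Node("n2", 180, {"n3": 120})
root = Node("n1", 200, {"n2": 180})
print(is_fresh(200, [root, inner, leaf]))

old_leaf = Node("n3", 90)                   # an outdated version of the leaf
print(is_fresh(200, [root, inner, old_leaf]))
```

Only the root timestamp has to be distributed to queriers; every other timestamp is vouched for by the node above it.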

    Remarks

    The approach of [16] provides privacy (both user privacy and data privacy), by reusing the redundant data access and node swapping techniques of Lin and Candan [2, 4], and also provides query assurance. It only suits search-tree structures such as the B-Tree, B+Tree and R-Tree, however, which are commonly used as indexes over other data.


    2.7 Remarks

    Apart from the problem-specific solution in Section 2.5, all the methods above essentially rely on digital signatures, or on the computational one-wayness of secure hash functions, to prove that a query result is correct and complete.

    The approaches of Maithili and G. Tsudik [10, 21] and of Prem Devanbu [14] prove exactly that a result is correct and complete. At present, however, these protocols handle only simple read queries without aggregate functions (such as SUM or AVERAGE). A further weakness is that they depend on the form of the query, which has to be decomposed into parts so that the appropriate operation can be applied to each.

    The approach of Radu Sion [8] applies to every kind of query, including the aggregate functions that [10, 14, 21] do not handle. Its main advantage is that queries need not be parsed, which makes deployment easier. Several points still need consideration, however:

    - It applies only to batches of queries, not to the execution of individual queries, which is very common in practice. Possible remedies are: (1) attach fake queries to turn a single query into a batch, although this can overload the server and degrade the whole system, because far more fake queries than real ones have to be executed; (2) collect individual queries at a trusted server and forward them to the server as a batch, in keeping with the original scheme, although this is hardly practical because the delay a query suffers while waiting is unacceptable; (3) combine (1) and (2).

    - It does not fully prove that the result is complete. The probability that the server skips the last query, or executes it incompletely, is 33% [8]. This is rather high, and it reduces confidence in the solution.

    All of the approaches above were designed for relational data, so applying them to XML databases requires some changes.


    Chapter 3

    XML DATA

    XML is a semi-structured, tree-structured data format. Its units of information are nodes and attributes, which are distinguished by their names and their scope (depth in the tree, parent node). XML data is human-readable text. Before outsourcing it, we must therefore settle on a storage structure suited to the XML data model. This choice is especially important, because it strongly affects the methods used for query processing and query assurance.

    3.1 Storage models

    As in a traditional RDB, every XML document is characterised by a schema that defines the parent-child relationships between nodes and the attributes of each node. This schema is naturally a tree, called the schema tree.

    Each node of the XML document (XML element) corresponds to a t-node in the logical tree, and each attribute of an XML element corresponds to an a-node. Figure 3.6 shows an example of the logical data tree and the structure tree derived from an XML document.

    From the structure tree, the XML document is easily converted to other storage forms. This thesis presents two common ways of storing XML documents: table-based and node-based.


    (A) Logical data tree: Root with two Customer t-nodes (a-nodes Name = Bob and Name = Alice) and two Order t-nodes (a-nodes Code = 01, Amount = 5000 and Code = 02, Amount = 10000).

    (B) Structure tree of the XML document: Root, Customer (a-node Name), Order (a-nodes Code, Amount).

    Figure 3.6. Logical tree structure of an XML document.

    In the logical data tree and the structure tree, each rounded rectangle represents a node (element) of the XML document, and each square-cornered rectangle represents an attribute of an element.

    3.1.1 Table-based storage

    The structure tree of a document can be converted into a relational schema in the following steps.

    - Label the structural t-nodes so that each node has a unique label.

    - Turn each structural t-node into a table named after the t-node combined with its label. The a-node children of the t-node become the columns of the table. Each table gets an extra column nodeid, the identifier of the node in the data table. If the t-node has a parent, add a column pnodeid that references the table generated from the parent t-node.

    Starting from the structure tree in Figure 3.6(B), we label the nodes:

    Figure 3.7. The structure tree after labelling.

    and then derive the following relational schema.

    Root_01(nodeid)

    Customer_02(nodeid, name, pnodeid)

    Order_03(nodeid, code, amount, pnodeid)

    Each table also needs a few extra columns such as timestamp and sign, whose values later help prove the returned query results.
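The two conversion steps can be sketched as a short function. The dictionary layout of the structure tree and the name `to_schema` are hypothetical; labels are assigned in depth-first order, which reproduces the labels of the example, and the extra timestamp and sign columns are left out.

```python
def to_schema(structure_tree):
    """Turn a structure tree into table definitions, one per t-node.

    Each t-node is a dict with a "name", an optional list of a-node names
    under "attrs", and an optional list of child t-nodes under "children".
    """
    tables = []
    counter = 0

    def visit(node, has_parent):
        nonlocal counter
        counter += 1                                   # unique label for this t-node
        columns = ["nodeid"] + [a.lower() for a in node.get("attrs", [])]
        if has_parent:
            columns.append("pnodeid")                  # reference to the parent's table
        tables.append(f"{node['name']}_{counter:02d}({', '.join(columns)})")
        for child in node.get("children", []):
            visit(child, True)

    visit(structure_tree, False)
    return tables

structure_tree = {
    "name": "Root",
    "children": [{
        "name": "Customer",
        "attrs": ["Name"],
        "children": [{"name": "Order", "attrs": ["Code", "Amount"]}],
    }],
}

for table in to_schema(structure_tree):
    print(table)
# Root_01(nodeid)
# Customer_02(nodeid, name, pnodeid)
# Order_03(nodeid, code, amount, pnodeid)
```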

    Advantages

    Once converted to a relational schema, the XML document can reuse earlier results for relational data. DSAC [10, 21] or the EMB Tree [15] can be applied to provide query assurance.

    Disadvantages

    From the user's point of view the outsourced database is an XML document, so queries are normally written in an XML query language (XPath, XQuery, ...). A translation step from that language to ordinary SQL is therefore required.

    Another concern is that the schema of the database can change dynamically at any time. Changing the schema of an XML document is quite flexible, but it forces a change to the corresponding RDB tables. This affects the stored data badly: with DSAC, for example, adding a column to a table can require recomputing all the digital signatures and re-encrypting all the data, which is not feasible once the data has been outsourced.

    Despite these drawbacks, the method can still be used when the schema of the XML database does not change, in order to benefit from results that are well studied for relational databases. This thesis takes a different approach, based on the second storage method: node-based.

    3.1.2 Node-based storage

    Another way to store an XML database is to store the t-nodes and a-nodes of the logical data tree.

    As in the previous method, the structural nodes (both t-nodes and a-nodes) are labelled first; the labelling is the same, except that it now includes a-nodes. The relational store then consists of two tables, one for t-nodes and one for a-nodes:

    t-node(nodeid, xtype, datatype, nameid, pnodeid, lmaid, value)

    a-node(nodeid, xtype, datatype, nameid, pnodeid, sibid, value)

    where²:

    - NodeID: the identifier of the node.

    - XType: distinguishes the kinds of objects.

    - Datatype: identifies the data type.

    - NameID: the identifier of the node's name (for both t-nodes and a-nodes). Names are distinguished by the context in which they appear, and each name is identified by a number that is unique across the database.

    - PNodeID: the identifier of the parent t-node of the current t-node. Note that for an a-node, pnodeid is the identifier of the parent t-node of the t-node that contains the a-node.

    - LMAid: the identifier of the leftmost a-node.

    - SibID: the identifier of the right sibling a-node.

    ² Besides these fields, further information may have to be added to t-node and a-node, depending on the authentication algorithm used.

    With this storage format, a schema change (adding or removing an attribute) is just a simple insert or delete that affects only the current node. It therefore incurs no additional cost while still meeting the security requirements.

    1. Advantages

    The format reflects the tree nature of an XML document. It thus removes the drawback of the table-based method: changing the structure of the XML document has little effect on the stored content, and affects only the node being updated.

    Earlier results [2, 16] can be used to provide query assurance.

    2. Disadvantages

    Because every t-node and a-node is stored as a separate record, the number of records can become very large compared with table-based storage, which makes the database more complex.

    In addition, RDB indexing techniques are not very effective on tree-structured data such as XML, while indexing methods dedicated to XML are still under development.


    3.1.3 Remarks

    Of the two storage methods above, the table-based one converts the XML document into the tables of a traditional RDB, so that existing query assurance techniques [10, 11, 14, 15, 21] can be reused. It has one major drawback: when the structure of the XML document changes through the addition of an entirely new node (or a new attribute), the structure of the tables changes with it. A considerable amount of computation is then needed to keep the query assurance structures valid (re-signing records, re-encrypting data, rebuilding signature chains or other complex index structures, and so on).

    This thesis adopts node-based storage and proposes an indexing structure dedicated to XML documents, in which information is embedded to prove correctness, completeness and freshness.

    3.2 Indexing XML documents

    The index is a central concept in databases. It considerably speeds up data retrieval compared with a classical sequential scan: in the ideal case, an indexed search is $N/\log_2 N$ times faster than a sequential one.

    In relational databases, indexes are used very effectively and are available in almost every RDBMS. The common index structures are hash tables, bitmaps and tree-based indexes.

    For semi-structured, tree-shaped databases such as XML databases, there is a good deal of research on building suitable indexes [17, 18]. This thesis neither details nor compares the indexing methods for XML documents; it only proposes one index structure for XML documents into which information can be embedded for the purpose of query assurance.

    Among indexing methods, tree-based ones are widely used. A typical example is the B+Tree, which is applied very successfully for indexing in current RDBMSs. By its nature, a B+Tree can index a large number of records at low complexity (a B+Tree with a fanout of 100 and a height of 4 can manage 100 x 100 x 100 = 1,000,000 records).


    Chapter 4

    QUERY ASSURANCE

    4.1 Method

    Query assurance aims to prove to the user that the query result returned by the server is correct, complete and fresh. Correctness is fairly easy to achieve with digital signatures. Proving completeness usually depends on the properties of each kind of query.

    Consider two kinds of queries: range queries (values within an interval) and point queries (values equal to a given value). A point query is a degenerate range query whose two bounds coincide, so only range queries need to be considered.

    Consider a range query with bounds LB and UB. The result consists of the records that satisfy it.

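    $$S = \{\, R \mid R \ge LB,\ R \le UB \,\}$$

    If the records R are guaranteed to be sorted in ascending (or descending) order, then to prove that the result is complete, the server only has to return two additional records, one at each boundary:

    $$S' = S \cup \{\, R_L \mid R_L = \max(R_i),\ R_i < LB,\ \forall i \,\} \cup \{\, R_U \mid R_U = \min(R_j),\ R_j > UB,\ \forall j \,\}$$

    If the server can prove that only the returned records lie between RL and RU, the result is proven to be complete.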

    The records therefore have to be sorted, and the order has to be provable. A simple way to achieve this is to sort R and then compute the following hash value:

    $$S = h\big(h(R_0) \mid h(R_1) \mid h(R_2) \mid \cdots \mid h(R_N)\big)_{SK}$$

    where h is a secure hash function such as SHA1 or MD5, and S here denotes the signed digest rather than the result set above. The hash value is signed with an asymmetric algorithm (such as RSA). With the result, the server returns the remaining h(Ri) and the value S. The client recomputes the hash of the records by the rule above and checks it against S using the public key of the signature algorithm.
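The range proof described above can be sketched end to end. The sketch assumes integer records, SHA-256 as h, and an HMAC as a stand-in for the asymmetric signature (so the verifier shares the key, which a real deployment would replace with public-key verification). The function names are hypothetical.

```python
import hashlib
import hmac

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def enc(record: int) -> bytes:
    return str(record).encode()

def sign_all(key: bytes, records) -> bytes:
    """Owner side: S = signature over h(h(R0) | h(R1) | ... | h(RN)), records sorted."""
    digest = h(b"".join(h(enc(r)) for r in sorted(records)))
    return hmac.new(key, digest, hashlib.sha256).digest()

def answer(records, lb: int, ub: int):
    """Server side: the matching records plus RL and RU, and hashes of everything else."""
    ordered = sorted(records)
    lo = max((i for i, r in enumerate(ordered) if r < lb), default=0)
    hi = min((i for i, r in enumerate(ordered) if r > ub), default=len(ordered) - 1)
    before = [h(enc(r)) for r in ordered[:lo]]
    after = [h(enc(r)) for r in ordered[hi + 1:]]
    return before, ordered[lo:hi + 1], after

def verify(key: bytes, signature: bytes, before, window, after, lb: int, ub: int) -> bool:
    """Client side: recompute the digest and check order and both boundaries."""
    digest = h(b"".join(before + [h(enc(r)) for r in window] + after))
    expected = hmac.new(key, digest, hashlib.sha256).digest()
    boundaries_ok = (not before or window[0] < lb) and (not after or window[-1] > ub)
    return hmac.compare_digest(signature, expected) and window == sorted(window) and boundaries_ok

key = b"owner-key"
data = [27, 5, 41, 12, 33, 20]
S = sign_all(key, data)
before, window, after = answer(data, lb=10, ub=30)
print(window)                                            # [5, 12, 20, 27, 33]
print(verify(key, S, before, window, after, 10, 30))
```

In this form a proof carries one hash for every record outside the window, so its size grows with the whole data set rather than with the answer.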

    4.2 Nested B+ Tree

    As shown above, the basic method for proving completeness relies on an ordered sequence of records and a signature over that sequence. This can be achieved with an index structure that fits the storage format described in Section 3.1.

    In an XML document, a node is located by its path, the route from the root to the node, and XML queries usually specify a path. Besides the value of a node, an XML index must therefore also hold the path of the node. In the storage format described above, nameid is unique for every structural node, so it can stand in for the path. Each node thus has to be indexed on the pair (nameid, value).

    Besides queries by value, however, the parent-child nature of XML trees means that queries for the children of a known parent are frequent. To prove completeness for these queries, an additional index on the triple (nameid, pnodeid, value) is needed, so that records are sorted by parent.

    Two separate index trees could be built for the two key sets, but this complicates updates and query proofs and wastes storage (since both share the leading attribute nameid).


    [Figure: a NameTree keyed on (nameid); each of its leaves points to a ParentTree keyed on (pnodeid, value) and a ValueTree keyed on (value).]

    Figure 4.8. Structure of the NB+Tree.

    Together, the three kinds of tree (NameTree, ParentTree and ValueTree) order all the attributes and elements of the XML document in two ways, by (nameid, value) and by (nameid, pnodeid, value), which supports both fast querying and query assurance over the XML document.

To meet the requirement above, the tree structures can be combined as follows. Build a B+ tree whose comparison key is nameid, called the NameTree. Each leaf node of the NameTree holds the roots of two further B+ trees, keyed on (pnodeid, value) and (value) respectively. These two trees are called the ParentTree and the ValueTree. Together, the three kinds of tree form one data structure, called the Nested B+Tree (NB+Tree), which indexes the XML document on the two keys (nameid, pnodeid, value) and (nameid, value).
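The nesting can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the thesis implementation: a dict keyed by nameid stands in for the NameTree, sorted lists maintained with `bisect` stand in for the two nested B+ trees, and the names `NestedIndex`, `by_value` and `children_of` as well as the sample node ids are hypothetical.

```python
import bisect
from collections import defaultdict
from itertools import takewhile


class NestedIndex:
    """Toy stand-in for the NB+Tree (illustrative, not the thesis code).

    A dict keyed by nameid plays the role of the NameTree; for every nameid,
    two sorted lists play the roles of the ValueTree and the ParentTree.
    """

    def __init__(self):
        self.value_tree = defaultdict(list)   # nameid -> sorted [(value, nodeid)]
        self.parent_tree = defaultdict(list)  # nameid -> sorted [(pnodeid, value, nodeid)]

    def insert(self, nameid, pnodeid, value, nodeid):
        # Keep both orderings sorted on every insert.
        bisect.insort(self.value_tree[nameid], (value, nodeid))
        bisect.insort(self.parent_tree[nameid], (pnodeid, value, nodeid))

    def by_value(self, nameid, value):
        """Node ids matching (nameid, value), in index order."""
        entries = self.value_tree.get(nameid, [])
        start = bisect.bisect_left(entries, (value,))
        return [e[1] for e in takewhile(lambda e: e[0] == value, entries[start:])]

    def children_of(self, nameid, pnodeid):
        """Node ids with the given nameid whose parent is pnodeid."""
        entries = self.parent_tree.get(nameid, [])
        start = bisect.bisect_left(entries, (pnodeid,))
        return [e[2] for e in takewhile(lambda e: e[0] == pnodeid, entries[start:])]


# Hypothetical sample data: nameid 8 = Item, nameid 13 = name attribute of Item.
idx = NestedIndex()
idx.insert(8, 50, "", 101)        # Item 101 under Order 50
idx.insert(8, 50, "", 102)        # Item 102 under Order 50
idx.insert(8, 51, "", 103)        # Item 103 under Order 51
idx.insert(13, 101, "TV", 201)    # name="TV" on Item 101
idx.insert(13, 102, "Radio", 202)
idx.insert(13, 103, "TV", 203)
tv_nodes = idx.by_value(13, "TV")
order_50_items = idx.children_of(8, 50)
```

Both lookups return consecutive runs of one sorted sequence, which is the property the completeness proof depends on.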

In addition, the distribution of the data has a considerable effect on the structure of the tree. If the data is concentrated in one region, node splits can occur frequently, which lowers the space utilization of the B+ tree. To limit this, the B+ tree introduces a redistribution operation on the data of the original nodes. In the NB+ tree, besides this property, keys are also redistributed before a node is split or merged.

Another approach is to index the XML document on the four attributes (nameid, prefixid, pnodeid, value) with a multi-dimensional index such as the R-tree family (R-Tree, R+Tree, R*Tree, X-Tree, ...). However, these index structures do not guarantee an ascending order of the records (or guarantee it for one attribute only), so they are not suitable for proving the completeness of query results.

    Nested Merkle B+ Tree

For the NB+Tree to be able to prove query results, information similar to that of a Merkle Hash Tree (MHT) has to be embedded in it. Let H(node) be the hash of an arbitrary node of the NB+Tree. H(node) is computed as follows:

- The node is a leaf node of a ValueTree or ParentTree: $H(node) = h(A(node))$

- The node is a leaf node of the NameTree: $H(node) = h(H(R(ParentTree)) \,\|\, H(R(ValueTree)))$

- The node is an internal node: $H(node) = h(H(Child_1(node)) \,\|\, \dots \,\|\, H(Child_k(node)))$, where k is the number of children of the node.

- The node is the root of the NameTree: $H(node) = h(timestamp \,\|\, H(Child_1(node)) \,\|\, \dots \,\|\, H(Child_k(node)))$, where k is the number of children of the node.

Here h is a secure hash function that is collision-free and cannot feasibly be inverted, such as SHA-1 or MD5. A(node) is the t-node or a-node linked to the leaf node. R(tree) is the root node of tree. $Child_i(node)$ is the i-th child of node. The timestamp is a value that is unique in time and is distributed to all clients whenever it changes. Finally, the content of the root node R(NameTree) is signed with a digital signature algorithm.

Applying the idea of the MHT to the NB+Tree thus yields the NMB+Tree (Nested Merkle B+ Tree), which serves both as the query index and as the means of proving query results over the whole XML data.
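The four hash rules can be written down directly. The following Python sketch is a minimal illustration, not the thesis implementation: it assumes SHA-1 for h, plain byte concatenation for the concatenation operator, and it assumes that the second nested tree under a NameTree leaf is the ValueTree. The function names and the layout of the toy tree in `tiny_tree_root` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function h; SHA-1 is one of the functions named in the text."""
    return hashlib.sha1(data).digest()


def leaf_hash(record: bytes) -> bytes:
    # Leaf of a ValueTree or ParentTree: H(node) = h(A(node)).
    return h(record)


def internal_hash(child_hashes) -> bytes:
    # Internal node: hash of the concatenated child hashes.
    return h(b"".join(child_hashes))


def name_leaf_hash(parent_tree_root: bytes, value_tree_root: bytes) -> bytes:
    # Leaf of the NameTree: covers the roots of its two nested trees
    # (ASSUMPTION: ParentTree root first, ValueTree root second).
    return h(parent_tree_root + value_tree_root)


def name_root_hash(timestamp: bytes, child_hashes) -> bytes:
    # Root of the NameTree: the timestamp is hashed in front of the children.
    return h(timestamp + b"".join(child_hashes))


def tiny_tree_root(records, timestamp: bytes) -> bytes:
    """Root hash of a toy tree with one nameid, where each nested tree has a
    single internal node above its record leaves (hypothetical layout)."""
    leaves = [leaf_hash(r) for r in records]
    value_root = internal_hash(leaves)
    parent_root = internal_hash(leaves)
    return name_root_hash(timestamp, [name_leaf_hash(parent_root, value_root)])


root_a = tiny_tree_root([b"name=TV", b"name=Radio"], b"ts-1")
root_tampered = tiny_tree_root([b"name=TV", b"name=Video"], b"ts-1")
root_stale = tiny_tree_root([b"name=TV", b"name=Radio"], b"ts-0")
```

Changing any record or the timestamp changes the root hash, so a single signature over the root covers both the data and its freshness.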

The next part of this document shows how the NMB+Tree is used to execute queries and to prove the returned results.

4.3 Selection operations

Selection is the most frequently used kind of query. In an RDB it is the SELECT statement of SQL. For XML, several query languages exist, such as XPath and XQuery. This document illustrates a few common forms of XPath query.

Consider an XML document with the following structure:

Figure 4.9. Structure tree of an XML document

The small indices next to the nodes are the nameid of each node. The document examines several common query forms.

4.3.1 Queries on a single node with a single condition

This is the simplest form of query, for example /Customer/Order/Item[@name="TV"], which finds all items named TV. By the naming rule, the name identifier of the name attribute of the Item node is 13. To answer the query, the NMB+Tree is searched with the key (nameid=13, value="TV"). The result is the set of a-nodes Name_13 of Item_8. From these a-nodes the corresponding t-nodes Item_8 are easily identified, and the other a-nodes (such as price_14 and qty_15) can then be retrieved.

Correctness and completeness

Besides the Name a-nodes that satisfy the condition, the server returns the two boundary a-nodes of the result set, the hashes of the neighbouring nodes needed to compute the hash of the parent node, and the hashes of the parent's sibling nodes. With these values the client can work back up to the hash of the root node.

At the root, the client combines the computed hash with the stored timestamp and checks the result against the signature of the root. Because the hash functions used are one-way and non-invertible, this proves correctness.

Because the records are sorted in ascending order of (nameid, value), the two accompanying boundary records make it possible to prove the completeness of the data.
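The client-side recomputation of the root hash can be sketched as follows. This is a simplified illustration for a single record, not the thesis implementation: the layout of `co_path` (a list of left and right sibling hashes per level, with the root level last), the function name `recompute_root` and the sample values are assumptions, and checking the signature over the resulting root hash is left out.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def recompute_root(record: bytes, co_path, timestamp: bytes) -> bytes:
    """Rebuild the root hash from one record and its co-path.

    co_path lists, from the leaf level upwards, the sibling hashes to the
    left and to the right of the node on the path: [(left, right), ...].
    It needs at least one entry; the last entry belongs to the root, whose
    hash also covers the timestamp.
    """
    current = h(record)
    *inner_levels, root_level = co_path
    for left, right in inner_levels:
        current = h(b"".join(left) + current + b"".join(right))
    left, right = root_level
    return h(timestamp + b"".join(left) + current + b"".join(right))


# Hypothetical two-level tree: three records under one internal node,
# plus the hash of a second subtree the client never sees in full.
timestamp = b"2007-02-03"
inner0 = h(h(b"r0") + h(b"r1") + h(b"r2"))
inner1 = h(b"other subtree")
signed_root = h(timestamp + inner0 + inner1)

co_path = [([h(b"r0")], [h(b"r2")]), ([], [inner1])]
honest = recompute_root(b"r1", co_path, timestamp) == signed_root
forged = recompute_root(b"r1-forged", co_path, timestamp) == signed_root
```

Only hashes are exchanged for the parts of the tree outside the result, so the client learns nothing beyond the records it asked for.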

Freshness

The hash of the root node includes the timestamp value. If the data owner publishes this value to all clients, a client can determine the freshness of the data.

Thus, with a single verification, the client checks all three aspects of query assurance using only hash operations, which cost far less than the big-integer arithmetic of digital signatures.

Proving that no record matches (empty proof)

Another aspect to consider is proving that the result is empty, meaning that no record satisfies the search condition. As proof, the server returns two records that are adjacent in the ordered sequence, one with a value smaller and one with a value larger than the queried value.
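Both the completeness argument and the empty proof reduce to an ordering check on the returned keys. The sketch below is a hedged illustration: the function name `brackets_range` and the sample keys are hypothetical, it assumes a boundary key is present at both ends, and it assumes the hashes have already been verified, since only the hash check guarantees that the returned keys really are adjacent in the index.

```python
def brackets_range(keys, low, high):
    """Ordering check on the keys of a VO.

    keys: index keys returned by the server in index order, with one boundary
    key at each end. Returns True if the keys are sorted, the boundary keys
    lie strictly outside [low, high] and every other key lies inside it.
    """
    if len(keys) < 2:
        return False
    if any(a > b for a, b in zip(keys, keys[1:])):
        return False
    left, *inside, right = keys
    return left < low and high < right and all(low <= k <= high for k in inside)


target = (13, "TV")  # (nameid, value) of the point query

# Two matches, bracketed by the neighbouring keys on either side.
with_matches = [(13, "Radio"), (13, "TV"), (13, "TV"), (13, "VCR")]
# Empty proof: two adjacent keys, one below and one above the target.
empty_proof = [(13, "Radio"), (13, "VCR")]
# A server that drops the left boundary key cannot pass the check.
no_left_boundary = [(13, "TV"), (13, "TV"), (13, "VCR")]

ok_matches = brackets_range(with_matches, target, target)
ok_empty = brackets_range(empty_proof, target, target)
ok_truncated = brackets_range(no_left_boundary, target, target)
```

A range at the very start or end of the index has only one boundary record; handling that case is omitted here.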

4.3.2 Queries on multiple nodes with a single condition

This is a somewhat more complex form of query that involves several nodes. It corresponds to a join query in an RDB.

Consider the query /Customer[@name="Marry"]/Order/Item, which returns all items bought by the customer named Marry. As above, a VO is built to prove the query (nameid=4, prefixid=1, value="Marry"). This VO, however, only needs to contain the a-nodes Name_4 and the corresponding t-nodes Customer_1. The a-nodes Name_4 are used to prove that the set of t-nodes Customer_1 is complete.

For each t-node Customer_1, a VO is built to guarantee query assurance for the query (nameid=3, prefixid=1, pnodeid=), which retrieves the t-nodes Order_3.

The results for the returned Item nodes can be guaranteed in the same way.

The same applies to the query /Customer[@name="Marry"]/Order/Item[@name="TV"], which retrieves the items named TV bought by the customer named Marry. Its processing combines that of the query above with that of the first case.

4.3.3 Other forms

The two examples above show how the NMB+ tree is used to execute queries and to prove query results. More complex queries can be decomposed into simple queries, from which an execution plan made of simple steps is built. For example, the execution plans for the queries above are as follows.

/Customer/Order/Item[@name="TV"]

    STEP#1
        IndexMethod  : Vtree, nameID=13
        Condition    : equal to [TV]
        Result level : not included
        Retrieval    : node only
        StepValue    : PNODEID
    [For each matched items, perform]
    STEP#2
        IndexMethod  : DirectIDAccess, id=ParentStepValue
        Result level : 1
        Retrieval    : node and all its attributes

/Customer[@name="Marry"]/Order/Item

    STEP#1
        IndexMethod  : Vtree, nameID=4
        Condition    : equal to [Marry]
        Result level : not included
        Retrieval    : node only
        StepValue    : PNODEID
    [For each matched items, perform]
    STEP#2
        IndexMethod  : DirectIDAccess, value=ParentStepValue
        Result level : not included
        Retrieval    : node only
        StepValue    : ID
    [For each matched items, perform]
    STEP#3
        IndexMethod  : Ptree, nameID=3, pid=ParentStepValue
        Result level : not included
        Retrieval    : node only
        StepValue    : ID
    [For each matched items, perform]
    STEP#4
        IndexMethod  : Ptree, nameID=8, pid=ParentStepValue
        Result level : 1
        Retrieval    : node and all its attributes


4.3.4 Merging VOs

When executing a query, the server in most cases has to return more than one VO to the client. For the query in Section 4.3.2, for example, the number of VOs depends on the number of Customer records that satisfy the condition @name="Marry". Checking all of these VOs can overload the client.

To limit this, the server merges the VOs into a single VO and returns it to the client. The client then needs only one verification to establish query assurance for all the data it receives.

Merging VOs is straightforward. Each original VO is determined by a run of consecutive records together with (at most) two boundary records. Several VOs give several runs. So instead of generating a co-path for each run, the co-path can be generated once for all runs together.
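The saving from a shared co-path can be illustrated on a plain binary Merkle tree. This is a deliberate simplification and not the NMB+ tree itself: the tree is binary, the number of leaves is assumed to be a power of two, and the function names `build_levels`, `co_path` and `verify` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def build_levels(leaf_hashes):
    """levels[0] holds the leaf hashes, levels[-1] holds only the root.
    ASSUMPTION: the number of leaves is a power of two."""
    levels = [list(leaf_hashes)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def co_path(levels, leaf_indices):
    """Sibling hashes needed to rebuild the root from the given leaves."""
    proof, known = {}, set(leaf_indices)
    for depth in range(len(levels) - 1):
        for i in known:
            if i ^ 1 not in known:           # sibling cannot be derived
                proof[(depth, i ^ 1)] = levels[depth][i ^ 1]
        known = {i // 2 for i in known}      # move one level up
    return proof


def verify(root, leaves, proof, height):
    """leaves: {leaf index: hash}; height: number of levels including the root."""
    current = dict(leaves)
    for depth in range(height - 1):
        parents = {}
        for i, value in current.items():
            sibling = current.get(i ^ 1, proof.get((depth, i ^ 1)))
            if sibling is None:
                return False
            pair = value + sibling if i % 2 == 0 else sibling + value
            parents[i // 2] = h(pair)
        current = parents
    return current.get(0) == root


leaf_hashes = [h(bytes([i])) for i in range(8)]
levels = build_levels(leaf_hashes)
root = levels[-1][0]

# Two runs of records, leaves 1-2 and leaves 5-6.
separate_size = len(co_path(levels, [1, 2])) + len(co_path(levels, [5, 6]))
merged = co_path(levels, [1, 2, 5, 6])
merged_ok = verify(root, {i: leaf_hashes[i] for i in (1, 2, 5, 6)}, merged, len(levels))
```

In this toy case the two separate co-paths hold six hashes in total, while the merged co-path holds four, because the upper levels can be derived from the runs themselves.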

4.4 Data update operations

Updating outsourced data incurs some additional cost. If the data owner keeps a copy of the database, updates are infrequent (the time between two updates is long) and each update involves many changes, the data owner can update the data locally and then outsource it again.

Here the document presents a method for updating the data directly. It is intended for the case where the data owner keeps no copy of the outsourced data, or where the cost of each outsourcing run is high.

4.4.1 Insertion

The data owner sends the data to be added (XML elements and attributes) to the server. The server inserts the new records into the database and updates the index structure. It recomputes the hashes of the affected leaf nodes and propagates the changes up to the root node of the NMB+ tree. The server returns the new hash of the root node to the data owner. The data owner generates a new timestamp, combines it with the received hash and signs the result. The data owner then sends the new timestamp and the new signature back to the server, which stores them in the database.

This multi-phase update protocol is illustrated in the following figure.

Figure 4.10. Steps of an insert operation
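The message flow of the insert protocol can be sketched as below. This is an illustration only, with two loudly stated assumptions: a flat hash over the sorted records stands in for the root hash of the NMB+ tree, and HMAC-SHA256 stands in for the digital signature (a real deployment would use an asymmetric scheme such as RSA so that clients can verify with the public key). The class and function names are hypothetical.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


class Server:
    """Holds the outsourced records plus the latest timestamp and signature."""

    def __init__(self):
        self.records, self.timestamp, self.signature = [], b"", b""

    def root_hash(self) -> bytes:
        # ASSUMPTION: a flat hash over the sorted record hashes stands in
        # for the root hash of the NMB+ tree.
        return h(b"".join(h(r) for r in sorted(self.records)))

    def insert(self, record: bytes) -> bytes:
        self.records.append(record)   # store the record, update the index
        return self.root_hash()       # return the new root hash to the owner

    def store(self, timestamp: bytes, signature: bytes) -> None:
        self.timestamp, self.signature = timestamp, signature


class DataOwner:
    def __init__(self, key: bytes):
        self._key = key

    def sign(self, root_hash: bytes, timestamp: bytes) -> bytes:
        # ASSUMPTION: HMAC-SHA256 is only a stand-in for a digital signature.
        return hmac.new(self._key, timestamp + root_hash, hashlib.sha256).digest()


def insert_record(owner: DataOwner, server: Server, record: bytes, timestamp: bytes):
    new_root = server.insert(record)              # owner -> server -> owner
    signature = owner.sign(new_root, timestamp)   # owner signs hash + timestamp
    server.store(timestamp, signature)            # owner -> server


owner, server = DataOwner(b"demo key"), Server()
insert_record(owner, server, b"<Item name='TV'/>", b"ts-1")
first_signature = server.signature
insert_record(owner, server, b"<Item name='Radio'/>", b"ts-2")
```

The data owner never needs a copy of the data: it only receives one hash per update and returns one timestamp and one signature.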

4.4.2 Deletion and update

Deleting or updating data goes through steps similar to those of insertion. The data owner sends the request to the server. The server applies the change to the database and recomputes the hashes of the nodes of the NMB+ tree. The new hash of the root node is returned to the data owner for signing. The signature and the new timestamp are then sent back to the server, which stores them in the database.


Chapter 5

ANALYSIS

We have presented a storage scheme and the use of the NMB+ index tree to prove query results on outsourced XML data. This chapter discusses the security aspects and the cost of the solution.

All data, including the nodes of the index tree, is encrypted with a symmetric encryption algorithm (such as Rijndael) before it is stored, to ensure confidentiality while preserving query performance. The method therefore satisfies the data confidentiality requirement.

The solution does not address privacy. However, because the data is tree-structured, the results of Lin and Candan [2] can be applied to achieve privacy.

For query assurance, the document has presented a method for range and point queries with a single condition. Range queries with several combined conditions can be handled in the same way as set operations, namely intersection and union, as presented in [10, 14, 21]:

- Union: executed simply as two separate queries. The VOs generated by the two queries are merged as described earlier.

- Intersection: a simple solution is as follows. Given the result sets of the two sub-queries, one per condition, the server picks the smaller result set and, for each of its results that is not in the other set, returns an empty proof. All resulting VOs are merged so that the client verifies them once.

One point that needs attention in query assurance is proving joins. In tree-structured data such as XML, joins mainly follow the relationship from parent node to child node, which is the prevalent relationship in XML databases. The NMB+ tree uses the PTree (ParentTree) to prove such joins at minimal cost, both for building the VO and for verification on the client side.

The solution does not apply directly to queries that use aggregate functions. To our knowledge, only the solution of Radu Sion [8] handles this case so far, and it has the limitations presented earlier. Such queries can be proved in the same way as range queries, where the condition is the filter condition of the original query. The aggregate result can be computed at the server and randomly re-checked by the client once query assurance has been established for the records that satisfy the condition, or it can be computed entirely at the client.

Next, the document analyses the cost of the solution. Since no study on query assurance for XML databases is currently available, the cost analysis is based on the theoretical maximum and minimum costs. We also consider only the costs incurred to ensure query assurance.

First, the notation used is as follows:

n        Total number of items (elements and attributes).

s        Number of items returned by a query.

f        Fanout parameter of the NMB+ tree.

h        Height of the tree.

L        Number of leaf nodes of the tree.

N        Total number of nodes of the tree.

|sign|   Size of a hash value (20 bytes for SHA-1).


Storage cost at the server: the storage cost incurred at the server is the cost of storing the NMB+ tree. The NMB+ tree combines one NameTree with the ValueTrees and ParentTrees at the leaves of the NameTree. The size of the NameTree depends on the number of nodes in the structure tree (schema tree) of the XML document. This number is usually very small compared with the number of nodes in the XML data tree, and it changes little over the lifetime of the data. The storage cost is therefore dominated by the ParentTrees and ValueTrees.

The number of ValueTrees and ParentTrees varies with the number of elements in the schema tree, and the height of each tree depends on the number of data items belonging to the corresponding structure element. To simplify the evaluation, we assume that the schema tree of the XML document has exactly one element, so that there is only one ValueTree and one ParentTree.

Clearly, the leaves of the ValueTree and of the ParentTree hold the same total number of data items. Assume that the data items are all distinct on (nameid, value), that is, every slot of a leaf node links to exactly one data item. Then:

$$L^{VTree}_{min} = \left\lceil \frac{n}{f} \right\rceil, \qquad L^{VTree}_{max} = \left\lceil \frac{2n}{f} \right\rceil \tag{5.1}$$

The minimum number of leaf nodes is needed when all leaf nodes are full, and the maximum when every leaf node is only half full. From $L_{min}$ and $L_{max}$, the minimum and maximum height and node count of the tree follow:

$$h^{VTree}_{min} = \log_{f} L^{VTree}_{min} + 1 \tag{5.2}$$

$$h^{VTree}_{max} = \log_{f/2} L^{VTree}_{max} + 1 \tag{5.3}$$

$$N^{VTree}_{min} = L_{min}\,\frac{1 - \dfrac{1}{f^{\,h_{min}-1}}}{1 - \dfrac{1}{f}} + 1 \tag{5.4}$$

$$N^{VTree}_{max} = L_{max}\,\frac{1 - \left(\dfrac{2}{f}\right)^{h_{max}-1}}{1 - \dfrac{2}{f}} + 1 \tag{5.5}$$

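Formulas (5.2), (5.3), (5.4) and (5.5) can be derived as follows. Suppose the B+ tree has L leaf nodes. Then:

Depth        Minimum number of nodes at depth       Maximum number of nodes at depth
h            $L_{min}$                              $L_{max}$
h-1          $L_{min}/f$                            $2L_{max}/f$
h-2          $L_{min}/f^{2}$                        $2^{2}L_{max}/f^{2}$
...
h-(h-1)      $L_{min}/f^{h-1} = 1$                  $2^{h-1}L_{max}/f^{h-1} = 1$

so that

$$h^{VTree}_{min} = \log_{f} L^{VTree}_{min} + 1, \qquad h^{VTree}_{max} = \log_{f/2} L^{VTree}_{max} + 1$$

$$N_{min} = L_{min} + \frac{L_{min}}{f} + \frac{L_{min}}{f^{2}} + \dots + 1 = L_{min}\,\frac{1 - \dfrac{1}{f^{\,h_{min}-1}}}{1 - \dfrac{1}{f}} + 1$$

$$N_{max} = L_{max} + \frac{2L_{max}}{f} + \frac{2^{2}L_{max}}{f^{2}} + \dots + 1 = L_{max}\,\frac{1 - \left(\dfrac{2}{f}\right)^{h_{max}-1}}{1 - \dfrac{2}{f}} + 1$$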
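Formulas (5.1) to (5.5) can be evaluated with a short Python sketch. It is a reading of the formulas under stated assumptions rather than the code used in the thesis: heights and node counts are rounded up to integers, the fanout is assumed to be even and at least 4 so that f/2 is a usable branching factor, and the function names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def height_for(leaves: int, branching: int) -> int:
    """Smallest integer h with branching**(h - 1) >= leaves,
    i.e. log_branching(leaves) + 1 rounded up (ASSUMPTION: round up)."""
    height, capacity = 1, 1
    while capacity < leaves:
        capacity *= branching
        height += 1
    return height


def node_count(leaves: int, ratio: Fraction, height: int) -> int:
    """L * (1 - ratio**(h - 1)) / (1 - ratio) + 1, rounded up."""
    return ceil(leaves * (1 - ratio ** (height - 1)) / (1 - ratio) + 1)


def tree_bounds(n: int, f: int):
    """(leaves, height, nodes) for the fullest and for the sparsest tree."""
    if f < 4 or f % 2:
        raise ValueError("this sketch assumes an even fanout of at least 4")
    l_min, l_max = ceil(Fraction(n, f)), ceil(Fraction(2 * n, f))   # (5.1)
    h_min = height_for(l_min, f)                                    # (5.2)
    h_max = height_for(l_max, f // 2)                               # (5.3)
    return {
        "min": (l_min, h_min, node_count(l_min, Fraction(1, f), h_min)),  # (5.4)
        "max": (l_max, h_max, node_count(l_max, Fraction(2, f), h_max)),  # (5.5)
    }


bounds = tree_bounds(5415, 10)
```

With n = 5,415 distinct items and f = 10, the fullest tree comes out at 542 leaves, height 4 and 603 nodes.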


The storage cost incurred can thus be computed as:

$$C^{storage}_{min} = N^{NTree} + N^{VTree}_{min} + N^{PTree}_{min} = N^{NTree} + 2N^{VTree}_{min}$$

$$C^{storage}_{max} = N^{NTree} + N^{VTree}_{max} + N^{PTree}_{max} = N^{NTree} + 2N^{VTree}_{max} \tag{5.6}$$

In practice, many items share the same comparison key, that is, the same (nameid, value) in the ValueTree or the same (nameid, pnodeid, value) in the ParentTree. Such items occupy the same slot in a leaf node of the index tree. The total number of slots used in the leaf nodes is therefore always smaller than the total number of items to be indexed.

Let n' and n'' be the number of items that are distinct on (nameid, value) and on (nameid, pnodeid, value) respectively. Clearly n' < n'' < n. To compute the storage cost, the values n' and n'' have to be determined and then substituted into formulas (5.4) and (5.5) to obtain the cost of the ValueTree and of the ParentTree.

VO size: in this analysis we consider only the data that has to be added to the VO to make the query provable, so VO size here means the size of this additional information. Besides the query result, the server returns the two boundary key values (value for the ValueTree; pnodeid and value for the ParentTree) together with the hashes of the records belonging to these two keys, plus the co-path used to compute the hash of the root node. As presented above, an XPath query can be decomposed into several range queries, each with its own VO. The size of the returned VO is the sum of the sizes of these sub-VOs.

Consider the VO size for a single query:

Depth    Number of nodes at depth                     Number of items added to the co-path
h        $L_h = \lceil (s+2)/f \rceil$                $C_{VO_h} = f \cdot L_h - s + 2$
h-1      $L_{h-1} = L_h/f$                            $C_{VO_{h-1}} = f \cdot L_{h-1} - L_h$
h-i      $L_{h-i} = L_{h-i+1}/f$                      $C_{VO_{h-i}} = f \cdot L_{h-i} - L_{h-i+1}$

VO size:

$$C_{VO} = |sign| \cdot \sum_{i=1}^{h_{NMBTree}} C_{VO_i} \tag{5.7}$$
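The table and formula (5.7) translate into a short loop. The following Python sketch is a hedged reading of the formulas, not the thesis code: the per-level node counts are rounded up to whole nodes, which is an assumption, and the function name `vo_size_bytes` and the sample parameters are hypothetical.

```python
def vo_size_bytes(s: int, f: int, height: int, hash_bytes: int = 20) -> int:
    """Size of the extra VO data per formula (5.7).

    s: number of returned items, f: fanout, height: tree height h,
    hash_bytes: |sign|, 20 bytes for SHA-1.
    ASSUMPTION: node counts per level are rounded up to whole nodes.
    """
    nodes = -(-(s + 2) // f)            # L_h: leaf nodes touched by the result
    extra = f * nodes - s + 2           # C_VO at the leaf level
    for _ in range(height - 1):         # levels h-1 down to 1
        upper = -(-nodes // f)          # L_{h-i} = L_{h-i+1} / f, rounded up
        extra += f * upper - nodes      # sibling hashes added at this level
        nodes = upper
    return hash_bytes * extra


small_result = vo_size_bytes(s=8, f=10, height=3)
larger_result = vo_size_bytes(s=98, f=10, height=3)
```

For a result of 8 items in a tree of height 3 with fanout 10, the sketch counts 4 + 9 + 9 = 22 extra hashes, or 440 bytes with SHA-1.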

CPU cost: the CPU cost (CPU time) is the total processing time from the moment the server receives the query until the query result has been fully decoded on the client side, excluding the time spent transmitting data over the network.

The processing of a query can be divided into the following phases:

[Figure: the processing steps, grouped into a server side and a client side.]

Figure 5.11. Query execution steps.

- Parse: analyse the components of the query and remove wildcard characters where necessary. One original query may be decomposed into several sub-queries.

- Plan: build the execution strategy for the queries just parsed.

- Fetch data: run the query on the index tree and read the matching records from the database.

- Build VO: add the information needed to build the VOs that prove the query, including the time to generate the co-paths for the query result.

- Verify VO: check the query result against the VO information received.

- Generate XML: restore the data to XML form. Because the original XML data was converted into t-nodes and a-nodes when stored in the database, the returned result has to be converted back to the original XML form.

At present, owing to time constraints, the evaluation program does not implement the first two steps, Parse and Plan. The queries to be executed are supplied as execution plans, so processing starts at the fetch data step.


Chapter 6

EXPERIMENTS

For the experimental evaluation, we implemented the solution on the .NET Framework 2.0, using the Rijndael and SHA1 implementations of the .NET 2.0 cryptography library. MS SQL Server 2005 Express Edition is used to store the database.

The program was tested on a P4 2.8 GHz PC with 512 MB of memory. The sample XML document is Mondial [19], with 69,846 items (22,423 elements and 47,423 attributes). The schema of this document is given in the appendix. The fanout of the NMB+ tree is 10.

The following criteria are measured:

- Storage cost (the actual total number of nodes of the NMB+ tree, compared with the theoretical number of nodes derived in Chapter 5).

- VO size (the number of items returned alongside the result to prove the query).

- Query execution time, covering the phases fetch data, build VO, verify VO and generate XML, as well as the total time from the start of query execution until the resulting XML data has been fully decoded. These times are compared against the number of records returned and the total number of records in the database, to show how execution time correlates with these parameters (the requirement is a linear relationship with the result size and a logarithmic relationship with the database size).

In addition, the execution time is compared with that of the same query without security, to measure the overhead that the solution introduces.


The measurements are based on the query /mondial/country/city[population > 500000]. The execution plan for this query is given in the appendix.

Storage cost

Here we ignore the storage cost of the XML data itself and focus on the additional cost of storing the index tree structure. This cost is assessed through the number of tree nodes needed to index the required number of items.

The cost is estimated from formulas (5.4) and (5.5), which give the minimum and maximum sizes. From the database size (number of items), we determine the number of items that are distinct on (nameid, value) and on (nameid, pnodeid, value), and hence the number of slots to be stored in the leaf nodes of the index tree according to (5.4) and (5.5). The results are shown in the table below.

Database size          10K     20K     30K     40K     50K     60K     70K

ValueTree
  Distinct items     5,415  10,743  16,128  20,907  22,279  22,499  24,913
  Min leaves           542   1,075   1,613   2,091   2,228   2,250   2,492
  Min height             4       5       5       5       5       5       5
  Min nodes            603   1,196   1,794   2,325   2,477   2,501   2,770
  Max leaves         1,083   2,149   3,226   4,182   4,456   4,500   4,983
  Max height             5       6       6       6       6       6       6
  Max nodes            679   1,345   2,018   2,615   2,786   2,814   3,116

ParentTree
  Distinct items     9,126  18,108  27,181  36,376  43,612  50,379  57,497
  Min leaves           913   1,811   2,719   3,638   4,362   5,038   5,750
  Min height             4       5       5       5       5       5       5
  Min nodes          1,015   2,014   3,022   4,043   4,848   5,599   6,390
  Max leaves         1,826   3,622   5,437   7,276   8,723  10,076  11,500
  Max height             6       6       6       7       7       7       7
  Max nodes          1,143   2,265   3,400   4,549   5,454   6,299   7,189

Table 6.2. Minimum and maximum size of the ValueTree and ParentTree

Table 6.3. Storage cost

    Data size 10K 20K 30K 40K 50K 60K 70K Min nodes 1,618 3,210 4,816 6,368 7,325 8,100 9,160Ma