ACKNOWLEDGEMENTS

First, we would like to thank the lecturers of the Faculty of Information Technology, University of Natural Sciences, who devotedly taught and guided us throughout our four years at university.

We thank Ms. Phạm Thị Bạch Huệ, who devotedly supervised, helped, and encouraged us in completing this thesis.

Finally, we thank our parents and relatives, who encouraged and supported us during our studies and research so that we could achieve the results we have today.

July 2005

Students: Phạm Thị Mỹ Phương and Tạ Thị Ngọc Thanh

Semantic Search Applied to the eDoc Domain



    SUPERVISOR'S COMMENTS

    ....

    Date: ........ 2005

    Signature

  • Thesis: Semantic Search Applied to the eDoc Domain

    0112274 Phạm Thị Mỹ Phương, 0112398 Tạ Thị Ngọc Thanh

    REVIEWER'S COMMENTS

    ....

    Date: ........ 2005

    Signature


    TABLE OF CONTENTS

    INTRODUCTION
    Chapter 1: OVERVIEW
        1.1. Problem statement
        1.2. The problem to be solved
        1.3. Approach
    Chapter 2: THEORETICAL BACKGROUND
        2.1. Information search strategies of search engines
            2.1.1. Some common search engines
            2.1.2. Search strategies
            Operating principle
        2.2. Semantic Web
            2.2.1. Concept
            2.2.2. Architecture
            2.2.3. Challenges facing the Semantic Web
            2.2.4. Comparison of the Web and the Semantic Web
            2.2.5. Related concepts
            2.2.6. Ontology
            2.2.7. RDF
        2.3. eDoc
            2.3.1. About eLearning
            2.3.2. About eLib
            2.3.3. About eDoc
        2.4. Some issues in natural language processing
            2.4.1. Issues in text processing
            2.4.2. Issues in semantic processing
            2.4.3. Text classification
    Chapter 3: MODEL AND ALGORITHMS
        3.1. Semantic search technology in the world today
        3.2. Steps for building a semantic search engine application
            3.3.1. Building the Semantic Web architecture
            3.3.2. Latent semantic indexing
        3.3. Proposed model for the semantic search application in the eDoc domain
        3.4. Algorithms used
            3.4.1. Document processing algorithm
            3.4.2. Metadata extraction algorithm
            3.4.3. Algorithm for classifying documents by domain
            3.4.4. Query processing algorithm
    Chapter 4: APPLICATION PROGRAM
        4.1. Introduction to the application
        4.2. Architecture of the application
        4.3. Description of the application scope
            4.3.1. Problem description
            4.3.2. Requirements definition
        4.4. Building the application
            4.4.1. Data design
            4.4.2. Processing design
        4.5. Program results
        4.6. Program experiments
    Chapter 5: CONCLUSION
        5.1. Evaluation of the research results
            5.1.1. Advantages
            5.1.2. Shortcomings
        5.2. Future development
    REFERENCES
        I. Theses and dissertations
        II. Books, eBooks
        III. Websites
    APPENDIX
        1. RDF syntax
        2. RDF Gateway
            2.1. Architecture of RDF Gateway
            2.2. Features
        3. Semantic label systems
            3.1. Basic semantic labels for nouns
            3.2. Basic semantic labels for verbs
            3.3. Basic semantic labels for adjectives
            3.4. The LDOCE semantic label system
        4. The WordNet lexical semantic knowledge base
            4.1. Semantic label system for nouns
            4.2. Semantic label system for verbs


    LIST OF TABLES

    Table 1: Quick guide to using some popular search engines
    Table 2: Summary of the features of some common Internet search engines
    Table 3: Classes in RDF
    Table 4: RDF properties
    Table 5: List of senses and constraints of the content words in a sentence
    Table 6: Description of the application's database
    Table 7: Modules of the program
    Table 8: Module eDocSearch
    Table 9: Module eDocSearch
    Table 10: Test queries
    Table 11: Statistics for the computer science domain
    Table 12: Statistics for the art domain
    Table 13: Basic semantic labels for nouns
    Table 14: Basic semantic labels for verbs
    Table 15: Basic semantic labels for adjectives
    Table 16: The LDOCE semantic label system
    Table 17: Classification of nouns in WordNet


    LIST OF FIGURES

    Figure 1: The Google interface
    Figure 2: The Yahoo interface
    Figure 3: The Ask Jeeves interface
    Figure 4: The AllTheWeb interface
    Figure 5: The Teoma interface
    Figure 6: The HotBot interface
    Figure 7: The AltaVista interface
    Figure 8: The Lycos interface
    Figure 9: Layered architecture of the Semantic Web
    Figure 10: A simple ontology
    Figure 11: The RDF data model
    Figure 12: Criteria for evaluating the security of eDoc
    Figure 13: Syntactic relations and semantic constraints
    Figure 14: Decision tree for choosing the appropriate sense
    Figure 15: Basic form of Web search
    Figure 16: Proposed model for the semantic search application in the eDoc domain
    Figure 17: Processing flow of the search engine layer
    Figure 18: Document processing algorithm
    Figure 19: Metadata extraction algorithm
    Figure 20: Relational data diagram of the application
    Figure 21: Main interface of the application
    Figure 22: Search results interface of the application
    Figure 23: Resource management interface
    Figure 24: Architecture of RDF Gateway
    Figure 25: Interface of the RDF Query Analyzer


    LIST OF ABBREVIATIONS

    eDoc        Electronic document

    eLib        Electronic library

    eLearning   Electronic learning

    WWW         World Wide Web

    URI         Uniform Resource Identifier

    URL         Uniform Resource Locator

    HTTP        Hypertext Transfer Protocol

    RDF         Resource Description Framework

    OIL         Ontology Inference Layer

    OWL         Web Ontology Language

    XML         eXtensible Markup Language


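    LIST OF TERMS (English term and the Vietnamese equivalent used in the thesis)

    Class       Lớp

    Property    Thuộc tính

    Metadata    Siêu dữ liệu

    Subject     Chủ đề, chủ ngữ (topic; grammatical subject)

    Title       Tiêu đề

    Namespace   Không gian tên

    Predicate   Vị ngữ

    Triple      Bộ ba (subject, predicate, object)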


    INTRODUCTION

    Today, most search systems on the Internet follow the traditional approach of keyword search. With this approach, when we type in the word we are looking for, the search system displays the documents that contain that keyword. The result is therefore a very long list of documents, many of which may have nothing to do with what we need. Sometimes these systems also fail to return all the necessary documents; that is, they return unneeded documents while missing other important ones entirely.

    The question is: what kind of search system must we build to overcome this situation?

    To solve this problem, we need a search system that fully delivers the information the user wants, which means building a system that searches by meaning (semantics), based on the information the user enters.

    With this in mind, we chose the topic "Semantic Search Applied to the eDoc Domain" (eDoc here means English-language electronic documents). Our goal is to study and build a semantic search tool that can find information accurately and completely and that mitigates, to some extent, the problems of keyword search in current search engines.

    The research subjects related to this topic are eDoc, Semantic Web, RDF, OWL, metadata, and so on.

    Within the scope of this thesis, and because of the short time available, we tested the search program in only a few domains: Computer Science and Art. These two domains seem unrelated, but in practice there are cases that must be told apart. For example, a document on the "art of programming" must be classified under computer science, not art. In short, the application we built searches only within the domains named above. However, it can easily be extended to many other domains.


    Chapter 1: OVERVIEW

    1.1. Problem statement

    The need to search for and grasp information is an indispensable part of everyone's life. As use of the World Wide Web became widespread, the work of search engines became a vital and beneficial part of the Web. Search tools became public utilities for every Internet user, and Google and Yahoo became household names.

    Current search tools are based on one of two forms of Web search technology: human-directed search and automated search.

    A human-directed search tool uses a database of keywords, concepts, and references. Keyword search tools return a list of pages, but this simple method often produces large numbers of irrelevant and spurious results. In its simplest form, a content-based search tool counts how many of the query words (keywords) occur in each page held in its index, and then ranks the pages accordingly. A more sophisticated approach assigns a specific importance to the position of each keyword. For example, keywords that appear in the title tag of a web page count for more than those in the body. Other kinds of human-directed search tools, such as Yahoo, use topic hierarchies to help narrow the search and return more relevant results. These topic hierarchies are created by people. For this reason they are costly in time to create and maintain, and they are not updated as often as automated systems.
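The counting and title weighting described above can be sketched as follows. This is a minimal illustration, not the ranking of any real search engine: the weights `TITLE_WEIGHT` and `BODY_WEIGHT`, the page dictionaries, and the sample pages are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed weights: a keyword in the title counts three times as much as one
# in the body. The thesis gives no concrete values.
TITLE_WEIGHT = 3.0
BODY_WEIGHT = 1.0


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query):
    """Sum the weighted occurrence counts of every query word in the page."""
    title_counts = Counter(tokenize(page["title"]))
    body_counts = Counter(tokenize(page["body"]))
    return sum(
        TITLE_WEIGHT * title_counts[word] + BODY_WEIGHT * body_counts[word]
        for word in tokenize(query)
    )


def rank(pages, query):
    """Return (score, url) pairs for pages with a non-zero score, best first."""
    scored = [(score(page, query), page["url"]) for page in pages]
    return sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])


# Hypothetical index of three pages.
PAGES = [
    {"url": "a", "title": "Semantic search", "body": "An overview of search engines."},
    {"url": "b", "title": "Cooking", "body": "Search for recipes; search by ingredient."},
    {"url": "c", "title": "River banks", "body": "Erosion of a river bank."},
]

print(rank(PAGES, "semantic search"))
```

Page "b" is returned even though it has nothing to do with semantic search, which is the kind of irrelevant hit the paragraph above attributes to pure keyword counting.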

    The keyword approach still has several limitations that reduce the accuracy of search engines. Examples are homonyms (such as bank, a financial institution, and bank, a river bank) and words that take different forms because of prefixes and suffixes, such as student and students, or small, smaller, and smallest. In addition, search engines do not return documents that contain synonyms of the words in the user's query. Keywords are not enough to represent accurately either the user's need or the content of web pages, and this limitation makes search engines return documents unrelated to what the user is interested in. A set of keywords is the most rudimentary representation of content; it is the logical view of the content that carries the least information. This is the basic reason why today's search engines have a low ratio of useful pages to total pages returned.
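To make the three limitations concrete, the sketch below normalizes word variants with a deliberately crude suffix stripper and expands a query term with a hand-made synonym table. Both the suffix list and the `SYNONYMS` table are assumptions for illustration; a real system would use a proper stemmer and a lexical resource.

```python
# Hypothetical synonym table; a real system would draw on a lexical resource.
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}

# Deliberately crude suffix list, checked in this order.
SUFFIXES = ("est", "er", "s")


def stem(word):
    """Strip one known suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand(term):
    """Return the stems of the term and of its listed synonyms."""
    return {stem(term)} | {stem(s) for s in SYNONYMS.get(term, ())}


def matches(query_term, document):
    """True if the document contains the term, a variant, or a synonym."""
    doc_stems = {stem(word) for word in document.lower().split()}
    return bool(expand(query_term.lower()) & doc_stems)


print(stem("students"), stem("smaller"), stem("smallest"))
print(matches("car", "a used automobile"))
print(matches("bank", "the river bank"), matches("bank", "the bank approved a loan"))
```

Variants and synonyms are handled by this kind of normalization, but the homonym case is not: "bank" matches both documents regardless of sense, which is the gap that meaning-based search is meant to close.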

    Google, with 400 million documents retrieved each day and more than 8 billion indexed web pages, is the most widely used search tool today, yet even Google has many problems. For example, how do you find the small piece of data you need in a sea of irrelevant results?

    As artificial intelligence (AI) technology develops strongly, the question is how to provide better search methods whose results can truly be relied upon. This is the trend toward semantic search tools and semantic search agents. A semantic search tool looks for documents with similar meanings, not just similar words. For the Web to become a semantic network, it must provide rich metadata about its content through RDF (Resource Description Framework) and OWL (Web Ontology Language) tags, which help bring the Web into the semantic network. In a semantic network, the meaning of content is expressed better, and logical links are made between related pieces of information.
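As a rough picture of what such metadata looks like, the sketch below stores a few statements as (subject, predicate, object) triples and answers a question by pattern matching. It uses plain Python tuples rather than an RDF library; the `doc:` identifiers and the sample data are hypothetical, and `dc:title` and `dc:subject` are written as shorthand for the Dublin Core title and subject elements.

```python
# Each statement is a (subject, predicate, object) triple. The doc: identifiers
# and the data are hypothetical examples.
TRIPLES = [
    ("doc:1", "dc:title", "The Art of Computer Programming"),
    ("doc:1", "dc:subject", "Computer Science"),
    ("doc:2", "dc:title", "The Story of Art"),
    ("doc:2", "dc:subject", "Art"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None field of the pattern."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def titles_about(triples, topic):
    """Titles of all documents whose dc:subject equals the given topic."""
    docs = [s for (s, _, _) in match(triples, predicate="dc:subject", obj=topic)]
    return [
        title
        for doc in docs
        for (_, _, title) in match(triples, subject=doc, predicate="dc:title")
    ]


print(titles_about(TRIPLES, "Art"))
print(titles_about(TRIPLES, "Computer Science"))
```

A keyword search for "art" would hit both titles, whereas the subject metadata separates them.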

    The semantic search tool discussed here has two major advantages over traditional search tools:

    1. It accepts queries expressed in natural language.

    2. The result is the piece of information sought, not a list of documents that may (or may not) contain the requested information.

    A semantic search tool thus tackles information overload. It takes over some of the tasks nobody enjoys in today's information search: opening each document in the result list and scanning it manually for the information. In this way, semantic search tools have the potential to revolutionize electronic information search by automating it: they change the search paradigm from document retrieval to question answering.

    1.2. The problem to be solved

    According to statistics from 2001, employees spend an average of 8 hours a week, or 16% of their weekly working hours, searching for and using external information. The salary cost to American companies alone is 107 billion dollars a year. Semantic search is a significant opportunity for companies to make their employees more capable and more efficient at putting external information to work. Little more needs to be said: information overload is a major problem in the information society.

    Similar findings appear in many studies and highlight the need for solutions that improve information search. For all the great benefits search tools have brought us in recent years by making millions of documents accessible regardless of physical location and language, they still have some basic limitations. For example, they do not understand the words people type and therefore return a huge number of wrong results. Moreover, they work well when asked about facts, such as "Kerry" or "king of Spain". However, they perform poorly when the query concerns relationships between concepts, such as "Which countries took part in the Iraq war?" or "Which party does the president of France belong to?"

    Three things need to be improved to obtain better search results:

    (i) The search tool must allow more complex queries (for example, in natural language),

    (ii) The search tool must understand what people ask, and

    (iii) The search tool must provide an answer to the query (possibly backed by links to the documents that yield the answer).


    1.3. Approach

    There are two approaches to improving search results through semantic methods:

    1. The Semantic Web architecture.

    2. Latent Semantic Indexing (LSI).
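For orientation, latent semantic indexing is usually formulated as a truncated singular value decomposition of the term-document matrix. The notation below is the standard textbook one and is an assumption here, since this section of the thesis gives no formulas: A is the matrix of m terms (rows) by n documents (columns), k is the number of retained dimensions, q is the query written as a term vector, and the reduced document vector for document j is the j-th row of V_k.

```latex
A \approx A_k = U_k \Sigma_k V_k^{\top},
\qquad A \in \mathbb{R}^{m \times n},\quad k \ll \min(m, n)

\hat{q} = \Sigma_k^{-1} U_k^{\top} q,
\qquad
\operatorname{sim}(q, d_j) =
  \frac{\hat{q} \cdot \hat{d}_j}{\lVert \hat{q} \rVert \, \lVert \hat{d}_j \rVert}
```

Documents are ranked by the cosine similarity in the reduced space, so a document can score well without sharing any literal keyword with the query.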

    However, most semantic search tools suffer performance problems because the semantic network is very large. For semantic search to be effective at finding the desired results, the network must contain a large amount of relevant information. At the same time, a large network makes it difficult to process the many possible paths to a relevant solution.

    We use cutting-edge Semantic Web technology, tightly integrating a combination of advanced technologies, to make this paradigm shift in information search possible:

    - Natural language processing technology lets users ask the questions they want to ask, rather than having to state the relevant keywords of their question.

    - Ontologies define the domain of interest. They are regarded as the brain of the search tool, because the tool tries to understand the user's queries in terms of the ontology. Note that our semantic search tool is therefore not a general-purpose one like Google; it is intended for a specific domain or area (for example, law, culture, or sports).

    - Knowledge analysis. This technology turns unstructured data into structured information. It extracts information from free text, semi-structured text, and structured text in order to populate the ontology with actual knowledge.

    - Intelligent knowledge access. Answers to queries are obtained automatically by querying the ontology and are presented in several forms:

        o Data about the main entity asked about (for example, in the social domain, the data of an artist).

        o Semantic navigation. The words in the answers are automatically hyperlinked to the corresponding ontology concepts, which allows navigation by meaning.

        o Smart tags and smart links. Answers are always backed by the sources and documents they are based on. When those documents are consulted, the tagging and linking software automatically recognizes words that carry domain meaning and links them to the ontology, or adds smart tags with actions defined in the ontology.

        o Intelligent visualization. Answers usually involve many related concepts and relationships. Intelligent visualization software allows a conceptual walk through this knowledge.
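The role of the ontology in the list above can be illustrated with a toy example that maps query words to concepts and walks up the concept hierarchy. The `ONTOLOGY` dictionary, its concept names, and its terms are all hypothetical; this is a sketch under those assumptions, not the ontology or the matching method used in the thesis.

```python
# concept: (parent concept, surface terms that name it). Entirely hypothetical.
ONTOLOGY = {
    "Domain": (None, set()),
    "ComputerScience": ("Domain", {"computing", "computer science"}),
    "Programming": ("ComputerScience", {"programming", "coding"}),
    "Art": ("Domain", {"art", "fine art"}),
    "Painting": ("Art", {"painting"}),
}


def phrases(query):
    """Return the single words and adjacent word pairs of the query."""
    words = query.lower().split()
    pairs = [" ".join(pair) for pair in zip(words, words[1:])]
    return set(words) | set(pairs)


def concepts_in(query):
    """Return every concept that one of the query's words or pairs names."""
    found = phrases(query)
    return {name for name, (_, terms) in ONTOLOGY.items() if terms & found}


def ancestors(concept):
    """Return the chain of parent concepts, nearest first."""
    chain = []
    parent = ONTOLOGY[concept][0]
    while parent is not None:
        chain.append(parent)
        parent = ONTOLOGY[parent][0]
    return chain


print(concepts_in("books about painting"), ancestors("Painting"))
print(concepts_in("art of programming"))
```

The second query maps to both `Art` and `Programming`, so concept lookup alone does not settle which domain a phrase such as "art of programming" belongs to; some further disambiguation step is still needed.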

    There is one respect in which the semantic search tool defined here still cannot match general-purpose (non-semantic) search tools such as Google: scope. In Google you can search for any keyword in any domain. If the keywords appear in some document on the Web, Google will find it. A semantic search tool needs some prior knowledge: it needs to know meaning, which is represented in an ontology. In practice, ontologies in their current state are still built by hand, which limits their use for general purposes. Semantic search tools are therefore important tools for specific domains. In this situation, their purpose is to complement conventional search tools rather than to compete with them as rivals.


    Chapter 2: THEORETICAL BACKGROUND

    2.1. Information search strategies of search engines

    2.1.1. Some common search engines:

    Below is a list of several search engines. Why are they considered major search engines? Because they are well known and heavily used. For webmasters, the major search engines are the most important places to be listed, because they can send a very large volume of potential visitors to a site. For searchers, popular search tools usually return more reliable results. These search engines are also very likely to be well maintained and upgraded when necessary, to keep pace with the growth of the web.

    The following search engines are all good choices for starting a search for information:


    2.1.1.1. Google: http://www.google.com/

    Figure 1: The Google interface

    Google was originally a Stanford University project called BackRub, carried out by two students, Larry Page and Sergey Brin. By 1998 the name had been changed to Google, and the project became the private company Google. It is still in operation today.

    Google is a well-known search tool and a top choice for searching the web. This crawler-based (spider-based) service provides comprehensive coverage of the web together with good relevance. It is currently the best tool for finding whatever you are looking for.

    Google's search options focus mainly on web pages. Even so, from the search box on the Google home page you can also easily find images across the web, discussions held in Usenet newsgroups, news, and product listings.

    2.1.1.2. Yahoo: http://www.yahoo.com/

    Figure 2: The Yahoo interface

    Launched in 1994, Yahoo is the oldest directory on the web, a place where editors organize web sites into categories. In October 2002, however, Yahoo switched to crawler-based listings for its main results. It used Google's technology until February 2004. Today Yahoo uses its own search technology.

    The Yahoo Directory still exists. You will see category links below some of the web sites listed in the results of a keyword search. When offered, these links take you to a list of web sites that have been reviewed and approved by an editor.

    Technology from AltaVista and AllTheWeb was combined with that of Inktomi, a crawler-based search tool, to form the current Yahoo crawler.

    2.1.1.3. Ask Jeeves: http://www.askjeeves.com/

    Figure 3: The Ask Jeeves interface

    Ask Jeeves rose to fame in 1998 and 1999 as a "natural language" search tool that let you search by asking questions and returned what seemed to be the right answer to everything.

    In reality, technology was not what made Ask Jeeves perform well. Behind the scenes, the company at one point had about 100 editors who monitored the search logs. They then went onto the web and located the sites they considered the best matches for the most popular queries.

    2.1.1.4. AllTheWeb: http://www.alltheweb.com/

    Figure 4: The AllTheWeb interface

    Powered by Yahoo, AllTheWeb offers a "pure search" experience that is lighter, more customizable, and more pleasant than Yahoo itself. The focus is on web search, but news, image, video, MP3, and FTP search are also offered.


    2.1.1.5. Teoma: http://www.teoma.com/

    Figure 5: The Teoma interface

    Teoma is a crawler-based search tool owned by Ask Jeeves. Its index of web pages is smaller than those of Google and Yahoo. Teoma launched in 2000 and earned its success by returning relevant results. Its "Refine" feature suggests topics to explore after you perform a search.

    Teoma was bought by Ask Jeeves in September 2001 and also provides some of the results for that web site.


    2.1.1.6. HotBot: http://www.hotbot.com/

    Figure 6: The HotBot interface

    HotBot provides easy access to three major crawler-based search engines: Yahoo, Google, and Teoma. Unlike a meta search engine, it cannot blend the results from all of these crawlers together. Nevertheless, it is a fast, easy way to get different web search "opinions" in one place.
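For contrast, the blending that a meta search engine performs (and that HotBot, per the text above, does not) can be sketched as a round-robin interleave of ranked result lists. The function name `blend` and the single-letter URLs are placeholders; real meta search engines use more elaborate merging, so this is only one simple strategy, shown as an assumption.

```python
from itertools import zip_longest


def blend(*result_lists):
    """Interleave ranked lists round-robin, keeping the first copy of each URL."""
    merged, seen = [], set()
    # zip_longest yields the 1st result of every engine, then the 2nd, and so
    # on, padding exhausted lists with None.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical ranked results from three engines.
print(blend(["a", "b", "c"], ["b", "d"], ["e"]))
```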


    2.1.1.7. AltaVista: http://www.altavista.com/

    Figure 7: The AltaVista interface

    AltaVista was launched in September 1995 and for several years was regarded as the "Google" of its day: it provided relevant results and had a loyal group of users. After 1998, however, AltaVista fell out of favor, because the freshness of its listings and the coverage of its crawler were no longer kept up.

    Today AltaVista once again focuses on search. The results come from Yahoo, and the site also lets you search for images, MP3/audio, video, human-compiled directory listings, and news. If you want a lighter feel than Yahoo while still getting Yahoo's results, AltaVista is a good choice.


    2.1.1.8. Lycos: http://www.lycos.com/

    Figure 8: The Lycos interface

    Lycos is one of the oldest search engines on the web, launched in 1994. It is described as a web portal or hub: a place users go to get information in every area, including chat, e-mail, and so on.


    | Search engine | Google | AlltheWeb | AltaVista | Teoma |
    |---|---|---|---|---|
    | Database | google.com | alltheweb.com | altavista.com | teoma.com |
    | Size (number of pages) | About 8 billion (1 billion of them not full-text indexed) | About 3 billion, full-text indexed | About 1 billion | About 1 billion |
    | Multimedia | Supported | Supported | Supported | Not supported |
    | Operators: default | AND | AND | AND | AND |
    | Operators: exclusion | - | - | - | - |
    | Operators: phrase | Quotation marks | Quotation marks | Quotation marks | Quotation marks |
    | Operators: truncation | Not supported; * stands in for characters inside quotation marks | Not supported | * | Not supported |
    | Operators: Boolean | OR (proper nouns only) | AND, OR, ANDNOT, RANK, () | AND, OR, ANDNOT, NEAR, () | OR (proper names only) |
    | Stop words | Common words are usually ignored; use + or quotation marks to search for them | Use quotation marks in basic search | Ignored in advanced search | Common words are usually ignored; use + to search for them |
    | Proper nouns | Not supported | Not supported | Supported | Not supported |
    | Field limits | intitle:, inurl:, allintitle:, allinurl:, filetype:, link:, site:; in advanced search: cache:, info: | normal.title:, url.all:, link.all:, link.extension: | title:, domain:, link:, image:, text:, url:, host:, anchor:, applet: | intitle:, inurl:, site:, geoloc:, lang:, last:, afterdate: |
    | Special features | ~ finds synonyms. Limit by language. Many file types: pdf, doc, and others. Cached copy of each page as indexed. | URL browsing. In advanced search: limit by date, domain, IP address. | Limit by date, location, language. In advanced search: use sort by to filter and order results. | Refine to narrow results. Resources for pages and links focused on the search topic. |
    | Main strengths | Very good for highly popular pages. | Recent news pages. As good as Google. No stop words. | Many Boolean operators available. Advanced search can order results by term popularity. | Good popularity ranking, based on the number of same-topic pages linking to the page under consideration. Results are often encouraging. |

    Table 1: Quick reference for using several popular search engines

    | Search engine | Database | Operators | Search options | Miscellaneous |
    |---|---|---|---|---|
    | Google (http://www.google.com). Advanced search supported. Subject directory. Open Directory. | Full text of web pages, .pdf, .doc, .xls, .ps, .wpd (4.3B, plus 1B partially indexed URLs). News: updated frequently (4500 sources). Image files. Groups: Usenet from 1981 to the present. | AND (default). OR (proper nouns). + for common stop words, URLs, or specific pages (for example +edu). - to exclude. | * for truncation. Quotation marks for phrases. Fields: intitle:, inurl:, link:, site:. Search within the subject categories of the web directory. Find similar pages. ~ finds synonyms. | Spell checking. Cached copies of indexed pages, useful for pages that return 404 errors. Translation for 5 languages. |
    | AlltheWeb (http://alltheweb.com). Advanced search supported. | Full text of web pages, .pdf, Flash (3.1B fully indexed URLs). News: updated frequently (3000 sources). Images. Video. Audio. FTP. | AND (default). OR, with the terms enclosed in parentheses. ANDNOT, RANK. - to exclude. | No truncation. Quotation marks for phrases. Fields: intitle:, inurl:, link:, site:. In advanced search: limit by date, language, domain, file format, IP address. | Spell checking. Advanced search for images and video. Clustering to refine the query. |
    | AltaVista (http://altavista.com). Advanced search supported. Subject directory. Open Directory. | Full text of web pages (about 1B) and .pdf files. News (3000 sources), images, MP3/audio, video. | AND (default). In advanced search, or as proper nouns in basic search: AND, OR, ANDNOT, NEAR, nested (). - to exclude. | * for truncation. Quotation marks for phrases. Advanced search: limit by date, language. | Spell checking. Translation: 8 European languages and several Asian languages. AltaVista Prisma: refines the question. |
    | Teoma (http://teoma.com). Advanced search supported. | Full text of web pages (about 1B). | AND (default). OR (proper nouns). + or quotation marks for stop words. - to exclude. | No truncation. Quotation marks for phrases. Fields: intitle:, inurl:, site:, geoloc:, lang:, last:, afterdate:, beforedate:, betweendate:. In advanced search: limit by date, language, domain, file format, IP address. | Spell checking. Result grouping: Refine to narrow the question. Resources for pages or links focused on the topic. |
    | AskJeeves (www.ask.com) | Results come from the Teoma database. Product search: PriceGrabber.com. Image search: Picsearch.com. News search: Moreover.com. | Same as Teoma. For simple questions, a dialog window appears. | Same as Teoma. | Click Remove Frame to see the URLs of the pages. Spell checking. |
    | AskJeeves for Kids (www.ajkids.com) | Answers simple questions well. Games for children. News by age group. | Ask in natural language. Boolean operators are not used. | | Click No frames to see the URL of a result page. Links to study pages: dictionaries, physics, science, maps, history, and so on. |
    | Yahoo (http://dir.yahoo.com) | Reviewed web pages (about 13K). | AND (default). OR. | Phrases: quotation marks. Truncation: *. Fields: t: for title, u: for URL. | Many services within Yahoo. News: hourly. Sports: scores and more. Maps, weather, shopping. |

    Table 2: Overview of the characteristics of several common Internet search engines

    2.1.2. Search strategy

    The term "search engine" is widely used to describe both crawler-based search tools and human-edited directories. These two kinds of search engine compile their listings in completely different ways.

    A crawler-based search engine has three parts:

    Information gatherer (robot)

    A robot is a program that automatically traverses the hyperlink structure to collect documents, recursively retrieving every document linked from the current one.

    Robots are known by many names: spider, web wanderer, web worm, crawler, and so on. These names are sometimes misleading: "spider" and "wanderer" suggest that the robot itself travels, and "worm" suggests a virus. In essence, a robot is just a program that browses sites and collects information from them using standard web protocols. Ordinary browsers are not considered robots because they lack autonomy: they browse the web only when a person drives them.
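    The recursive traversal described above can be sketched as a breadth-first walk over links. This is a minimal illustration, not the crawler of any particular engine: the `fetch_links` callback and the toy `WEB` link graph are assumptions made here so that the example runs without network access.

```python
from collections import deque

def crawl(start_url, fetch_links, max_pages=100):
    """Visit pages breadth-first by following links from start_url.

    fetch_links is a caller-supplied (hypothetical) function that returns
    the URLs linked from a given page.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)          # a real robot would store the page here
        for link in fetch_links(url):
            if link not in seen:     # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited

# Toy link graph standing in for the web (made-up page names).
WEB = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
order = crawl("a", lambda url: WEB.get(url, []))
print(order)
```

    The `seen` set is what keeps the robot from looping forever on pages that link back to each other, such as "c" and "a" in the toy graph.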

    Indexer (index)

    The indexing system, also called the data analysis and processing system, analyzes the data collected by the robot, extracts the necessary information (usually single words, compound words, and important phrases), and organizes it into its own database so that searches on it are fast and efficient. The index is a list of keywords that records which pages, at which addresses, each keyword appears on.
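    A keyword list that records where each keyword appears is commonly called an inverted index. The sketch below assumes whitespace tokenization and uses made-up example.org addresses; a real indexer would also handle compound words and phrases.

```python
from collections import defaultdict

def build_index(pages):
    """Map each keyword to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

# Hypothetical pages: address -> text collected by the robot.
PAGES = {
    "http://example.org/1": "semantic web search",
    "http://example.org/2": "web crawler",
}
index = build_index(PAGES)
print(sorted(index["web"]))
```

    Looking up a keyword is then a single dictionary access, which is what makes searching the index fast compared with scanning every page.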

    Searcher (search engine)

    "Search engine" refers to the whole system: the information gatherer, the indexer, and the searcher. These components run continuously from system start-up; they depend on one another for data but operate independently.

    The search engine interacts with the user through a web interface; its job is to accept requests and return the documents that satisfy them.

    Put simply, keyword search looks for the pages in which the words of the query occur most often, excluding stop words (very common words such as the articles a, an, the). The more often a query word occurs in a page, the more likely the page is to be returned to the user. A page that contains all the query words ranks above a page that is missing one or more of them. Today most search engines support basic and advanced search; search for single words, compound words, phrases, and proper nouns; and restriction of the search scope, for example to headings, titles, or the descriptive text of a web page.
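    The two ranking rules above (pages containing more of the query words come first, then pages with more occurrences) can be sketched as follows. The stop-word list, the `score` tuple, and the sample documents are assumptions for illustration; real engines combine many more signals.

```python
STOPWORDS = {"a", "an", "the"}

def score(query, text):
    """Return (distinct query terms present, total occurrences)."""
    words = text.lower().split()
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    counts = [words.count(t) for t in terms]
    matched = sum(1 for c in counts if c > 0)
    return (matched, sum(counts))

def rank(query, pages):
    """Order page names best first; pages with no query term are dropped."""
    scored = [(score(query, text), name) for name, text in pages.items()]
    scored = [(s, name) for s, name in scored if s[1] > 0]
    ordered = sorted(scored, key=lambda item: item[0], reverse=True)
    return [name for _, name in ordered]

# Made-up documents.
DOCS = {
    "p1": "the semantic web extends the web",
    "p2": "web web web pages",
    "p3": "cooking recipes",
}
ranking = rank("the semantic web", DOCS)
print(ranking)
```

    Here p1 outranks p2 even though both contain three occurrences of query words, because p1 covers both "semantic" and "web" while p2 covers only one of them.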

    Beyond exact keyword matching, search engines also try to "understand" the real meaning of a question from the words the user supplies. This shows in features such as spelling correction and matching different forms of a word. For example, the search engine also looks for words such as speaker, speaking, and spoke when the user enters speak.
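    One simple way to approximate this behaviour is to expand the query word with its known variants before matching. The hand-written `VARIANTS` table below is an assumption for illustration only; production systems typically rely on a stemmer or a morphological dictionary.

```python
# Hand-written variant table (illustrative, not a real morphological resource).
VARIANTS = {"speak": ["speaker", "speaking", "spoke"]}

def expand_query(word):
    """Return the word followed by its known variant forms."""
    return [word] + VARIANTS.get(word, [])

def matches(word, text):
    """True if the text contains the word or any of its variants."""
    tokens = set(text.lower().split())
    return any(form in tokens for form in expand_query(word.lower()))

print(expand_query("speak"))
print(matches("speak", "She spoke softly"))
```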


    Operating principle

    The search engine directs the robot to gather information on the network by following hyperlinks. When the robot discovers a new site, it sends the documents (web pages) back to the main server, which builds the index database used to answer searches.

    Because information on the network changes constantly, the robot must keep revisiting the sites it already knows. The update frequency depends on the particular search engine. When the search engine receives a query from a user, it analyzes the query, searches the index database, and returns the documents that satisfy the request.

    2.2. Semantic Web

    2.2.1. Concept

    The Semantic Web is an extension of the current web that lets us find, share, combine, reuse, and extract information accurately and easily (Tim Berners-Lee, XML 2000).

    The Semantic Web is a web of information linked in such a way that machines can process it easily on a global scale. It can be thought of as a globally linked database.

    The Semantic Web was developed by Tim Berners-Lee, the inventor of the WWW, URIs, HTTP, and HTML. A research group at the World Wide Web Consortium (W3C) is currently improving, extending, and standardizing the system.

    Data in HTML files is useful in some contexts. However, most data on the web is in HTML form and is therefore hard to use on a large scale, because there is no global system for publishing data.

    For this reason, the Semantic Web is regarded as a technical solution.

    The Semantic Web is built mainly on syntaxes that use URIs to represent data, usually in structures based on triples (subject, predicate, object). Many triples of URI data can be stored in databases or interchanged on the World Wide Web using a set of syntaxes developed specifically for this task. These are called RDF syntaxes.

    The Semantic Web requires data that machines can not only read but also understand. In the words of Tim Berners-Lee:

    "The semantic web goal is to be a unifying system which will (like the web for human communication) be as un-restraining as possible so that the complexity of reality can be described."

    With the Semantic Web, it becomes easy to realize a whole range of tools and applications that are hard to build within the framework of the current web.

    Two technologies are important for developing the Semantic Web: eXtensible Markup Language (XML) and Resource Description Framework (RDF). XML lets anyone create their own tags. RDF expresses meaning: it uses sets of triples to describe basic concepts.

    URI (Uniform Resource Identifier):

    A URI is simply a web identifier, like the strings beginning with "http" or "ftp" that you often see on the World Wide Web. Anyone can create a URI, and ownership of URIs is clearly delegated, so they form a sound conceptual basis for building a global web. In fact, the World Wide Web can be viewed this way: anything that has a URI is considered to be "on the web".

    URIs are character strings that identify resources on the web. By using URIs, we can apply the same simple naming scheme to refer to resources under different protocols, such as HTTP, FTP, GOPHER, and e-mail.

    URLs (Uniform Resource Locators) are a widely used form of URI. They are very common on the web and serve as the addresses of resources. Although URIs are best known in the form of URLs, they can also refer to concepts in the Semantic Web. For example, suppose you have a book titled "Machine Learning"; its URI could be:

    http://www.cs.bris.ac.uk/home/pw2538/book/title#machinelearning

    Note that everything on the web has a URI that identifies it uniquely.
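    To make the parts of such an identifier concrete, the book URI above can be split with Python's standard `urllib.parse` module. This only illustrates the generic URI syntax (scheme, authority, path, fragment); it does not fetch anything.

```python
from urllib.parse import urlsplit

uri = "http://www.cs.bris.ac.uk/home/pw2538/book/title#machinelearning"
parts = urlsplit(uri)

print(parts.scheme)    # protocol used to name the resource
print(parts.netloc)    # authority that owns this part of the name space
print(parts.path)      # hierarchical path below the authority
print(parts.fragment)  # identifier within the resource
```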

    2.2.2. Architecture

    The Semantic Web follows a layered architecture with seven layers, as follows:

    Figure 9: Layered architecture of the Semantic Web.

    Unicode + URI layer:

    Ensures the use of international character sets and provides the means to identify objects in the Semantic Web.

    XML + NS + XML Schema layer:

    Together with the namespace and schema definitions, ensures that Semantic Web definitions can be integrated with other XML-based standards.

    RDF + RDF Schema layer:

    Uses metadata to describe documents on the web so that machines can understand them.

    Ontology layer:

    RDF Schema provides tools for defining the vocabulary, structure, and constraints used to describe metadata about web resources. However, RDF Schema is not sufficient for modelling and for supporting reasoning on the Semantic Web. The ontology language OIL was proposed as an extension of RDF Schema. It allows formal semantics to be expressed and so supports automated inference.

    Logic layer:

    The logic layer is regarded as a rule base on the Semantic Web. In essence this rule base resembles an expert system. The layer supports services such as text classification and data extraction.

    Proof layer:

    While the logic layer supports rule-based reasoning, the proof layer is used to justify the system's inferences by linking the facts together.

    Trust layer:

    In the Semantic Web, information is shared like a global database, so some means of securing it is needed. This is the motivation for digital signatures, which make information on the web more trustworthy. The trust engine is a system being built on top of digital signatures. The techniques for building it are still being researched and tested.

    2.2.3. Challenges facing the Semantic Web

    2.2.3.1. Challenge 1: The availability of content

    Semantic Web content is web content annotated according to particular ontologies, which define the meaning of the words or concepts that appear in that content. Some simple extensions to HTML are used to annotate web pages with ontology information. Creating Semantic Web content is a major challenge because the Semantic Web infrastructure is still under construction (RDF, OIL, DAML+OIL, and others are not yet complete), and very little Semantic Web content is currently available.

    2.2.3.2. Challenge 2: Ontology availability, development, and evolution

    Ontologies are key to the Semantic Web because they carry the meaning contained in it: they provide the vocabulary and the semantics of the annotations. Three main issues must be addressed here. The first two concern traditional ontology development and remain unsolved to this day; the third relates more to the new context of the Semantic Web.

    The first issue is the construction of kernel ontologies to be used by all domains. Initial efforts to build such kernel ontologies exist; the requirement is that they be applicable across different domains.

    The second issue is methodological and technological support for most activities in the ontology development process, including:

    a. Knowledge acquisition, conceptual modelling, and ontology encoding in Semantic Web languages (RDFS, OIL, DAML+OIL) and in the new languages that may appear in the coming years [Maedche, Staab 2001].

    b. Ontology alignment and mapping, ontology integration, ontology translation tools, and ontology construction tools, for cases where existing ontologies are to be reused [Fensel et al, 2001], [Noy, Musen 2000].

    c. Consistency checking tools for reused ontologies [Gomez-Perez 1996].

    The third issue is the evolution of ontologies and their relationship to data that has already been annotated. Configuration management tools are needed to control the versions of each ontology as well as the interdependencies between ontologies and annotations. None of these issues may seem very important, but they must be solved before a real Semantic Web can come into being.

    2.2.3.3. Challenge 3: Scalability of Semantic Web content

    Once Semantic Web content exists, we must consider how to manage it: how to organize it, where to store it, and how to find the right content. This challenge has two main aspects:

    a. The first concerns the storage and organization of Semantic Web pages. The basic Semantic Web consists of ontology-annotated pages whose link structure mirrors that of the WWW; that is, pages link to other pages through hyperlinks. Hyperlinks alone do not fully exploit the semantics of Semantic Web pages. Semantic indexes have been proposed to group Semantic Web content by specific topic. Semantic indexes will be generated automatically from ontology information and annotated documents.

    b. The second concerns finding information on the Semantic Web easily; in other words, it concerns the coordination between semantic indexes.

    2.2.3.4. Challenge 4: Multilinguality

    Studies of language distribution across WWW content show that even though English is the dominant language for documents, resources written in other languages are also significant: English 68.4%; Japanese 5.9%; German 5.8%; Chinese 3.9%; French 3.0%; Spanish 2.4%; Russian 1.9%; Italian 1.6%; Portuguese 1.4%; Korean 1.3%; other languages 4.6% [www.vilaweb.com]. Language diversity matters even more when it comes to WWW resources. Multilinguality plays an increasingly large role at three levels: the ontology level, the annotation level, and the user interface level.

    At the ontology level, ontology designers may want to use their native language to develop the ontologies on which annotations will be based. Since not all users are ontology builders, this level has the lowest priority. Existing multilingual and linguistic resources, such as WordNet [wordnet] and EuroWordnet [eurowordnet], can be examined closely to support multilinguality at this level.

    At the annotation level, content can be annotated in many different languages. Since many users (especially content providers) would rather annotate content than develop ontologies, content providers need suitable support for annotating content in their native language. If Semantic Web content is to be produced to its full potential, we cannot require content in French to be annotated in German, or vice versa.

    Finally, at the user interface level, billions of people want to access relevant content in their native language regardless of the source language, that is, the language in which the annotations are written. Although most content today is written in English, we expect more and more content to be written in other languages. Any approach to the Semantic Web should include facilities for accessing information in multiple languages. Internationalization and localization technologies should be considered carefully for personalized information access based on the user's native language.

    2.2.3.5. Challenge 5: Visualization

    As the volume of information grows dramatically, intuitive visualization of information becomes very important, because users will increasingly demand an easy way to recognize whether content suits their purposes. Beyond the use of semantic indexes and routers for storing, organizing, and finding information, a major step forward in visualization will be required. New three-dimensional and visualization technologies should make it possible to visualize Semantic Web content written in any current web language (RDFS, OIL, DAML+OIL). With adequate real-time 3D graphics and the exploitation of semantic relationships, a new three-dimensional interface can be generated automatically. In this way more information can be presented in a smaller space, and users can interact with sites in a realistic and convenient manner [Van Harmelen et al 2001].

    2.2.3.6. Challenge 6: Standardization of Semantic Web languages

    The Semantic Web is an emerging field, and the World Wide Web Consortium will issue recommendations on the languages and technologies to be used. To advance the state of the art in the Semantic Web, and because tools depend largely on the Semantic Web languages they support, standardizing those languages is a necessity.

    2.2.4. Comparing the Web and the Semantic Web

    What the Web and the Semantic Web have in common: both use URI links, but the Semantic Web uses them far more heavily, and this use of links increases the accuracy of information.

    The basic differences between the Web and the Semantic Web:

    | Semantic Web | Web |
    |---|---|
    | An information space in which information is expressed in a language that both machines and people can understand. | An information space whose information is expressed only in natural language, which only people can understand. |
    | Data linked together semantically and formally. | A collection of information linked together informally. |

    2.2.5. Related concepts

    2.2.5.1. Metadata

    Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data (a data dictionary) or information about information.

    As "information about information", metadata is widely used in the real world for searching. Suppose you want to borrow a few books from a library using a computer. The library usually provides a catalogue system that lets you list books by author, by title, by subject, and so on. The listing contains important information such as the author's name, the title, the ISBN, and, most important of all, where the book is shelved. You need one piece of information (here, where the book is shelved), and you use metadata (here, the author's name, the title, and the subject) to get to the book.

    There are three kinds of metadata:

    a. Descriptive metadata: describes a resource for purposes such as discovery or identification. It can include elements such as title, abstract, author, and keywords.

    b. Structural metadata: indicates, for example, how compound objects are put together and how pages are arranged into chapters.

    c. Administrative metadata: provides information that helps manage a resource, such as when and how it was created, its file type and other technical information, and who may access it.
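    The three kinds of metadata can be pictured as groups of fields on a single catalogue record. The record layout, the field names, and the sample values below are assumptions chosen for illustration; they are not taken from any library system.

```python
from dataclasses import dataclass, field

@dataclass
class BookRecord:
    # Descriptive metadata: used to discover and identify the resource.
    title: str
    author: str
    keywords: list = field(default_factory=list)
    # Structural metadata: how the parts fit together (chapter -> pages).
    chapters: dict = field(default_factory=dict)
    # Administrative metadata: information for managing the resource.
    created: str = ""
    file_type: str = ""
    allowed_readers: list = field(default_factory=list)

def find_by_author(records, author):
    """Return the titles of all records whose author matches."""
    return [r.title for r in records if r.author == author]

# Made-up catalogue entries.
catalog = [
    BookRecord("Machine Learning", "A. Author", ["learning"],
               {"Chapter 1": [1, 2, 3]}, "2005-07-01", "pdf", ["staff"]),
    BookRecord("Earthquakes for lunch", "B. Writer"),
]
titles = find_by_author(catalog, "A. Author")
print(titles)
```

    The lookup uses only descriptive metadata, mirroring the library example in which the reader searches by author to reach the book itself.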

    2.2.5.2. Namespace

    We can extend our vocabulary through namespaces, which are groups of element names and attribute names. Suppose you want to include, in an XML document, a symbol defined in some other markup language; you can declare the namespace to which the symbol belongs. Namespaces also let us avoid the situation in which two XML objects have the same name but different meanings, by placing them in different namespaces. The solution is to attach a prefix that identifies the namespace to which each element or attribute belongs. The namespace syntax is:

    ns-prefix:local-name

    Here ns-prefix is the name of the namespace, and local-name is the name of the element or attribute.

    Namespace example:

    The XML document below is a library of books. We start with a root element for the library; inside the root tag are book elements and title elements, as follows:

    Earthquakes for lunch

    Local namespace:

    The xmlns attribute can be placed on the root element or on any other tag. When the attribute is not on the root tag, we call it a local namespace.

    Example: consider the XML fragment below:

    Earthquakes for lunch.

    Earthquakes for lunch.

    In this example the namespace xmlns:amazon=http://www.amazon.com.lib is called a local namespace.
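    A small runnable illustration of a local namespace follows. The element names `library`, `book`, and `title` are assumptions made for this sketch; only the `amazon` prefix, its URI, and the book title come from the text. The document is parsed with Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the second book declares a local namespace.
DOC = """
<library>
  <book>
    <title>Earthquakes for lunch</title>
  </book>
  <amazon:book xmlns:amazon="http://www.amazon.com.lib">
    <amazon:title>Earthquakes for lunch.</amazon:title>
  </amazon:book>
</library>
""".strip()

root = ET.fromstring(DOC)
ns = {"amazon": "http://www.amazon.com.lib"}

plain = root.findall("book/title")                      # elements without a namespace
qualified = root.findall("amazon:book/amazon:title", ns)  # elements in the amazon namespace

print(plain[0].tag)       # local name only
print(qualified[0].tag)   # namespace URI in braces, then local name
```

    Although both title elements share the local name `title`, the parser keeps them apart because the qualified one carries the namespace URI as part of its name.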

    2.2.6. Ontology

    The term ontology is borrowed from philosophy. Its original meaning is "the branch of metaphysics that deals with the nature of being" [The American Heritage Dictionary of the English Language: Fourth Edition (2000)].

    Ontology is an important backbone technology because it provides one key capability: an ontology bridges the formal semantics that computers can understand and the real-world semantics that people can understand.

    Ontologies were developed in artificial intelligence to make knowledge easy to share and reuse. Since the early 1990s, ontology has been a popular research topic in artificial intelligence communities, including knowledge engineering, natural language processing, and knowledge representation.

    An ontology not only makes knowledge easier to reuse; it is also a foundation for creating standards, because it clarifies the concepts behind a term or a model. In practice the need is not for a single concept but for handling the ambiguous interaction among complex, detailed concepts (which may be expressed in many different languages).

    Recently the notion of ontology has become much more widespread in areas such as intelligent information integration, cooperative information systems, information retrieval, electronic commerce, and knowledge management. Because an ontology aims at domain knowledge, its development is usually a process that involves many other factors.

    Ontology has been given many definitions since it first appeared. Its core characterization, however, remains: "An ontology is an explicit, formal specification of a shared conceptualization". In this definition:

    "Conceptualization" refers to an abstract model of some phenomenon in the real world that identifies the relevant concepts of that phenomenon.

    "Explicit" means that the concepts and the constraints on them are stated clearly.

    "Formal" refers to the requirement that the ontology be machine-understandable.

    "Shared" reflects that an ontology captures consensual knowledge; that is, it is not restricted to one individual or one isolated group.

    Several large ontologies exist today, such as CYC and WordNet.

    Example of an ontology:

    Figure 10: A simple ontology

    2.2.7. RDF

    2.2.7.1 Concept:

    RDF stands for Resource Description Framework. RDF is recommended by the W3C as a standard metadata model and language. It is a framework for describing resources on the web.

    RDF provides a data model and a syntax so that independent parties can exchange and use RDF data.

    2.2.7.2 Structure:

    RDF is a framework for processing metadata; it describes the relationships between resources through properties and values. RDF is built on the following notions:

    Resource: everything described by an RDF expression is called a resource. Every resource has a URI, and it may be an entire web page or a part of one.

    Property: "A property is a specific aspect, characteristic, attribute, or relation used to describe a resource" (W3C, Resource Description Framework (RDF) Model and Syntax Specification). Note that a property can itself be a resource, because it has properties of its own.

    Statement: a statement combines a resource, a property, and a value of that property. These three parts are known as the subject, the predicate, and the object. For example, "The Author of http://www.cs.bris.ac.uk/home/pw2538/index.html is Peng Wang" is a statement. Note that the value in a statement can be a character string or another resource.

    RDF example:

    A statement can be viewed as a graph in RDF. Consider the statement:

    The Author of http://www.cs.bris.ac.uk/home/pw2538/index.html is Peng Wang

    The sentence breaks down into three parts:

    Subject (Resource): http://www.cs.bris.ac.uk/home/pw2538/index.html
    Predicate (Property): Author
    Object (Literal): Peng Wang
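    The subject, predicate, and object of the statement can be held in a plain three-field tuple. The `Statement` type and the `objects` helper below are names invented for this sketch; they are not part of RDF or of any RDF library.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property
    obj: str        # the value: a literal or another resource

PAGE = "http://www.cs.bris.ac.uk/home/pw2538/index.html"
statements = [Statement(PAGE, "Author", "Peng Wang")]

def objects(statements, subject, predicate):
    """Return every value recorded for the given subject and predicate."""
    return [s.obj for s in statements
            if s.subject == subject and s.predicate == predicate]

print(objects(statements, PAGE, "Author"))
```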

    c biu di-n di dng 1 th nh sau:

  • ti: Tm kim ng ngh a ng d!ng trn l nh vc eDoc

    0112274 Phm Th M Phng - 48 - 0112398 T Th Ngc Thanh

    Chiu c$a m/i tn lun hng t subject n object c$a pht biu ( statement).

    V 1 th c th c theo cch sau: HAS , v d!:

    http://www.cs.bris.ac.uk/home/pw2538/index.html has author Peng Wang.

    Nu chng ta gn mt URI cho thuc tnh author, th s* c :

    http://www.cs.bris.ac.uk/home/pw2538/terms/author

    For brevity, we introduce prefixes so that the full URI does not have to be written out every time it is referenced. The following prefixes are bound to widely used URIs:

    rdf: namespace prefix for the URI http://www.w3.org/1999/02/22-rdf-syntax-ns#
    rdfs: namespace prefix for the URI http://www.w3.org/2000/01/rdf-schema#
    daml: namespace prefix for the URI http://www.daml.org/2001/03/daml+oil#
    xsd: namespace prefix for the URI http://www.w3.org/2001/XMLSchema#

    In this example we use the namespace prefix pwterms for the URI we refer to: http://www.cs.bris.ac.uk/home/pw2538/terms
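    The prefix mechanism amounts to a lookup table from prefix to namespace URI. The following is a minimal sketch; the expand and compact helpers are hypothetical names, and the trailing slash on the pwterms namespace is an assumption made so that pwterms:author expands to the author URI given earlier.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "daml": "http://www.daml.org/2001/03/daml+oil#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    # Assumption: a trailing slash is appended so that prefix + local name
    # yields .../terms/author.
    "pwterms": "http://www.cs.bris.ac.uk/home/pw2538/terms/",
}


def expand(qname: str, prefixes: dict = PREFIXES) -> str:
    """Turn a prefixed name such as 'rdf:type' into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {qname!r}")
    return prefixes[prefix] + local


def compact(uri: str, prefixes: dict = PREFIXES) -> str:
    """Turn a full URI back into a prefixed name when a namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no known namespace: leave the URI unchanged


print(expand("pwterms:author"))
print(compact("http://www.w3.org/2000/01/rdf-schema#Class"))
```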

    The RDF syntax for the statement "The Author of http://www.cs.bris.ac.uk/home/pw2538/index.html is Peng Wang" is then:

    [RDF/XML listing, 7 lines; the markup was lost in extraction and only the literal value survives: Peng Wang]
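    As an illustration of what such a serialization looks like, the sketch below builds an RDF/XML document for the statement with Python's standard xml.etree.ElementTree module. It is not a reconstruction of the thesis's original listing: the statement_to_rdfxml function is a hypothetical helper, and the trailing slash on the pwterms namespace is an assumption.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: trailing slash added to the pwterms namespace URI.
PWTERMS_NS = "http://www.cs.bris.ac.uk/home/pw2538/terms/"

# Ask ElementTree to emit readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("pwterms", PWTERMS_NS)


def statement_to_rdfxml(subject_uri: str, property_name: str, literal: str) -> str:
    """Serialize one (subject, property, literal) statement as RDF/XML."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": subject_uri},  # the subject
    )
    prop = ET.SubElement(description, f"{{{PWTERMS_NS}}}{property_name}")  # the predicate
    prop.text = literal  # the object
    return ET.tostring(root, encoding="unicode")


xml_text = statement_to_rdfxml(
    "http://www.cs.bris.ac.uk/home/pw2538/index.html", "author", "Peng Wang"
)
print(xml_text)
```

    The rdf:Description element carries the subject in its rdf:about attribute, the child element names the property, and the element text is the literal object.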


    Another statement: "A person with student ID pw2538 is named Peng Wang and has the email address [email protected]. This person is the author of the resource http://www.cs.bris.ac.uk/home/pw2538/index.html."

    Its graph: [figure not preserved]

    Its RDF syntax: [listing not preserved]


    The RDF data model:

    RDF provides a model for describing resources. Resources have properties (attributes or characteristics). RDF defines a resource as any object that is uniquely identifiable by a URI. The properties associated with resources are identified by property types, and property types have corresponding values. Property types express the relationship between values and the resources they are associated with. In RDF, values may be atomic in nature (text strings, numbers, and so on) or may be other resources. At its core, RDF is a syntax-independent model for representing resources and their corresponding descriptions.


    Figure 11: The RDF data model

    The RDF data model is a directed labeled graph in which the nodes are resources (entities with URIs) or literals, and the edges are properties. As introduced above, an RDF statement is a triple (subject, predicate, object). The resource that is the subject of a statement has a property whose value is the object of that statement. An object can be a resource or a literal value. A statement can be represented as a graph by drawing an arc from one node (the subject) to another node (the object).
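    The directed labeled graph can be sketched as a small in-memory triple store. Everything here is illustrative: the Graph class is a hypothetical helper rather than a real RDF library, and the person identifier urn:example:pw2538 and the property pwterms:name are made-up names for the example.

```python
from collections import defaultdict


class Graph:
    """A directed labeled graph: nodes are subjects/objects, edge labels are predicates."""

    def __init__(self):
        # subject -> list of (predicate, object) outgoing arcs
        self._out = defaultdict(list)

    def add(self, subject, predicate, obj):
        """Draw one arc from subject to obj, labeled with predicate."""
        self._out[subject].append((predicate, obj))

    def triples(self, subject=None, predicate=None, obj=None):
        """Yield every triple matching the pattern; None acts as a wildcard."""
        for s, arcs in self._out.items():
            for p, o in arcs:
                if ((subject is None or s == subject)
                        and (predicate is None or p == predicate)
                        and (obj is None or o == obj)):
                    yield (s, p, o)


PAGE = "http://www.cs.bris.ac.uk/home/pw2538/index.html"
PERSON = "urn:example:pw2538"  # hypothetical identifier for the person node

g = Graph()
g.add(PAGE, "pwterms:author", PERSON)        # object is a resource
g.add(PERSON, "pwterms:name", "Peng Wang")   # object is a literal

print(list(g.triples(predicate="pwterms:author")))
```

    Because an object can itself be a resource, the person node is both the object of the first arc and the subject of the second.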

    RDF is a foundation for processing metadata: it provides interoperability between applications that exchange machine-understandable information on the web. RDF emphasizes facilities that enable automated processing of web resources.

    2.2.7.3 RDF Schema: a vocabulary description language

    The language defined in the RDF Schema specification consists of a collection of resources that can be used to describe the properties of other RDF resources (including properties) that define application-specific RDF vocabularies. The core vocabulary is defined in a namespace called rdfs, identified by the URI reference http://www.w3.org/2000/01/rdf-schema#. The specification also uses the prefix rdf to refer to the core RDF namespace: http://www.w3.org/1999/02/22-rdf-syntax-ns#.

    The class and property system of RDF Schema is similar to the type systems of object-oriented languages such as Java. It differs, however, in that instead of defining a class in terms of the properties its instances may have, RDF Schema describes properties in terms of the classes of resource to which they apply. This is the role of rdfs:domain and rdfs:range, described in the specification. For example, we could define the property eg:author with a domain of eg:Document and a range of eg:Person, whereas a classical object-oriented system would typically define a class eg:Book with an attribute called eg:author of type eg:Person.
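    The property-centric view can be made concrete with a small sketch of how a reasoning application might use rdfs:domain and rdfs:range. The SCHEMA table, the infer_types function, and the resources eg:doc1 and eg:person1 are hypothetical names chosen for this example; only eg:author, eg:Document, and eg:Person come from the text.

```python
# Schema: each property declares the class of its subjects (domain)
# and the class of its values (range).
SCHEMA = {
    "eg:author": {"domain": "eg:Document", "range": "eg:Person"},
}


def infer_types(triples, schema=SCHEMA):
    """Return the (resource, class) pairs implied by domain and range declarations."""
    inferred = set()
    for subject, predicate, obj in triples:
        declaration = schema.get(predicate)
        if declaration is None:
            continue  # nothing is known about this property
        inferred.add((subject, declaration["domain"]))  # subject is in the domain class
        inferred.add((obj, declaration["range"]))       # object is in the range class
    return inferred


data = [("eg:doc1", "eg:author", "eg:person1")]
print(sorted(infer_types(data)))
```

    Note the direction of the reasoning: nothing declares eg:doc1 to be a document; its class follows from the property it is used with.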

    The Domain and Range vocabulary

    The specification introduces an RDF vocabulary for describing the meaningful use of properties and classes in RDF data. For example, an RDF schema might describe limitations on the types of values that are appropriate for some property.

    RDF Schema provides a mechanism for describing this information, but it does not say whether or how an application should use it. Different applications will use the information in different ways. For example, data-checking tools might use it to discover errors in a dataset, an interactive editor might suggest appropriate values, and a reasoning application might use it to infer additional information from the original data.

    RDF schemas can describe relationships between vocabulary items from multiple independently developed schemas. Because URI references are used to identify classes and properties on the web, it is possible to create new properties whose domain or range is defined in another namespace.

    The specification does not attempt to enumerate all the possible forms of vocabulary description that could be used to represent the meaning of RDF classes and properties. Instead, the RDF vocabulary description strategy acknowledges that there are many techniques through which the meaning of classes and properties can be conveyed, and it publishes a set of conventions for using RDF/XML to describe the characteristics of RDF classes and properties.

    Richer schema or ontology languages such as DAML+OIL and the W3C's ontology language, rule-based inference languages, and other formalisms will each contribute to our ability to capture meaningful generalizations about data on the web. RDF vocabulary designers can create and deploy Semantic Web applications using the basic RDF Schema 1.0 facilities, while exploring richer vocabulary description languages that share this general approach.

    Overview of RDF Schema

    The tables in this section give an overview of the basic RDF vocabulary.

    Class name | Comment
    rdfs:Resource | The class resource, everything.
    rdfs:Literal | This represents the set of atomic values, e.g. textual strings.
    rdfs:XMLLiteral | The class of XML literals.
    rdfs:Class | The concept of Class.
    rdf:Property | The concept of a property.
    rdfs:Datatype | The class of datatypes.
    rdf:Statement | The class of RDF statements.
    rdf:Bag | An unordered collection.
    rdf:Seq | An ordered collection.
    rdf:Alt | A collection of alternatives.
    rdfs:Container | This represents the set of containers.
    rdfs:ContainerMembershipProperty | The container membership properties, rdf:_1, rdf:_2, ..., all of which are sub-properties of 'member'.
    rdf:List | The class of RDF lists.

    Table 3: Classes in RDF

    Property name | Comment | Domain | Range
    rdf:type | Indicates membership of a class | rdfs:Resource | rdfs:Class
    rdfs:subClassOf | Indicates specialization of classes | rdfs:Class | rdfs:Class
    rdfs:subPropertyOf | Indicates specialization of properties | rdf:Property | rdf:Property
    rdfs:domain | A domain class for a property type | rdf:Property | rdfs:Class
    rdfs:range | A range class for a property type | rdf:Property | rdfs:Class
    rdfs:label | Provides a human-readable version of a resource name | rdfs:Resource | rdfs:Literal
    rdfs:comment | Use this for descriptions | rdfs:Resource | rdfs:Literal
    rdfs:member | A member of a container | rdfs:Container | not specified
    rdf:first | The first item in an RDF list. Also often called the head. | rdf:List | not specified
    rdf:rest | The rest of an RDF list after the first item. Also often called the tail. | rdf:List | rdf:List
    rdfs:seeAlso | A resource that provides information about the subject resource | rdfs:Resource | rdfs:Resource
    rdfs:isDefinedBy | Indicates the namespace of a resource | rdfs:Resource | rdfs:Resource
    rdf:value | Identifies the principal value (usually a string) of a property when the property value is a structured resource | rdfs:Resource | not specified
    rdf:subject | The subject of an RDF statement | rdf:Statement | rdfs:Resource
    rdf:predicate | The predicate of an RDF statement | rdf:Statement | rdf:Property
    rdf:object | The object of an RDF statement | rdf:Statement | not specified

    Table 4: Properties in RDF

    (The RDF vocabulary terms are described in Appendix [1].)

    2.3. eDoc

    2.3.1. Overview of eLearning

    2.3.1.1. Concept

    eLearning, also called online learning, is a general term covering all forms of learning.


    Online learning involves the use of network technologies (such as the Internet or a business network) to deliver, support, and assess both formal and informal teaching and learning.

    Where and how does learning take place? Through online resources and materials, electronic libraries, and documents, as well as courses, discussion sessions, chats, email, conferences, and knowledge-sharing applications. An important point is that online learning does not necessarily have to take place online. Technology is often a supplement to classroom teaching and face-to-face learning opportunities.

    Some reasons for using online learning:

    a. Improved access and flexibility: People can log in from any computer, at home or at work, at any time of day or night, to fetch lessons or consult learning materials.

    b. Faster delivery and lower cost: For organizations that need to communicate important information that quickly becomes outdated (for example, the latest version of a product), online delivery is almost always much cheaper and faster than having an instructor fly across several countries to meet learners in a classroom for a few hours.

    c. Better control and standardization: In today's international business environment, many organizations operate worldwide. Differences in the knowledge and skills of individual instructors can make the quality of learning vary from place to place; for example, learners in New Delhi may receive training of a different quality from those in New York. Online learning provides consistent information to audiences everywhere.

    d. Enhanced communication and collaboration: Suitable software lets learners communicate with one another, collaborate on projects, and share materials without having to meet face to face.


    2.3.1.2. eLearning standards

    The eLearning industry keeps expanding every day, and the standards needed to create learning content are becoming increasingly complex.

    Before an eLearning convention becomes a standard, it is called a specification. A specification is approved by a widely recognized organization, such as the IEEE.

    Some eLearning standards:

    a. The Dublin Core metadata element set

    The Dublin Core metadata element set is a standard for cross-domain description of information resources. Here, an information resource is defined as anything that can be identified. For Dublin Core applications, a resource is typically an electronic document.

    Dublin Core metadata is used for searching and indexing Web-based metadata. The element set provides a semantic vocabulary, such as Description, Creator, and Date, for describing the key informational characteristics of Internet resources.

    The Dublin Core metadata element set provides 15 elements:

    - Title: The name given to the resource.
    - Creator: The entity responsible for creating the resource, for example a person, an organization, or a service.
    - Subject: The topic of the content of the resource.
    - Description: A description of the content of the resource.
    - Publisher: The entity responsible for making the resource available.
    - Contributor: An entity that has contributed to the content of the resource.
    - Date: The date the resource was created.
    - Type: The nature or genre of the content of the resource.
    - Format: The physical or digital format in which the resource is stored.
    - Identifier: An unambiguous reference to the resource within a given context.
    - Source: A reference to a resource from which this resource is derived.
    - Language: The language of the content of the resource.
    - Relation: A reference to a related resource.
    - Coverage: The extent or scope of the content of the resource.
    - Rights: Information about the rights held in the resource.
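    A Dublin Core description is essentially a small record restricted to these fifteen element names. The sketch below shows one way to hold and check such a record; make_dc_record and to_meta_tags are hypothetical helpers, and the example values (title, date, language of this thesis) are taken from its front matter. The DC.-prefixed meta element names follow the common convention for embedding Dublin Core in HTML.

```python
import html

DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_dc_record(**fields):
    """Build a record, rejecting any name that is not one of the 15 elements."""
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    # Keep the canonical element order regardless of argument order.
    return {name: fields[name] for name in DC_ELEMENTS if name in fields}


def to_meta_tags(record):
    """Render the record as HTML <meta> elements with escaped values."""
    return [
        f'<meta name="DC.{name}" content="{html.escape(str(value), quote=True)}">'
        for name, value in record.items()
    ]


record = make_dc_record(
    language="vi",
    title="Tìm kiếm ngữ nghĩa ứng dụng trên lĩnh vực eDoc",
    date="2005-07",
)
for tag in to_meta_tags(record):
    print(tag)
```

    All elements are optional in this sketch, which mirrors the way a search index typically has to cope with partially described documents.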

    b. LOM (Learning Object Metadata)

    LOM is an eLearning standard currently being developed by the IEEE. The IEEE Learning Technology Standards Committee develops the LOM standard to facilitate the use and reuse of technology-supported learning resources, such as computer-based training and distance learning.

    In an eLearning system, a learning object is anything that can be used, reused, or referenced in technology-supported learning. Learning objects continue to be developed to meet rapidly changing learning needs. The lack of information, or metadata, about learning objects creates many obstacles and limits the ability to manage, discover, and use them.

    LOM addresses this problem by defining a structure for describing a learning object. LOM specifies the syntax and semantics of learning object metadata and defines the attributes needed to describe learning objects fully and adequately.

    Goals of LOM:

    - Enable learners and instructors to search for and evaluate learning objects.
    - Enable learning objects to be shared and exchanged across any technology-supported learning system.
    - Enable learning objects to be developed in units that can be combined or decomposed in an appropriate way.
    - Enable computer agents to compose lessons for learners automatically and dynamically.
    - Rely entirely on standards and address learning objects in open, distributed environments.
    - Enable new technologies to be combined with learning objects.
    - Provide researchers with standards that support the collection of data on the effectiveness of learning objects.

    LOM defines a minimal set of attributes for managing, locating, and evaluating learning objects. The attributes are grouped into 8 categories:

    - General: information about the learning object as a whole.
    - Lifecycle: metadata about the evolution of the object.
    - Technical: a description of the technical characteristics and requirements.
    - Educational: the educational or pedagogical attributes.
    - Rights: ownership and conditions of use.
    - Relation: identifies objects that are related to one another.
    - Annotation: comments, together with the date and author of each comment.
    - Classification: identifiers of other classification systems for the object.

    (The published IEEE LOM standard also defines a ninth category, Meta-Metadata, which describes the metadata record itself.)

    Within each category is an ordered set of data elements whose values are the metadata. For example, the metadata elements related to learning, found in the Educational category, include Typical Age Range, Difficulty, Typical Learning Time, and Interactivity Level.
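    To illustrate how the Educational elements support the first goal of LOM (searching for and evaluating learning objects), here is a minimal sketch. The class and field names, the find_objects function, and the three catalog entries are all hypothetical; real LOM records use a much richer structure and controlled vocabularies.

```python
from dataclasses import dataclass


@dataclass
class Educational:
    """A few elements of the Educational category."""
    typical_age_range: tuple[int, int]   # (minimum age, maximum age)
    difficulty: str                      # e.g. "easy", "medium", "difficult"
    typical_learning_time_min: int       # typical learning time in minutes
    interactivity_level: str             # e.g. "low", "medium", "high"


@dataclass
class LearningObject:
    title: str                # would belong to the General category
    educational: Educational


def find_objects(objects, difficulty, max_minutes):
    """Select objects of a given difficulty that fit in the available time."""
    return [
        o for o in objects
        if o.educational.difficulty == difficulty
        and o.educational.typical_learning_time_min <= max_minutes
    ]


# Hypothetical catalog entries, for illustration only.
catalog = [
    LearningObject("Intro to RDF", Educational((18, 25), "easy", 30, "low")),
    LearningObject("RDF Schema in depth", Educational((18, 25), "difficult", 120, "medium")),
    LearningObject("Dublin Core basics", Educational((18, 25), "easy", 90, "low")),
]

for match in find_objects(catalog, difficulty="easy", max_minutes=60):
    print(match.title)
```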


    c. vCard

    vCard is a standard introduced and developed by the IMC (Internet Mail Consortium). Personal information is often complex and comes in many forms. Several standards currently propose structures for Personal Data Interchange (PDI). The purpose of vCard is to address the need to collect and exchange personal information over various channels, such as telephone, email, or face-to-face conversation.

    The vCard standard is suited to exchanging personal data between applications and systems. The vCard format is completely independent of the method used to transport it: the transport may be a file-system exchange, the public switched network, a wired network, or a wireless network. vCard targets the exchange of personal information. In today's business environment this information is usually exchanged on business cards, and vCard defines it in terms of an electronic business card object.
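    As a rough illustration of the electronic business card idea, the sketch below writes a minimal vCard in the plain-text 3.0 format (BEGIN/END markers, one property per line, CRLF line endings). The make_vcard helper is hypothetical, and it omits the escaping of commas and semicolons that a complete implementation would need.

```python
def make_vcard(full_name, family_name, given_name, email=None, tel=None):
    """Return a minimal vCard 3.0 text block for one person."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{family_name};{given_name};;;",  # structured name: family;given;additional;prefix;suffix
        f"FN:{full_name}",                   # formatted (display) name
    ]
    if email:
        lines.append(f"EMAIL:{email}")
    if tel:
        lines.append(f"TEL:{tel}")
    lines.append("END:VCARD")
    # vCard content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


card = make_vcard("Peng Wang", family_name="Wang", given_name="Peng")
print(card)
```

    Because the result is plain text, the same card can be sent over any of the transports mentioned above, which is the transport independence the standard aims for.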

    d. SCORM (Shareable Content Object Reference Model)

    SCORM defines a model that combines content with a run-time environment for learning objects. It is a reference model that points to a set of related technical specifications designed to meet the requirements of Web-based learning content, including the reusability, accessibility, and interoperability of learning objects.

    e. IMS (Instructional Management Systems)

    IMS is being developed and promoted as an open standard for eLearning activities such as using and sequencing educational content. It extends to general concepts such as learner design and tracking and reporting learner progress, with the aim of exchanging information between different learning systems.


    Goals of IMS:

    - Define technical standards that improve interoperability between applications and services in today's distributed learning environments.
    - Support the incorporation of the IMS specifications into products and services worldwide. Wide adoption of the specifications will allow learning environments and content from multiple authors to work together.

    2.3.2. Overview of eLib

    An eLib (electronic library, also called a digital library) is a virtual library. The term electronic library refers to a collection of networked electronic information resources together with the linking technology and administrative infrastructure that support it. It can be accessed from any networked PC or laptop, anywhere in the world, at any time.

    An eLib stores and indexes tens of thousands of books, newspapers, and journals on all kinds of subjects, such as physics, astronomy, biochemistry, biotechnology, chemistry and chemical engineering, construction equipment, environmental engineering, food science, health and safety, and hygiene, as well as material on biographies, personal profiles, careers, organizations, associations, and travel. Electronic libraries are most widely used in universities and scientific research centers. Naturally, their main users are students, graduate researchers, and scientists.

    Electronic library programs are built on common standards established by major international bodies. Major standards organizations include the W3C (World Wide Web Consortium), ISO (International Organization for Standardization), and NISO (National Information Standards Organization), among others. There are standards for many aspects of storing and accessing electronic information, including information retrieval, interoperability, resource formats, resource identification, and resource description. The following are some of the standards used in eLib that relate to accessing electronic information:

    Information retrieval standards:

    Standards of this kind let different systems communicate, making it easier to discover and access electronic information. For example, the information retrieval standard ISO 23950 (equivalent to ANSI Z39.50) defines a standard way for two computers to communicate and share information. It is designed to support resource discovery and the retrieval of full-text documents, bibliographic data, images, and multimedia. The standard is based on a client-server architecture, is independent of any particular system, and operates entirely over the Internet.

    Z39.50:

    Z39.50 is one of a group of standards produced to make it easier to connect computer systems. The standard specifies the formats and procedures governing the exchange of messages between a client and a server, enabling the user to search remote databases, identify records that meet specified criteria, and retrieve some or all of the identified records. It is concerned specifically with searching for and retrieving information in databases. One of the major advantages of Z39.50 is that it provides uniform access to a large number of diverse information sources.

    Z39.50 recognizes that information retrieval consists of two main parts, selecting information based on some criteria and retrieving that information, and it provides a common language for both activities. Z39.50 standardizes the way the client and the server communicate and interoperate even when there are differences between computer systems, search engines, and databases.

    EDI (Electronic Data Interchange)

    EDI is known as a national information technology standard. In EDI, data that would traditionally be conveyed on paper documents is transmitted or communicated electronically according to established rules and formats. The data associated with each type of functional document, such as a purchase order or an invoice, is transmitted together as one electronic message. The formatted data can be conveyed from the originator to the recipient by telecommunications or physically transported on electronic storage media.

    EDI involves a sequence of messages between two parties, for example a buyer and a seller, either of whom may act as originator or recipient. The messages from buyer to seller could include, for example, the data needed for a request for quotation (RFQ), purchase receipts, shipping notices, and invoices. Vic thc thi c$a EDI yu