Digital Library – XML System Applications

Jian-hua Yeh (葉建華)

Assistant Professor, Department of Computer Science, Aletheia University
[email protected]

Outline

• XML language introduction

• XML server architecture

• XML query language design issues

XML Introduction

• What is XML?

• Why XML?

• The XML power

• XML and the enterprise

What is XML?

• Proposed by W3C at the end of 1996

• SGML-derived

• A meta-language for defining new markup languages

• XML 1.0 Recommendation released in February 1998

• Supported by

– Sun, Microsoft, Netscape, Adobe, ArborText, etc.

What is XML? (2)

• eXtensible Markup Language

• Tag-based

• Open and cross-platform

• Structural data representation

• As data and as document

• Suitable for data exchange

<?xml version="1.0"?>
<invoicecollection>
  <invoice>
    <customer>Wile E. Coyote, Death Valley, CA</customer>
    <annotation>
      Customer asked that we guarantee return rights if these items
      should fail in desert conditions. This was approved by Marty
      Melliore, general manager.
    </annotation>
    <entries n="2">
      <entry quantity="2" total_price="134.00">
        <product maker="ACME" prod_name="screwdriver" price="80.00"/>
      </entry>
      <entry quantity="1" total_price="20.00">
        <product maker="ACME" prod_name="power wrench" price="20.00"/>
      </entry>
    </entries>
  </invoice>
  <invoice>
    <customer>Camp Mertz</customer>
    <entries n="2">
      <entry quantity="2" total_price="32.00">
        <product maker="BSA" prod_name="left-handed smoke shifter" price="16.00"/>
      </entry>
      <entry quantity="1" total_price="13.00">
        <product maker="BSA" prod_name="snipe call" price="13.00"/>
      </entry>
    </entries>
  </invoice>
</invoicecollection>

Why XML?

• HTML is not enough: it has no capability for handling structured data

• Recommended by W3C, an open standard

• The push for enterprise integration

• Breaking up stovepipe systems: from vertical to horizontal integration

• The need for B2B and B2C integration

• Platform independent

Traditional Data Exchange Handling

• Private protocols for stovepipe systems

• Open standards for data exchange

– RPC

– RMI

– CORBA

– COM

New Strategy of Data Exchange

• Text-based

• Tag-oriented

• Self-descriptive

• Document Type Definition (DTD)

XML Details

• Components

– DTD

– XML content

• Processing models

– Event driven model: SAX

• A document is treated as a sequence of events

– Structural model: DOM

• A document is represented as a tree structure
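The difference between the two processing models can be sketched in a few lines. This is only an illustration using Python's standard `xml.sax` and `xml.dom.minidom` modules rather than the Java parsers discussed in this deck; the `EventLogger` class and the tiny `DOC` string are assumptions made up for the example.

```python
import xml.sax
from xml.dom import minidom

# A small document, invented for this sketch.
DOC = "<catalog><cd country='UK'><title>Hide your heart</title></cd></catalog>"


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records SAX events in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # Ignore whitespace-only text between elements.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


# Event-driven model (SAX): the handler sees one event at a time and
# nothing is kept in memory unless the handler stores it.
handler = EventLogger()
xml.sax.parseString(DOC.encode("utf-8"), handler)

# Structural model (DOM): the whole document is built as a tree that
# can be navigated in any direction after parsing.
dom = minidom.parseString(DOC)
title = dom.getElementsByTagName("title")[0]
title_text = title.firstChild.data
parent_name = title.parentNode.tagName

print(handler.events)
print(title_text, parent_name)
```

SAX suits large documents that are read once from start to end; DOM suits documents that need random access or modification, at the cost of holding the whole tree in memory.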

XML Server Introduction

• Why XML server?

– Complies with the enterprise service model: client / middle tier / EIS structure

• Common components can come from third-party software vendors

– XML parser, XSL processor, etc.

XML Server Architecture

XML Server Architecture (2)

• Key aspects

– Client

• PDA, browser, Web server, other XML server, etc.

– Communication protocol

• Email, HTTP, FTP, EJB, RMI, IIOP, COM, etc.

– Key services

– Data object

• Relational database, object data source, etc.

XML Server Components

• Client

• Communication service

• Document handler

• Data object access module

• XML core service

An Operation Example

XML support in Java technology

• XML processing

• Data binding

• Remote communication

• Service registry

• Messaging

Java for XML Processing

• JAXP (Java API for XML Processing)

– SAX (Simple API for XML) parser

• Event-based XML parsing

– DOM (Document Object Model) parser

• Model-based XML parsing

– XSLT (Extensible Stylesheet Language Transformations) processor

• Supports SAX, DOM, and stream-based processing

Java for XML Data Binding

• JAXB (Java Architecture for XML Binding)

– Schema-based

– Validation

– Representing XML content

Java for XML Communication

• JAX-RPC (Java API for XML-based RPC)

– RPC-based Web service

– SOAP-based (Simple Object Access Protocol)

– Discoverable through JAXR (covered later)

Java for XML Registries

• JAXR (Java API for XML Registries)

– Service registration

– Service lookup

Java for XML Messaging

• JAXM (Java API for XML Messaging)

– Message provider

• SAAJ (SOAP with Attachments API for Java)

– Populating messages with attachments

XML Processing, How?

• Locating: XPath

• Querying: XQL, XQuery

• Storage: XMLDB

What is XPath?

• W3C standard

• A syntax for defining parts of an XML document

• Uses paths to define XML elements

• Defines a library of standard functions

• A major element in XSLT

Sample XML

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd country="USA">
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <price>10.90</price>
  </cd>
  <cd country="UK">
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <price>9.90</price>
  </cd>
  <cd country="USA">
    <title>Greatest Hits</title>
    <artist>Dolly Parton</artist>
    <price>9.90</price>
  </cd>
</catalog>

• Path

– /catalog/cd/price

• Function

– /catalog/cd[price>10.80]
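The two expressions above can be tried against the sample catalog. The sketch below uses Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath; this is an assumption of the example, not the tooling used in this deck. The numeric comparison in `[price>10.80]` is not part of that subset, so it is emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# The sample catalog from the slide, written compactly.
CATALOG = (
    '<catalog>'
    '<cd country="USA"><title>Empire Burlesque</title>'
    '<artist>Bob Dylan</artist><price>10.90</price></cd>'
    '<cd country="UK"><title>Hide your heart</title>'
    '<artist>Bonnie Tyler</artist><price>9.90</price></cd>'
    '<cd country="USA"><title>Greatest Hits</title>'
    '<artist>Dolly Parton</artist><price>9.90</price></cd>'
    '</catalog>'
)

root = ET.fromstring(CATALOG)  # root is the <catalog> element

# /catalog/cd/price
# ElementTree paths are relative to the element being searched, so the
# leading "/catalog" is implied by starting from the root element.
prices = [p.text for p in root.findall("cd/price")]

# /catalog/cd[price>10.80]
# Emulated: select every cd, then compare the price as a number.
expensive = [cd for cd in root.findall("cd")
             if float(cd.findtext("price")) > 10.80]
expensive_titles = [cd.findtext("title") for cd in expensive]

print(prices)
print(expensive_titles)
```

A full XPath 1.0 engine would evaluate both expressions directly; the Python filter here only mirrors what the predicate means.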

XPath: The Syntax

Path Syntax: Locating Nodes

• /catalog/cd/price

• //cd

• /catalog/cd/*

• /catalog/*/price

• /*/*/price

• //*
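To see what each of these location paths selects, the following sketch counts the matching nodes in the sample catalog. It assumes Python's standard `xml.etree.ElementTree`, whose path language is a subset of XPath: absolute paths are written relative to the root element, and `//*` is emulated with `iter()` so that the root element itself is included.

```python
import xml.etree.ElementTree as ET

# The sample catalog from the slides, written compactly.
CATALOG = (
    '<catalog>'
    '<cd country="USA"><title>Empire Burlesque</title>'
    '<artist>Bob Dylan</artist><price>10.90</price></cd>'
    '<cd country="UK"><title>Hide your heart</title>'
    '<artist>Bonnie Tyler</artist><price>9.90</price></cd>'
    '<cd country="USA"><title>Greatest Hits</title>'
    '<artist>Dolly Parton</artist><price>9.90</price></cd>'
    '</catalog>'
)

root = ET.fromstring(CATALOG)  # root is the <catalog> element

counts = {
    # all price elements of all cd elements of catalog
    "/catalog/cd/price": len(root.findall("cd/price")),
    # all cd elements anywhere in the document
    "//cd": len(root.findall(".//cd")),
    # all child elements of all cd elements
    "/catalog/cd/*": len(root.findall("cd/*")),
    # all price elements that are grandchildren of catalog
    "/catalog/*/price": len(root.findall("*/price")),
    # all price elements that have exactly two ancestors
    "/*/*/price": len(root.findall("*/price")),
    # every element in the document, including the root
    "//*": len(list(root.iter())),
}

for path, n in counts.items():
    print(path, n)
```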

Path Syntax: Selecting Branches

• /catalog/cd[1]

• /catalog/cd[last()]

• /catalog/cd[price]

• /catalog/cd[price=10.90]

• /catalog/cd[price=10.90]/price
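The branch-selecting predicates above can be exercised on the sample catalog as follows. This is a sketch that assumes Python's standard `xml.etree.ElementTree`; its predicate support is limited, so `[price=10.90]` is written as the string comparison `[price='10.90']`, which matches the same element here only because the text in the document is spelled exactly "10.90".

```python
import xml.etree.ElementTree as ET

# The sample catalog from the slides, written compactly.
CATALOG = (
    '<catalog>'
    '<cd country="USA"><title>Empire Burlesque</title>'
    '<artist>Bob Dylan</artist><price>10.90</price></cd>'
    '<cd country="UK"><title>Hide your heart</title>'
    '<artist>Bonnie Tyler</artist><price>9.90</price></cd>'
    '<cd country="USA"><title>Greatest Hits</title>'
    '<artist>Dolly Parton</artist><price>9.90</price></cd>'
    '</catalog>'
)

root = ET.fromstring(CATALOG)  # root is the <catalog> element

# /catalog/cd[1]: the first cd child (positions start at 1)
first_title = root.findall("cd[1]")[0].findtext("title")

# /catalog/cd[last()]: the last cd child
last_title = root.findall("cd[last()]")[0].findtext("title")

# /catalog/cd[price]: every cd that has a price child
with_price = root.findall("cd[price]")

# /catalog/cd[price=10.90]: every cd whose price is 10.90
# (string comparison in ElementTree, numeric in full XPath)
matched_titles = [cd.findtext("title")
                  for cd in root.findall("cd[price='10.90']")]

# /catalog/cd[price=10.90]/price: the price elements of those cds
matched_prices = [p.text for p in root.findall("cd[price='10.90']/price")]

print(first_title, last_title, len(with_price))
print(matched_titles, matched_prices)
```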

Path Syntax: Selecting Several Paths

• /catalog/cd/title | /catalog/cd/artist

• //title | //artist

• //title | //artist | //price

Path Syntax: Selecting Attributes

• //@country

• //cd[@country]

• //cd[@*]

• //cd[@country='UK']
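The attribute expressions above behave as follows on the sample catalog. The sketch assumes Python's standard `xml.etree.ElementTree`, which can test attributes in predicates but cannot return attribute nodes or evaluate `@*`; those two cases are emulated in Python and marked as such in the comments.

```python
import xml.etree.ElementTree as ET

# The sample catalog from the slides, written compactly.
CATALOG = (
    '<catalog>'
    '<cd country="USA"><title>Empire Burlesque</title>'
    '<artist>Bob Dylan</artist><price>10.90</price></cd>'
    '<cd country="UK"><title>Hide your heart</title>'
    '<artist>Bonnie Tyler</artist><price>9.90</price></cd>'
    '<cd country="USA"><title>Greatest Hits</title>'
    '<artist>Dolly Parton</artist><price>9.90</price></cd>'
    '</catalog>'
)

root = ET.fromstring(CATALOG)  # root is the <catalog> element

# //@country: all country attributes (emulated: walk every element
# and collect the attribute value where it exists)
countries = [el.get("country") for el in root.iter()
             if "country" in el.attrib]

# //cd[@country]: all cd elements that have a country attribute
cds_with_country = root.findall(".//cd[@country]")

# //cd[@*]: all cd elements that have any attribute (emulated)
cds_with_any_attr = [cd for cd in root.iter("cd") if cd.attrib]

# //cd[@country='UK']: all cd elements whose country is "UK"
uk_titles = [cd.findtext("title")
             for cd in root.findall(".//cd[@country='UK']")]

print(countries)
print(len(cds_with_country), len(cds_with_any_attr), uk_titles)
```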

XPath: Location paths

Formal Syntax

• axisname::nodetest[predicate]

– child::cd[price=9.90]

Axes and Node Tests

Abbreviated Syntax

Location Paths Examples

XPath: The expressions

Expression Types

• Numerical expressions

• Equality expressions

• Relational expressions

• Boolean expressions

Numerical Expressions

Equality Expressions

Relational Expressions

Boolean Expressions

XPath: The functions

XPath Function Library

• Node Set Functions

• String Functions

• Number Functions

• Boolean Functions

Node Set Functions

String Functions

Number Functions

Boolean Functions

XQL: XML Query Language

• XQL problem domains

• Queries, search contexts, and result sets

• Result sets vs. result documents

XQL Introduction

• Developers

– Texcel, webMethods, Microsoft

• Traditional query processing

• Features of XML documents

Traditional Query Processing

• Structured query

– For relational database

– For object-oriented database

• Unstructured full-text query

– For text documents

Features of XML Documents

• As documents

• As data sources

• With structural features

XQL Problem Domains

• Queries within a single document (in a browser or editor)

• Queries in collections of documents (document assembly in an XML repository)

• Addressing within or across documents

• XSL Patterns

The Role of a Query Language

• Different problem domains have different inputs, outputs, and processing models

• What they have in common

– assertions about name, content, value, and relationship

• Tradeoff

– design a separate query language for each problem domain

– design one general-purpose query language for all problem domains

SQL vs. XQL

SQL: The database is a set of tables.
XQL: The database is a set of one or more XML documents.

SQL: Queries are done in SQL, a query language that uses tables as a basic model.
XQL: Queries are done in XQL, a query language that uses the structure of XML as a basic model.

SQL: The FROM clause determines the tables which are examined by the query.
XQL: A query is given a set of input nodes from one or more documents, and examines those nodes and their descendants.

SQL: The result of a query is a table containing a set of rows.
XQL: The result of a query is a set of XML document nodes, which can be wrapped in a root node to create a well-formed XML document.

Simple Query Example

 

Search Context

<novel>
  <front>
    <title>The Heart of Darkness</title>
    <author>Joseph Conrad</author>
  </front>
</novel>

Query

novel

Result Set

<novel>
  <front>
    <title>The Heart of Darkness</title>
    <author>Joseph Conrad</author>
  </front>
</novel>

Why result documents?

• An XML document is easily parsed with a standard XML parser, so it can be transmitted as a single ASCII stream and parsed by the receiving application.

• An XML document can be displayed in a standard XML browser.

• An XML document can be stored in an XML repository.

• An XML document can be passed on to an XSL processor to perform transformations or do formatting.
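A result set containing several nodes is not a well-formed document by itself, because it has no single root. The sketch below wraps a result set in a root node and checks that a standard parser accepts the outcome. It assumes Python's standard `xml.etree.ElementTree`; the helper `to_result_document` and the wrapper element name `result` are hypothetical names chosen for the example, and a simple path lookup stands in for an XQL query, since no XQL engine is assumed.

```python
import copy
import xml.etree.ElementTree as ET

# The CD catalog used in the XPath slides, written compactly.
CATALOG = (
    '<catalog>'
    '<cd country="USA"><title>Empire Burlesque</title>'
    '<artist>Bob Dylan</artist><price>10.90</price></cd>'
    '<cd country="UK"><title>Hide your heart</title>'
    '<artist>Bonnie Tyler</artist><price>9.90</price></cd>'
    '<cd country="USA"><title>Greatest Hits</title>'
    '<artist>Dolly Parton</artist><price>9.90</price></cd>'
    '</catalog>'
)


def to_result_document(nodes, wrapper_tag="result"):
    """Wrap result nodes in one root element and serialize to text.

    Hypothetical helper; the wrapper tag name is an assumption.
    """
    wrapper = ET.Element(wrapper_tag)
    for node in nodes:
        # Copy each node so the search context is left untouched.
        wrapper.append(copy.deepcopy(node))
    return ET.tostring(wrapper, encoding="unicode")


root = ET.fromstring(CATALOG)

# Stand-in for a query whose result set is three <title> nodes.
result_set = root.findall(".//title")

# One root, so the text is a well-formed XML document ...
result_document = to_result_document(result_set)

# ... which any standard XML parser on the receiving side can read.
reparsed = ET.fromstring(result_document)

print(result_document)
```

Once the result is a document, it can be transmitted, displayed, stored, or handed to an XSL processor like any other XML document.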

XML Database

What is a Native XML Database?

• Defines a (logical) model for an XML document

– The database is specialized for storing XML data

• Has an XML document as its fundamental unit of (logical) storage

– Documents in, documents out

• Is not required to have any particular underlying physical storage model

– May not actually be a standalone database at all

Native XML Database Features

• XML storage

• Collections

– Allow you to query and manipulate the documents in a collection as a set

• Queries

– XPath, XQL

• Updates

– Update portions of documents (XUpdate)

Native XML Database Products

• eXist: proprietary/relational

• ozone: object-oriented

• Tamino: proprietary/relational

• Xindice: proprietary

• X-Hive: object-oriented/relational