XML query

Introduction

• An XML document can represent almost anything, and users of an XML query language expect it to perform useful queries on whatever they have stored in XML.

• XQuery is based on the structure of XML and leverages this structure to provide query capabilities for the same range of data that XML stores.

• XQuery is defined in terms of the XQuery 1.0 and XPath 2.0 Data Model [XQ-DM], which represents the parsed structure of an XML document as an ordered, labeled tree in which nodes have identity and may be associated with simple or complex types.

• XQuery is a functional language

XML vs. Relational Data

Relation:

name   phone
John   3634
Sue    6343
Dick   6363

… in XML:

{ row: { name: "John", phone: 3634 },
  row: { name: "Sue", phone: 6343 },
  row: { name: "Dick", phone: 6363 } }

[Figure: the same data drawn as a tree. Three row nodes each have a name child and a phone child, with leaf values "John"/3634, "Sue"/6343, and "Dick"/6363.]

Relational to XML Data

• A relation instance is basically a tree with:
– Unbounded fanout at level 1 (i.e., any # of rows)
– Fixed fanout at level 2 (i.e., fixed # of fields)

• XML data is essentially an arbitrary tree
– Unbounded fanout at all nodes/levels
– Any number of levels
– Variable # of children at different nodes, variable path lengths
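The contrast above can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration: the helper name relation_to_xml and the root element name table are assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

def relation_to_xml(columns, rows):
    """Encode a relation as a two-level tree: one <row> per tuple, one child per column."""
    table = ET.Element("table")  # hypothetical root element name
    for values in rows:
        row = ET.SubElement(table, "row")
        for column, value in zip(columns, values):
            ET.SubElement(row, column).text = str(value)
    return table

phone_book = relation_to_xml(
    ["name", "phone"],
    [("John", 3634), ("Sue", 6343), ("Dick", 6363)],
)

# Level 1 fanout follows the number of rows; level 2 fanout is fixed by the columns.
level1_fanout = len(phone_book)
level2_fanouts = {len(row) for row in phone_book}
print(level1_fanout, level2_fanouts)
print(ET.tostring(phone_book, encoding="unicode"))
```

Every row has the same number of children because the column list is fixed; an arbitrary XML tree carries no such guarantee.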

Query Language for XML

• Must be high-level; "SQL for XML"
• Must conform to XML Schema
– But also work in absence of schema info
• Support simple and complex/nested datatypes
• Support universal and existential quantifiers, aggregation
• Operations on sequences and hierarchies of document structures
• Capability to transform and create XML structures

XQuery

• Influenced by XML-QL, Lorel, Quilt, YATL
– Also, XPath and XML Schema

• Reads a sequence of XML fragments or atomic values and returns a sequence of XML fragments or atomic values
– Inputs/outputs are objects defined by the XQuery data model, rather than strings in XML syntax

Overview of XQuery

• Path expressions
• Element constructors
• FLWOR ("flower") expressions
– Several other kinds of expressions as well, including conditional expressions, list expressions, quantified expressions, etc.

• Expressions evaluated w.r.t. a context:
– Context item (current node)
– Context position (in sequence being processed)
– Context size (of the sequence being processed)
– Context also includes namespaces, variables, functions, date, etc.

Path Expressions

Examples:
• Bib/paper
• Bib/book/publisher
• Bib/paper/author/lastname

Given an XML document, the value of a path expression p is a sequence of objects (nodes), returned in document order.

Path Expression Examples

Doc = [Figure: a tree of object identifiers rooted at &o1. The Bib node has paper, book, and paper children (&o12, &o24, &o29); their descendants include author, title, year, publisher, page, references, firstname, and lastname nodes, with leaf values such as "Serge", "Abiteboul", 1997, "Victor", and "Vianu".]

Bib/paper = <&o12,&o29>
Bib/book/publisher = <&o51>
Bib/paper/author/lastname = <&o71,&206>

Note that order of elements matters!
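The three path expressions can be tried out with Python's xml.etree.ElementTree, as a rough analogue rather than an XQuery engine. The small document below is an assumption made up for illustration; only the element names follow the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography: element names follow the slide, the content is illustrative.
BIB = ET.fromstring("""
<Bib>
  <paper><author><lastname>Abiteboul</lastname></author></paper>
  <book><publisher>Example Press</publisher></book>
  <paper><author><lastname>Vianu</lastname></author></paper>
</Bib>
""")

# ElementTree paths are relative to the element they are called on,
# so the leading "Bib" step of the slide's paths is the root itself.
papers = BIB.findall("paper")
publishers = [p.text for p in BIB.findall("book/publisher")]
lastnames = [n.text for n in BIB.findall("paper/author/lastname")]

print(len(papers), publishers, lastnames)
```

The last names come back in the order in which they occur in the document, which is the ordering the note above refers to.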

Element Construction

• An XQuery expression can construct new values or structures

• Example: Consider the path expressions from the previous slide.
– Each of them returns a newly constructed sequence of elements
– Key point is that we don’t just return existing structures or atomic values; we can re-arrange them as we wish into new structures

Data Model

• In the XQuery data model, every document is represented as a tree of nodes. The kinds of nodes that may occur are: document, element, attribute, text, namespace, processing instruction, and comment.

• An item is a single node or atomic value. A series of items is known as a sequence. In XQuery, every value is a sequence.

Literals and comments

• A comment is delimited by (: and :), for example: (: Hello World :)

doc() function

• Returns the entire document: doc("books.xml")

Locating nodes

• A path expression consists of a series of one or more steps, separated by a slash, /, or double slash, //.

• doc("books.xml")/bib/book

• doc("books.xml")//book
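The difference between the / step and the // step can be illustrated with Python's xml.etree.ElementTree, which supports a small XPath subset. The document below is a made-up stand-in for books.xml, with one book nested deeper than the others; the series element is an assumption added only to show the difference.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for books.xml; <series> exists only to nest one book deeper.
root = ET.fromstring("""
<bib>
  <book><title>A</title></book>
  <book><title>B</title></book>
  <series><book><title>C</title></book></series>
</bib>
""")

# "/" step: only <book> elements that are children of the root <bib>.
child_titles = [b.findtext("title") for b in root.findall("book")]

# "//" step: <book> elements at any depth below the root.
descendant_titles = [b.findtext("title") for b in root.findall(".//book")]

print(child_titles)
print(descendant_titles)
```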

Predicates

• Predicates are Boolean conditions that select a subset of the nodes computed by a step expression.

• XQuery uses square brackets around predicates.

• For instance, the following query returns only authors for which last="Stevens" is true:

doc("books.xml")/bib/book/author[last="Stevens"]

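• If a predicate contains a single numeric value, it is treated like a subscript. For instance, the following expression returns the first author of each book:

doc("books.xml")/bib/book/author[1]

• The predicate applies to the last step only. Parenthesizing the path applies it to the whole sequence instead, so the following expression returns only the first author in the entire document:

(doc("books.xml")/bib/book/author)[1]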
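Both kinds of predicate can be tried with Python's xml.etree.ElementTree, whose XPath subset covers child-text tests and positional tests. This is a sketch on made-up data, not the books.xml of the slides. It also shows that a positional predicate on a step counts per parent (the first author of each book), whereas indexing the whole result picks a single element (the first author overall).

```python
import xml.etree.ElementTree as ET

# Hypothetical data; the names are illustrative only.
root = ET.fromstring("""
<bib>
  <book>
    <author><last>Stevens</last><first>W.</first></author>
    <author><last>Gray</last><first>J.</first></author>
  </book>
  <book>
    <author><last>Abiteboul</last><first>Serge</first></author>
  </book>
</bib>
""")

# Boolean predicate: authors whose <last> child has the text "Stevens".
stevens = [a.findtext("first") for a in root.findall("book/author[last='Stevens']")]

# Positional predicate on the step: first <author> within each <book>.
first_per_book = [a.findtext("last") for a in root.findall("book/author[1]")]

# Indexing the whole result sequence: first <author> in the document.
first_overall = root.findall("book/author")[0].findtext("last")

print(stevens, first_per_book, first_overall)
```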

Creating Nodes

document {
  <book year="1977">
    <title>Harold and the Purple Crayon</title>
    <author><last>Johnson</last><first>Crockett</first></author>
    <publisher>HarperCollins Juvenile Books</publisher>
    <price>14.95</price>
  </book>
}

<titles count="{ count(doc('books.xml')//title) }"> { doc("books.xml")//title } </titles>
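The constructor above mixes literal markup with computed parts: the count attribute and the element content both come from evaluating expressions. A rough Python analogue with xml.etree.ElementTree, on a small made-up document, might look like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for books.xml.
doc = ET.fromstring("""
<bib>
  <book><title>TCP/IP Illustrated</title></book>
  <book><title>Data on the Web</title></book>
</bib>
""")

titles = doc.findall(".//title")

# New <titles> element whose attribute value and content are both computed.
# XQuery copies the selected nodes into the new element; ElementTree reuses the same objects.
result = ET.Element("titles", count=str(len(titles)))
result.extend(titles)

print(ET.tostring(result, encoding="unicode"))
```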

FLWOR Expressions

• FOR-LET-WHERE-ORDERBY-RETURN = FLWOR

FOR / LET Clauses
  ↓ List of tuples
WHERE Clause
  ↓ List of tuples
ORDERBY/RETURN Clause
  ↓ Instance of XQuery data model

FOR vs. LET

• for $x in list-expr
– Binds $x in turn to each value in the list expr

• let $x := list-expr
– Binds $x to the entire list expr
– Useful for common sub-expressions and for aggregations

FOR vs. LET: Example

for $x in doc("bib.xml")/bib/book
return <result>{ $x }</result>

Returns:
<result> <book>...</book> </result>
<result> <book>...</book> </result>
<result> <book>...</book> </result>
...

Notice that the result has several elements.

let $x := doc("bib.xml")/bib/book
return <result>{ $x }</result>

Returns:
<result> <book>...</book> <book>...</book> <book>...</book> ... </result>

Notice that the result has exactly one element.
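The same distinction can be mimicked in Python, as an analogy rather than a translation: iterating binds a variable to each book in turn, while a plain assignment binds it once to the whole list. The helper name wrap and the tiny document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring("<bib><book>a</book><book>b</book><book>c</book></bib>")
books = bib.findall("book")

def wrap(items):
    """Build a new <result> element containing the given elements."""
    result = ET.Element("result")
    result.extend(items)
    return result

# "for": the variable is bound to each book in turn, giving one <result> per book.
for_results = [wrap([x]) for x in books]

# "let": the variable is bound once to the whole sequence, giving a single <result>.
let_result = wrap(books)

print(len(for_results), len(let_result))
```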

XQuery Example 1

Find the titles of all books published after 1995:

for $x in doc("bib.xml")/bib/book
where $x/year > 1995
return $x/title

Result:
<title> abc </title>
<title> def </title>
<title> ghi </title>
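The for / where / return pipeline maps closely onto a Python list comprehension. The sketch below uses xml.etree.ElementTree and a made-up bib document, so the data and resulting titles are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><year>1998</year></book>
  <book><title>old</title><year>1990</year></book>
  <book><title>def</title><year>2001</year></book>
</bib>
""")

titles = [
    x.findtext("title")                  # return $x/title
    for x in bib.findall("book")         # for $x in .../bib/book
    if int(x.findtext("year")) > 1995    # where $x/year > 1995
]

print(titles)
```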

XQuery Example 2

For each author of a book by Morgan Kaufmann, list all books she published:

for $a in distinct-values(doc("bib.xml")/bib/book[publisher = "Morgan Kaufmann"]/author)
return
  <result>
    <author>{ $a }</author>
    {
      for $t in doc("bib.xml")/bib/book[author = $a]/title
      return $t
    }
  </result>

distinct-values = a function that eliminates duplicates (after converting inputs to atomic values). Because it returns atomic values rather than nodes, the query rebuilds the author element around $a.

Results for Example 2

<result>
  <author> Jones </author>
  <title> abc </title>
  <title> def </title>
</result>
<result>
  <author> Smith </author>
  <title> ghi </title>
</result>

Observe how nested structure of result elements is determined by the nested structure of the query.
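The nesting can be mirrored in Python with an outer loop over distinct authors and an inner loop over their books. This is a sketch with xml.etree.ElementTree on made-up data chosen to reproduce the shape of the results above. The first-seen ordering of authors is a choice made here, not something the query guarantees.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><author>Jones</author><publisher>Morgan Kaufmann</publisher></book>
  <book><title>def</title><author>Jones</author><publisher>Other Press</publisher></book>
  <book><title>ghi</title><author>Smith</author><publisher>Morgan Kaufmann</publisher></book>
</bib>
""")
books = bib.findall("book")

# Outer loop source: distinct author names over Morgan Kaufmann books.
mk_authors = list(dict.fromkeys(
    a.text
    for b in books if b.findtext("publisher") == "Morgan Kaufmann"
    for a in b.findall("author")
))

results = []
for name in mk_authors:
    result = ET.Element("result")
    ET.SubElement(result, "author").text = name
    # Inner loop: every title by this author, regardless of publisher.
    for b in books:
        if name in [a.text for a in b.findall("author")]:
            ET.SubElement(result, "title").text = b.findtext("title")
    results.append(result)

summary = [(r.findtext("author"), [t.text for t in r.findall("title")]) for r in results]
print(summary)
```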

XQuery Example 3

count = (aggregate) function that returns the number of elements

<big_publishers>
{
  for $p in distinct-values(doc("bib.xml")//publisher)
  let $b := doc("bib.xml")/bib/book[publisher = $p]
  where count($b) > 100
  return <publisher>{ $p }</publisher>
}
</big_publishers>

For each publisher p:
– Let the list of books published by p be b
– Count the # of books in b, and return p if the count is greater than 100
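The group-then-count pattern can be sketched in Python with collections.Counter. The function name big_publishers, the threshold parameter, and the sample data are assumptions for illustration; the demo uses a threshold of 1 instead of 100 so that a three-book document produces output.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def big_publishers(bib, threshold):
    """Return a <big_publishers> element listing publishers with more than `threshold` books."""
    # Group books by publisher name and count each group (books without a publisher are skipped).
    counts = Counter(
        b.findtext("publisher")
        for b in bib.findall("book")
        if b.find("publisher") is not None
    )
    result = ET.Element("big_publishers")
    for name, n in counts.items():
        if n > threshold:
            ET.SubElement(result, "publisher").text = name
    return result

# Hypothetical stand-in for bib.xml.
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><publisher>Morgan Kaufmann</publisher></book>
  <book><title>def</title><publisher>Morgan Kaufmann</publisher></book>
  <book><title>ghi</title><publisher>Other Press</publisher></book>
</bib>
""")

big = [p.text for p in big_publishers(bib, threshold=1)]
print(big)
```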

XQuery Example 4

Find books whose price is higher than the average:

let $a := avg(doc("bib.xml")/bib/book/price)
for $b in doc("bib.xml")/bib/book
where $b/price > $a
return $b
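Here the let clause computes one aggregate value up front and the for clause then compares each book against it. A Python sketch of the same idea, using xml.etree.ElementTree and statistics.mean on made-up prices:

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical stand-in for bib.xml.
bib = ET.fromstring("""
<bib>
  <book><title>abc</title><price>10.00</price></book>
  <book><title>def</title><price>20.00</price></book>
  <book><title>ghi</title><price>60.00</price></book>
</bib>
""")
books = bib.findall("book")

# let: bind the average once, over the whole sequence of prices.
average = mean(float(b.findtext("price")) for b in books)

# for / where / return: keep the books priced above that average.
above_average = [b.findtext("title") for b in books if float(b.findtext("price")) > average]

print(average, above_average)
```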

FLWOR Expressions

for $b in doc("books.xml")//book
where $b/@year = "2000"
return $b/title