XML (1).ppt

  • 8/10/2019 XML (1).ppt


    Looking Back

    Tightly coupled systems and communication

    Proprietary, closed protocols and methods

    Data sharing between 3rd party solutions unwieldy

    Non-extensible solutions


    XML!

    XML technologies introduced:

    XML 1.0 - Document Basics

    XML Schemata

    XSLT: Style sheets and Transformations

    .NET & XML: The System.Xml Namespace


    XML 1.0 - Document Basics

    What is XML?

    XML Tags and Tag Sets

    Components of an XML Document

    Document Instance

    XML Document by Example

    The XML Parser


    What is XML? 2/2

    Designed for describing and interchanging data

    Data is logically structured

    Human readable, writable and understandable text file!

    Easy to Parse; Easy to Read; and Easy to Write!

    Metadata: Data that describes data; data with semantics

    Looks like HTML, but it isn't!

    Uses tags to delimit data and create structure

    Does not specify how to display the data


    XML Tag-Sets

    Begin with <someTag> and end with </someTag>

    Can have an empty element: <someTag />

    Exceptions are:

    XML document declaration: <?xml ... ?>

    Comments: <!-- ... -->

    The document type declaration: <!DOCTYPE [ ... ]>

    Definition of document elements in an internal DTD: <!ELEMENT ... >, <!ATTLIST ... >, etc

    Promote logical structuring of documents and data

    User definable

    Create hierarchically nested structure


    Components of an XML Document 1/3

    XML Processing Instruction

    Document Type Declaration

    Document Instance


    Components of an XML Document 2/3

    XML Processing Instruction

    version information

    encoding type: UTF-8, UTF-16, ISO-10646-UCS-2, etc

    standalone declaration; indicates whether the document depends on external markup declarations

    Namespace declaration(s), Processing Instructions (for applications), etc


    Components of an XML Document 3/3

    Document Type Declaration. Two types:

    An Internal declaration

    An External reference

    Document Instance

    This is the XML document instance

    Read as: the "XML-ized" data


    Document Instance: The Markup

    Document Root Element

    Required if a document type declaration exists

    Must have the same name as the declaration

    Elements

    Can contain other elements

    Can have attributes assigned to them

    May or may not have a value

    Attributes

    Properties that are assigned to elements

    Provide additional element information
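
    The element and attribute roles listed above can be seen by parsing a small document and printing what each element carries. This is a minimal sketch in Python (the deck's own snippets are C#); the sample data, the AddrType value and the empty Note element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names follow the CustomerOrder example.
xml_text = """
<Customer>
  <Person>
    <FName>Olaf</FName>
    <LName>Smith</LName>
  </Person>
  <Address AddrType="billing">Hauptstrasse 55, D-81671 Munich</Address>
  <Note/>
</Customer>
"""

root = ET.fromstring(xml_text)


def describe(element, depth=0):
    """Return one line per element: name, attributes and text value (if any)."""
    text = (element.text or "").strip()
    lines = [f"{'  ' * depth}{element.tag} attrs={element.attrib} value={text!r}"]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


summary = describe(root)
print("\n".join(summary))
```

    Customer and Person only contain other elements, FName and LName have a text value, Address has both a value and an attribute, and Note has neither.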


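    XML By Example: A Document

    <!DOCTYPE CustomerOrder
      SYSTEM "http://www.myco.com/dtd/order.dtd">
    <CustomerOrder>
      <Customer>
        <Person>
          <FName>Olaf</FName>
          <LName>Smith</LName>
        </Person>
        <Address>91 Park So, New York, NY 10018</Address>
        <Address>Hauptstrasse 55, D-81671 Munich</Address>
      </Customer>
      <Orders>
        <OrderNo>10</OrderNo>
        <ProductNo>100</ProductNo>
        <ProductNo>200</ProductNo>
      </Orders>
      <!-- More <Orders> elements ... -->
    </CustomerOrder>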


    XML Data + DTD

    [Diagram: two XML data samples ("Some 100 101" and "Some Thing") are checked against a DTD; one is marked "Valid", the other "Not Valid!"]


    What's a DTD?

    Document Type Definition (DTD)

    Defines the syntax, grammar & semantics

    Defines the document structure

    What Elements, Attributes, Entities, etc are permitted?

    How are the document elements related & structured?

    Referenced by or defined in XML documents, but it's not XML!

    Enables validation of XML documents using an XML Parser

    Can be referenced by more than one XML document

    DTDs may reference other DTDs


    DTD By Example

    http://www.myco.com/dtd/order.dtd

    <!DOCTYPE CustomerOrder [
      <!ELEMENT CustomerOrder (Customer, Orders*) >
      <!ELEMENT Customer (Person, Address+) >
      <!ELEMENT Person (FName, LName) >
      <!ELEMENT FName (#PCDATA) >
      <!ELEMENT LName (#PCDATA) >
      <!ELEMENT Address (#PCDATA) >
      <!ATTLIST Address
        AddrType ( billing | shipping | home ) "shipping" >
      <!ELEMENT Orders (OrderNo, ProductNo+) >
      <!ELEMENT OrderNo (#PCDATA) >
      <!ELEMENT ProductNo (#PCDATA) >
    ]>


    XML Parser in Action!

    [Diagram: an XML source document and an XML Schema or DTD go into the XML parser; a validated XML document comes out and is passed to the browser or application.]


    The XML Parser: What is it?

    Used to Process an XML Document

    Reads, parses & interprets the DTD and XML document

    Performs substitutions, validation or additional processing

    Knows the XML language rules and can determine:

    Is the document Well-Formed?

    Is it Valid?

    Creates a Document Object Model (DOM) of the instance

    Provides programmatic access to the DOM or instance
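
    The "Is the document Well-Formed?" question can be put to any XML parser. The sketch below uses Python's standard-library ElementTree parser, which is non-validating: it reports well-formedness errors but does not check a document against a DTD or schema. The helper name is_well_formed and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


good = "<Orders><OrderNo>10</OrderNo><ProductNo>100</ProductNo></Orders>"
bad = "<Orders><OrderNo>10</Orders>"  # OrderNo is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

    Checking validity (the second question) needs a validating parser plus the DTD or schema, which this sketch deliberately leaves out.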


    What is the DOM?

    DOM stands for Document Object Model

    Programming interface for HTML & XML documents

    An in-memory representation of a document

    Defines the document structure through an object model

    Tree-view of a document

    Nodes, elements and attributes, text elements, etc

    W3C defined the DOM Level 1 and Level 2 Core

    http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/

    http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/


    Generating The DOM

    [Diagram: the parser reads an XML document and produces a DOM tree: a root element with child elements, each holding a text node.]
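
    A parser turning a document into a tree of element and text nodes can be sketched with Python's standard-library minidom, which implements a subset of the W3C DOM. The sample document and the helper name walk are assumptions made for illustration.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<Person><FName>Olaf</FName><LName>Smith</LName></Person>"
)


def walk(node, depth=0):
    """Collect (depth, node kind, name-or-text) for every node in the tree."""
    if node.nodeType == node.ELEMENT_NODE:
        rows = [(depth, "element", node.tagName)]
    elif node.nodeType == node.TEXT_NODE:
        rows = [(depth, "text", node.data)]
    else:
        rows = [(depth, "other", node.nodeName)]
    for child in node.childNodes:
        rows.extend(walk(child, depth + 1))
    return rows


tree = walk(doc.documentElement)
for depth, kind, label in tree:
    print("  " * depth + f"{kind}: {label}")
```

    The root element sits at depth 0, its child elements at depth 1, and each text value is a separate text node at depth 2.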


    Where Do You Find XML Parsers?

    Transparently built into XML enabled products

    Internet Explorer, SQL Server 2000, etc

    All over the Internet!

    Microsoft XML Parser

    http://msdn.microsoft.com/xml/general/xmlparser.asp

    IBM/Apache Xerces

    http://xml.apache.org

    http://alphaworks.ibm.com


    XML Schema

    What's a Schema?

    Schema vs. DTDs

    Datatypes & Structure


    XML Documents + XML Schema

    [Diagram: two XML data samples ("Some 100 101" and "Some Thing") are checked against an XML Schema; one is marked "Valid", the other "Not Valid!"]


    What's a Schema?

    Webster's Collegiate Dictionary defines it as:

    A diagrammatic presentation; a structured framework

    The XML world defines it as:

    A structured framework for your XML Documents!

    A definition language - with its own syntax & grammar

    A means to structure data and enhance it with semantics!

    Best of all: It's an alternative to the DTD!

    Composed of two parts:

    Structures: http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/

    Datatypes: http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/


    Schema vs. DTDs

    Both are XML document definition languages

    Schemata are written using XML

    Unlike DTDs, XML Schemata are extensible, like XML!

    More verbose than DTDs but easier to read & write


    Datatypes & Structure

    Defining datatypes

    The simple or primitive datatypes

    Based on (or derived from) the Schema datatypes

    Complex types

    Facets

    Declaring data types

    by example


    XML Schema Datatypes

    Two kinds of datatypes: Built-in and User-defined

    Built-in

    Primitive Datatypes

    string, double, duration, date, etc

    Derived Datatypes: normalizedString, integer, byte, etc

    Derived from the primitive types

    Example: integer is derived from decimal

    User-defined

    Derived from built-in or other user-defined datatypes


    The Simple Type: <simpleType>

    The Simplest Type Declaration:

    Based on a primitive or the derived built-in datatypes

    Cannot contain sub-elements or attributes

    Can declare constraining properties (facets)

    minLength, maxLength, length, etc

    May be used as base type of a complexType
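
    A simple type restricted by facets might look like the schema fragment embedded in the sketch below. The type name NameType and the 1 to 30 character limits are hypothetical. The Python code only reads the facets out of the fragment and applies them by hand; it is not a schema validator.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
NS = {"xsd": XSD}

# Hypothetical simple type: a string restricted to 1..30 characters.
schema_text = f"""
<xsd:schema xmlns:xsd="{XSD}">
  <xsd:simpleType name="NameType">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="1"/>
      <xsd:maxLength value="30"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
"""

schema = ET.fromstring(schema_text)
restriction = schema.find("xsd:simpleType/xsd:restriction", NS)

# Map each facet's local name to its numeric value.
facets = {f.tag.split("}")[1]: int(f.get("value")) for f in restriction}


def satisfies(value):
    """Hand-rolled check of the two length facets declared above."""
    return facets["minLength"] <= len(value) <= facets["maxLength"]


print(restriction.get("base"), facets)
print(satisfies("Olaf"), satisfies(""))
```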


    The Complex Type: <complexType>

    Used to define a new complex type

    May be based on simple or existing complexTypes

    May declare elements or element references

    May declare attributes or reference attribute groups


    Defining a complexType By Example

    <!-- AddrType attribute not shown -->
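
    The slide's original schema listing did not survive, so the fragment below is a stand-in rather than the author's example: a hypothetical PersonType and CustomerType modelled on the CustomerOrder DTD, with the AddrType attribute likewise left out. The Python code parses the fragment and lists the content model of each complex type.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
NS = {"xsd": XSD}

# Hypothetical complex types mirroring the Customer/Person structure.
schema_text = f"""
<xsd:schema xmlns:xsd="{XSD}">
  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element name="FName" type="xsd:string"/>
      <xsd:element name="LName" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="CustomerType">
    <xsd:sequence>
      <xsd:element name="Person" type="PersonType"/>
      <xsd:element name="Address" type="xsd:string" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
"""

schema = ET.fromstring(schema_text)

# For each complexType, record (element name, type, maxOccurs) in order.
content_models = {}
for ctype in schema.findall("xsd:complexType", NS):
    children = ctype.findall("xsd:sequence/xsd:element", NS)
    content_models[ctype.get("name")] = [
        (el.get("name"), el.get("type"), el.get("maxOccurs", "1"))
        for el in children
    ]

for name, model in content_models.items():
    print(name, model)
```

    CustomerType builds on PersonType by referring to it by name, which is how a complex type is based on an existing one.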


    More Complex Types

    Derivation

    simpleContent, complexContent

    Extension & Restriction (we'll see some of this)

    Substitution Groups

    Abstract Elements and Types


    The Many Facets of a Datatype!

    http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/

    A way to constrain datatypes

    Constrain the value space of a datatype

    Specify optional properties

    Examples of Constraining Facets: totalDigits, minLength, enumeration, ...


    Declaring Elements

    Elements are declared using the <element> tag

    Based on either a simple or complex type

    May contain simple or other complex types

    May reference an existing element


    Declaring Attributes

    Declared using the <attribute> tag

    Value pairs

    Can only be assigned to <complexType> types

    May be grouped into an attribute group (more later!)

    Based on a <simpleType>, by reference or explicitly

    <!-- OR -->


    Declaring Attribute Groups 1/2

    Way to group related attributes together

    Promotes logical organization

    Encourages reuse: defined once, referenced many times

    Facilitates maintenance

    Improves Schema readability

    Must be unique within an XML Schema

    Referenced from complexType definitions


    Declaring Attribute Groups 2/2

    <!-- Define the unique group: -->

    <!-- Then you can reference it from a complexType: -->
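
    The two steps (define the group once, then reference it from a complexType) could look like the fragment embedded below. The group name AddressAttrs, the Country attribute and AddressType are hypothetical, since the slide's own listing did not survive. The Python code resolves each reference back to the attributes it stands for.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
NS = {"xsd": XSD}

schema_text = f"""
<xsd:schema xmlns:xsd="{XSD}">
  <!-- Define the unique group (hypothetical names) -->
  <xsd:attributeGroup name="AddressAttrs">
    <xsd:attribute name="AddrType" type="xsd:string"/>
    <xsd:attribute name="Country" type="xsd:string"/>
  </xsd:attributeGroup>
  <!-- Reference it from a complexType -->
  <xsd:complexType name="AddressType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:string">
        <xsd:attributeGroup ref="AddressAttrs"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:schema>
"""

schema = ET.fromstring(schema_text)

# Group definitions are direct children of <schema> and carry a name.
groups = {
    g.get("name"): [a.get("name") for a in g.findall("xsd:attribute", NS)]
    for g in schema.findall("xsd:attributeGroup", NS)
}

# References live inside complex types and carry a ref instead of a name.
resolved = {}
for ctype in schema.findall("xsd:complexType", NS):
    for ref in ctype.iter(f"{{{XSD}}}attributeGroup"):
        resolved[ctype.get("name")] = groups[ref.get("ref")]

print(resolved)
```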


    Schema Namespaces

    Equivalent to XML namespaces

    http://www.w3.org/TR/1999/REC-xml-names-19990114/

    Used to qualify schema elements

    <schema> must itself be qualified with the schema namespace

    Namespace may have a namespace prefix for the schema

    Prefix qualifies elements belonging to the targetNamespace


    targetNamespace Attribute

    targetNamespace attribute

    Declares the namespace of the current schema

    This must be a universally unique Uniform Resource Identifier (URI)

    Helps the parser differentiate type definitions

    Used during schema validation

    Differentiates differing schema vocabularies in the schema

    targetNamespace="some_URI"

    Should match the schema namespace declaration

    Example:

    xmlns:CO="http://www.myCo.com/CO" targetNamespace="http://www.myCo.com/CO"


    XML By Example

    <!-- Declare the root element of our schema -->

    <!-- Further Definitions & declarations not shown -->
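
    The schema listing on this slide was lost, so the fragment below is a stand-in: a schema element carrying the targetNamespace and CO prefix from the preceding slide, with one root element declaration. The type name CO:OrderType is hypothetical. The Python code reads the target namespace and shows the qualified name an instance document's root element would have.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
NS = {"xsd": XSD}

schema_text = f"""
<xsd:schema xmlns:xsd="{XSD}"
            xmlns:CO="http://www.myCo.com/CO"
            targetNamespace="http://www.myCo.com/CO">
  <!-- Declare the root element of our schema -->
  <xsd:element name="CustomerOrder" type="CO:OrderType"/>
  <!-- Further definitions and declarations not shown -->
</xsd:schema>
"""

schema = ET.fromstring(schema_text)
target_ns = schema.get("targetNamespace")
root_decl = schema.find("xsd:element", NS)

# Elements declared here belong to the target namespace, so an instance
# document's root carries this namespace-qualified name.
qualified_root = f"{{{target_ns}}}{root_decl.get('name')}"
print(qualified_root)
```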


    Follow the Yellow Brick XPath

    Specification found at:

    http://www.w3.org/TR/1999/REC-xpath-19991116

    Language used to address parts of an XML document

    Permits selection of nodes in an XML document

    Uses a path notation, as with URLs

    Absolute paths: /CustomerOrder/Orders

    Relative paths: Orders


    Roadmap To Selection

    Location Syntax

    axis::node_test[predicate]

    Location Paths

    Axis: Defines from where to start navigating

    parent, child, ancestor, attribute, / (the document), etc

    Node test: Selects one or more nodes

    By tag name, node selector or wildcard (*)

    node(), text(), comment(), etc

    Predicates: Optional function or expression enclosed in [...]

    position(), count(), etc

    Example: child::Address[@AddrType="billing"]
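
    An attribute predicate like the one in the example can be tried with Python's standard-library ElementTree. ElementTree supports only a subset of XPath: abbreviated paths and simple predicates work, explicit axes such as child:: do not. Which address is billing and which is shipping is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

# Sample instance; the AddrType values are assumed for illustration.
order = ET.fromstring("""
<CustomerOrder>
  <Customer>
    <Address AddrType="shipping">91 Park So, New York, NY 10018</Address>
    <Address AddrType="billing">Hauptstrasse 55, D-81671 Munich</Address>
  </Customer>
  <Orders>
    <OrderNo>10</OrderNo>
    <ProductNo>100</ProductNo>
    <ProductNo>200</ProductNo>
  </Orders>
</CustomerOrder>
""")

# Path relative to the root element, with an attribute-value predicate.
billing = order.findall("./Customer/Address[@AddrType='billing']")

# Node test by tag name at any depth below the root.
all_products = [p.text for p in order.findall(".//ProductNo")]

print([a.text for a in billing])
print(all_products)
```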


    Taking XPath Shortcuts

    Abbreviated Syntax exists

    The following are equivalent

    OrderNo[position()=1]/ProductNo[position()=3]

    OrderNo[1]/ProductNo[3]

    .. instead of parent::node()

    . instead of self::node()

    // instead of /descendant-or-self::node()/
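
    The abbreviated forms can be exercised with Python's ElementTree, which accepts positional predicates, "..", "." and "//" but not the unabbreviated axis syntax. The sample document is an assumption that follows the CustomerOrder structure.

```python
import xml.etree.ElementTree as ET

order = ET.fromstring("""
<CustomerOrder>
  <Orders>
    <OrderNo>10</OrderNo>
    <ProductNo>100</ProductNo>
    <ProductNo>200</ProductNo>
  </Orders>
</CustomerOrder>
""")

# [1] is the short form of [position()=1]; positions start at 1.
first = order.find("Orders/ProductNo[1]")
last = order.find("Orders/ProductNo[last()]")

# .// searches all descendants; .. steps back up to the parent element.
anywhere = order.findall(".//ProductNo")
parents = order.findall(".//ProductNo/..")

print(first.text, last.text, len(anywhere), [p.tag for p in parents])
```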


    Operators

    To address an attribute use @

    CustomerOrder/Customer/Address[@AddrType]

    To reference the value of a variable use $

    CustomerOrder/Orders/ProductNo[1][$ProductNo]

    Can compare objects arithmetically

    &lt; (for <), ...

    Conditional processing

    <xsl:if test="...">

    <xsl:choose>, <xsl:when>, <xsl:otherwise>

    Sorting

    Etc


    Brief look at XML in .NET

    .NET Support for XML

    XML Namespaces in .NET

    Some XML Classes in .NET


    .NET Supports XML!

    XML 1.0

    http://www.w3.org/TR/1998/REC-xml-19980210

    XML Namespaces

    http://www.w3.org/TR/1999/REC-xml-names-19990114/

    XML Schemas

    http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/

    http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/

    XPath expressions

    http://www.w3.org/TR/1999/REC-xpath-19991116

    XSLT transformations

    http://www.w3.org/TR/1999/REC-xslt-19991116

    DOM Level 1 and Level 2 Core

    http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/

    http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/

    SOAP 1.1

    http://msdn.microsoft.com/xml/general/soapspec.asp


    XML Namespaces in .NET

    System.Xml

    .Serialization

    .Schema

    .XPath

    .Xsl


    System.Xml Namespace

    Overall namespace for classes that provide XML support

    Classes for creating, editing, navigating XML documents

    Reading, writing and manipulating documents via the DOM

    Use the XmlDocument class for XML documents

    Use the XmlDataDocument class for relational or XML data

    Classes that correspond to every type of XML element: XmlElement, XmlAttribute, XmlComment, etc

    Used by the XmlDocument and XmlDataDocument classes


    XmlReader

    Abstract base class for reading XML

    Fast, forward-only, non-cached XML stream reader

    Base class for XmlTextReader

    Properties of Interest

    Value: Gets the value of the node

    NodeType: Returns the type of node

    HasValue: Returns true if the node has a value

    LocalName: Gets the name of the node without its prefix

    ReadState: Returns the ReadState of the stream

    Closed, EndOfFile, Error, Initial or Interactive


    XmlWriter

    Abstract base class for writing XML

    Fast, forward-only, non-cached XML stream writer

    Base class for XmlTextWriter

    Properties of Interest

    WriteState: Returns the WriteState of the stream

    Attribute, Content, Element, etc

    XmlLang: Returns the current xml:lang scope

    XmlSpace: Returns the current xml:space scope


    XmlTextReader & XmlTextWriter

    Derived from the XmlReader & XmlWriter abstract classes

    Implement all the functionality defined by their base classes

    Designed to work with a text based stream

    As opposed to an in-memory DOM

    Inherit the properties of the XmlReader and XmlWriter

    XmlTextReader methods support reading XML elements

    Read, MoveToElement, ReadString, etc

    XmlTextWriter methods support writing XML elements

    WriteDocType, WriteComment, WriteName, etc


    XmlDocument

    Derived from the XmlNode class

    Represents an entire (in memory) XML document

    Supports DOM Level 1 and Level 2 Core functionality

    Reading & writing built on top of XmlReader & XmlWriter

    Load a document and generate the DOM

    Using: URI, file, XmlReader, XmlTextReader or Stream


    Properties & Methods of Interest

    Properties of Interest:

    ChildNodes: Returns all the children of the current node

    DocumentType: Gets the DOCTYPE declaration node

    DocumentElement: Returns the root XmlElement

    XmlResolver: Used to resolve DTD & schema references

    FirstChild: Returns the first child of the current node

    ParentNode: Returns the parent of the current node

    Value: Returns the (string) value of the current node

    Methods of Interest:

    CreateComment: Creates a comment node

    CreateElement: Creates an element node

    Load: Loads XML data using a URL, file, Stream, etc

    Save: Saves the XML document to a file, Stream, or writer


    XmlDocument & the .NET DOM

    System.Xml

    .Serialization, .Schema, .XPath, .Xsl

    EntityHandling, Formatting, NameTable, ReadState, TreePosition, Validation, WriteState, XmlAttribute, XmlAttributeCollection, XmlCDataSection, XmlCharacterData

    XmlCharType, XmlComment, XmlConvert

    XmlDataDocument, XmlDeclaration, XmlDocument, XmlDocumentFragment, XmlDocumentType, XmlElement, XmlEntity, XmlEntityReference, XmlNamedNodeMap

    XmlNode, XmlNodeReader, XmlNodeType

    XmlNotation, XmlReader, XmlSpace, XmlText, XmlTextReader, XmlTextWriter, XmlUrlResolver, XmlWhitespace, XmlWriter, ...


    XmlDocument By Example

    using System;
    using System.Xml;

    // Create an XmlDocument, load it, write it to the console
    // One way:
    XmlDocument xDoc = new XmlDocument();
    xDoc.Load("C:\\myData.xml");
    xDoc.Save(Console.Out);

    // Second way (use an XmlTextReader to load the XML):
    XmlTextReader reader = new XmlTextReader("C:\\myData.xml");
    xDoc.Load(reader);
    reader.Close();
    xDoc.Save(Console.Out);

    // Third way (use an XmlTextWriter to output the XML document):
    XmlTextWriter writer = new XmlTextWriter(Console.Out);
    writer.Formatting = Formatting.Indented;
    xDoc.WriteContentTo(writer);
    writer.Flush();
    Console.WriteLine();
    writer.Close();


    System.Xml.Xsl Namespace

    Provides support for XSL Transformations

    Some of the classes:

    XslTransform: Transforms XML data using a stylesheet

    XsltException: Used to handle transformation exceptions

    XsltContext: The XSLT processor's execution context


    XslTransform

    Four simple steps to perform a transformation

    Instantiate an XslTransform object

    Load a stylesheet

    Load the data

    Transform!


    Transformation By Example

    using System.Xml.Xsl;

    // 1. Create an XslTransform object
    XslTransform xslt = new XslTransform();

    // 2. Load an XSL stylesheet
    xslt.Load("http://somewhere/favorite.xsl");

    // 3 & 4. Load the XML data file & transform
    xslt.Transform("http://somewhere/mydata.xml",
                   "C:\\somewhere_else\\TransformedXmlOutput.xml");


    Summary

    XML is powerful, flexible, open & extensible

    XML is easy to learn, easy to read & easy to use

    XML, XML Schema and XSLT combine to let you

    Have data with semantics

    Dictate and enforce your data structure

    Separate data and data representation

    Easily transform your data

    .NET is "XML-ized": .NET lives on XML!

    Not only exposes XML functionality, but is built using it


    Section 4: Q&A


    Document Object Model (DOM) 1/2

    Use an XML parser to generate and manipulate the DOM

    Load an XML file using a parser

    Use the parser's programming interface to:

    Navigate through the Document Object Model

    Manipulate the DOM: Add, Delete, Move, Modify DOM elements

    Using a DOM, the parser can ensure well-formed documents

    Some parsers can validate the DOM (a validating parser)

    By loading and comparing to either a DTD or a Schema
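
    The add, modify and delete operations listed above can be sketched with Python's standard-library minidom, used here as a stand-in for any DOM-capable parser rather than for the .NET classes. The sample document is an assumption.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<Orders><OrderNo>10</OrderNo><ProductNo>100</ProductNo></Orders>"
)
root = doc.documentElement

# Add: create a new ProductNo element with a text child and append it.
new_product = doc.createElement("ProductNo")
new_product.appendChild(doc.createTextNode("200"))
root.appendChild(new_product)

# Modify: change the text of the existing OrderNo element.
order_no = root.getElementsByTagName("OrderNo")[0]
order_no.firstChild.data = "11"

# Delete: remove the first ProductNo element (the original one).
first_product = root.getElementsByTagName("ProductNo")[0]
root.removeChild(first_product)

result = root.toxml()
print(result)
```

    Because every change goes through the DOM interface, the serialized result stays well formed.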


    Document Object Model (DOM) 2/2

    System.Xml contains DOM related classes

    XmlDocument

    XmlDataDocument

    XmlNavigator

    XmlDataNavigator

    etc

    .NET supports DOM Level 1 and most of the Level 2 Core

    http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/

    http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/


    XML Namespaces 1/2

    Another W3C specification

    http://www.w3.org/TR/REC-xml-names/

    Create collection of tags that share the same semantics

    Used to qualify tags that would otherwise collide

    Multiple documents can use the same tag differently

    For example:

    Document A may use a tag to designate a person's name

    Document B may use the same tag to designate a file name


    XML Namespaces 2/2

    A URI is used to uniquely identify a namespace

    xmlns="urn:schemas-microsoft-com:customerdata"

    May assign a namespace prefix to the namespace

    xmlns:ms="urn:schemas-microsoft-com:data"

    Use the prefix to differentiate elements & attributes

    John Smith

    Documents can declare a default namespace

    The default namespace is declared with xmlns and no prefix
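
    The name collision described on the previous slide, and how a default namespace and a prefix keep the two meanings apart, can be sketched as follows. The namespace URIs, the fs prefix and the element names are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Two elements share the local name "name" but live in different namespaces.
doc = ET.fromstring("""
<records xmlns="urn:example:people" xmlns:fs="urn:example:files">
  <name>John Smith</name>
  <fs:name>report.xml</fs:name>
</records>
""")

# Prefixes used in queries are local to this mapping; only the URIs matter.
ns = {"p": "urn:example:people", "fs": "urn:example:files"}

person = doc.find("p:name", ns).text
file_name = doc.find("fs:name", ns).text
tags = [child.tag for child in doc]

print(person, file_name)
print(tags)
```

    The unprefixed name element picks up the default namespace declared on the root, so the parser reports two distinct qualified names.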