XML

XML stands for Extensible Markup Language, a markup language that can be extended and adapted to suit the context in which it is used and the data it describes. The XML standards are maintained by the World Wide Web Consortium (W3C). (Carey, 2007, p4)

The Structure of an XML Document

There are three main parts to an XML document:

Prolog

  • the XML declaration that indicates the document is written in XML
  • comment lines stating extra information about the contents
  • processing instructions
  • document type declarations

Document body

  • the document's content in a hierarchical tree structure

Epilogue

  • any final comments or processing instructions

(Carey, 2007, p11)
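
The three parts listed above can be observed by parsing a small document. What follows is a minimal sketch using Python's standard xml.dom.minidom module. The sample document, the app-hint processing instruction and the split_parts helper are invented for illustration and are not taken from Carey.

```python
from xml.dom import minidom

# Hypothetical sample: declaration, comment and processing instruction
# (prolog), one root element (document body), and a final comment (epilogue).
SAMPLE = """<?xml version="1.0"?>
<!-- prolog comment: hypothetical order document -->
<?app-hint cache="no"?>
<order id="1">
  <item>Widget</item>
</order>
<!-- epilogue comment -->
"""


def split_parts(xml_text):
    """Return (prolog node names, root tag name, epilogue node names)."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    prolog, epilogue = [], []
    target = prolog
    for node in doc.childNodes:
        if node is root:
            # Everything after the root element belongs to the epilogue.
            target = epilogue
            continue
        target.append(node.nodeName)
    return prolog, root.tagName, epilogue


prolog, body, epilogue = split_parts(SAMPLE)
```

minidom does not expose the XML declaration as a node, so it does not appear in the prolog list even though it is part of the prolog.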

XML Schema

XML Schema is a newer XML schema language, described by the W3C as the successor to Document Type Definitions (DTDs). It is often referred to informally as XSD (XML Schema Definition), the initialism used for XML Schema instances. XSDs are far more powerful than DTDs in describing XML languages: they use a rich datatyping system, allow more detailed constraints on an XML document's logical structure, and must be processed in a more robust validation framework. XSDs are also written in an XML-based format, so ordinary XML tools can help process them, although an XSD implementation requires much more than the ability to read XML.
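
Because an XSD is itself an XML document, a general-purpose XML parser can read it. The sketch below assumes Python's standard xml.etree.ElementTree module and a small hypothetical schema for an order element; neither comes from the source. It only inspects the schema and does not validate anything, since ElementTree performs no schema validation.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the XML Schema vocabulary.
XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema, invented for illustration.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
        <xs:element name="quantity" type="xs:positiveInteger"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def declared_elements(schema_text):
    """List (name, type) for every xs:element declaration, in document order."""
    root = ET.fromstring(schema_text)
    return [(el.get("name"), el.get("type"))
            for el in root.iter(f"{{{XS}}}element")]


elements = declared_elements(SCHEMA)
```

The order element has no type attribute because its type is defined inline, so its entry carries None as the type.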

Criticisms of XSD include the following:

  • The specification is very large, which makes it difficult to understand and implement.
  • The XML-based syntax leads to verbosity in schema description, which makes XSDs harder to read and write.
  • Schema validation can be an expensive addition to XML parsing, especially for high-volume systems.
  • The modeling capabilities are very limited, with no ability to allow attributes to influence content models.
  • The type derivation model is very limited; in particular, derivation by extension is rarely useful.

XML Extensions

XPath makes it possible to refer to individual parts of an XML document, giving other technologies such as XSLT, XSL-FO and XQuery random access to XML data. XPath expressions can refer to all or part of the text, data and values in XML elements, attributes, processing instructions and comments, and can also access the names of elements and attributes. XPath can be used with both valid and merely well-formed XML, with or without defined namespaces.
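
A few path expressions illustrate this kind of addressing. This is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath; the library document and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration.
DOC = """<library>
  <book id="b1" lang="en"><title>XML Basics</title></book>
  <book id="b2" lang="fr"><title>Le XML</title></book>
</library>"""

root = ET.fromstring(DOC)

# Text content of every title that is a grandchild of the root.
titles = [t.text for t in root.findall("./book/title")]

# A predicate on an attribute value selects a single book.
french_title = root.find("./book[@lang='fr']/title").text

# Attribute values of every book element at any depth.
ids = [b.get("id") for b in root.findall(".//book")]

# Element names are accessible as well as element content.
child_names = [child.tag for child in root]
```

A full XPath implementation offers more than is shown here, such as functions and additional axes.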

XInclude lets an XML file include all or part of an external file. When processing is complete, the final XML infoset contains no XInclude elements; the referenced documents, or parts of them, have been copied into it. For partial inclusions, XInclude uses XPointer, which builds on XPath, to identify the portion of the document to include.
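
The replacement of XInclude elements can be sketched with Python's standard xml.etree.ElementInclude module, which offers limited XInclude support; only whole-document inclusion is shown. The file name chapter1.xml, the in-memory FILES table and the memory_loader function are assumptions made so the example runs without touching the file system.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical "external file", kept in memory for the example.
FILES = {"chapter1.xml": "<chapter><title>Intro</title></chapter>"}

MAIN = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter1.xml"/>
</book>"""


def memory_loader(href, parse, encoding=None):
    """Custom loader: return a parsed element for XML, raw text otherwise."""
    if parse == "xml":
        return ET.fromstring(FILES[href])
    return FILES[href]


def resolve(main_text):
    """Parse the document and expand its xi:include elements in place."""
    root = ET.fromstring(main_text)
    ElementInclude.include(root, loader=memory_loader)
    return root


resolved = resolve(MAIN)
```

After resolve returns, the tree holds the chapter element where the xi:include element used to be.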

  • XQuery is to XML what SQL and PL/SQL are to relational databases: ways to access, manipulate and return XML.
  • XML Namespaces enable the same document to contain XML elements and attributes taken from different vocabularies, without any naming collisions occurring.
  • XML Signature defines the syntax and processing rules for creating digital signatures on XML content.
  • XML Encryption defines the syntax and processing rules for encrypting XML content.
  • XPointer is a system for addressing components of XML-based internet media.
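
Of the technologies listed above, XML Namespaces is the simplest to demonstrate. The sketch below uses Python's standard xml.etree.ElementTree module; the two example.com namespace URIs and the bk and ppl prefixes are hypothetical. It shows two elements that share the local name title but remain distinct.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies: one for books, one for people.
NS = {
    "bk": "http://example.com/ns/book",
    "ppl": "http://example.com/ns/person",
}

DOC = """<catalog xmlns:bk="http://example.com/ns/book"
         xmlns:ppl="http://example.com/ns/person">
  <bk:title>XML Basics</bk:title>
  <ppl:title>Dr</ppl:title>
</catalog>"""

root = ET.fromstring(DOC)

# The prefix map tells find() which vocabulary each "title" belongs to.
book_title = root.find("bk:title", NS).text
person_title = root.find("ppl:title", NS).text

# Internally each tag is qualified by its namespace URI, not by its prefix.
qualified_tags = [child.tag for child in root]
```

Because names are compared by namespace URI and local name together, the choice of prefix in the document does not affect the lookup.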

References

1. Carey, P., New Perspectives on XML, 2nd ed., Thomson, USA, 2007.

 
xml.txt · Last modified: 2007/11/04 14:14 by lizhang
 