Easy Web Site Design

The future of the Web: XML (eXtensible Markup Language)

What is XML?

Like HTML, XML (Extensible Markup Language) is a tagged markup language derived from SGML (Standard Generalized Markup Language).

Arising in response to the proliferation of computer networking exemplified by the Internet, XML (and its related technologies) is emerging as the standard in data storage, transmission and manipulation.

XML is an open standard, not owned by any one organization. XML-encoded information is held as plain text, which protects it from the obsolescence that comes with being locked into proprietary formats.

Though generally intended to be created and read by computers, XML is intelligible to the human eye. Here's a very simple example:

<footballteams>
  <team manager="Benitez">
    <name>Liverpool</name>
    <ground>Anfield</ground>
  </team>
  <team manager="Mourinho">
    <name>Chelsea</name>
    <ground>Stamford Bridge</ground>
  </team>
  <team manager="Wenger">
    <name>Arsenal</name>
    <ground>Highbury</ground>
  </team>
</footballteams>

Like HTML, XML consists of tags, elements and attributes. Unlike HTML, the element names are not pre-defined. XML element and attribute names can be used to represent any kind of information that can be described by text. This is what the "extensible" in its name refers to.
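To make the terms concrete, here is a minimal sketch that reads the football example with the ElementTree parser from Python's standard library. The variable and function names (XML_TEXT, read_teams) are made up for illustration and are not part of XML itself.

```python
import xml.etree.ElementTree as ET

# The football example from above, held as a plain-text string.
XML_TEXT = """<footballteams>
  <team manager="Benitez">
    <name>Liverpool</name>
    <ground>Anfield</ground>
  </team>
  <team manager="Mourinho">
    <name>Chelsea</name>
    <ground>Stamford Bridge</ground>
  </team>
  <team manager="Wenger">
    <name>Arsenal</name>
    <ground>Highbury</ground>
  </team>
</footballteams>"""


def read_teams(xml_text):
    """Return one (name, ground, manager) tuple per <team> element."""
    root = ET.fromstring(xml_text)  # root is the <footballteams> element
    teams = []
    for team in root.findall("team"):
        teams.append((
            team.findtext("name"),    # text content of the <name> element
            team.findtext("ground"),  # text content of the <ground> element
            team.get("manager"),      # value of the manager attribute
        ))
    return teams


teams = read_teams(XML_TEXT)
for name, ground, manager in teams:
    print(f"{name} play at {ground}, managed by {manager}")
```

The parser knows nothing about football. It simply reports whatever element and attribute names the document happens to use.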

XML documents usually, though not necessarily, begin with the XML declaration:

<?xml version="1.0"?>

The rules of XML

  1. XML tag names are case sensitive, ie <email> and <Email> are different. By convention, XML tag names are usually written in lower case.
  2. Every opening tag must have a corresponding closing tag. Tags describing empty elements end with '/>', eg the line-break tag <br> of HTML is written <br /> in XML.
  3. Tags must be properly nested. Though <strong><italic>Hello!</strong></italic> might display in most Web browsers, in XML it would need to be re-written as <strong><italic>Hello!</italic></strong>.
  4. Attribute values must appear within quotes, eg <td width="100"> NOT <td width=100>.
  5. Every XML document must have one, and only one, root element. The root element encloses all other document content.
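The rules above can be tried out with any XML parser. The following is a minimal sketch using Python's standard ElementTree module, which raises a ParseError when a document breaks the syntax rules. The helper name is_well_formed and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One small sample per rule; each "bad" sample breaks exactly one rule.
samples = {
    "case mismatch": "<email>someone@example.com</Email>",      # rule 1
    "unclosed tag": "<p>Hello<br></p>",                         # rule 2
    "empty element": "<p>Hello<br /></p>",                      # rule 2, done properly
    "bad nesting": "<strong><italic>Hello!</strong></italic>",  # rule 3
    "good nesting": "<strong><italic>Hello!</italic></strong>", # rule 3, done properly
    "unquoted attribute": "<td width=100></td>",                # rule 4
    "quoted attribute": '<td width="100"></td>',                # rule 4, done properly
    "two root elements": "<team></team><team></team>",          # rule 5
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```

Unlike a Web browser, an XML parser does not try to guess what was meant. A document that breaks any one of the rules is rejected outright.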

Well formed and Valid

XML documents may be described as "well formed" or "valid". A well formed document is one that complies with the rules of XML syntax described above. A valid document must be well formed and must also comply with a pre-defined structure described by another document called a Document Type Definition (DTD) or a Schema.

The point of DTDs and Schemas is that when an XML document is valid, both sender and receiver know exactly what to expect for purposes of communication. When XML data is to be manipulated, the programmer knows exactly what form of input the program should expect and what form of output it should produce.

Although they do the same job, DTDs and Schemas are different kinds of document. DTDs use an older language inherited from SGML. Schemas, or more accurately XML Schemas, use an XML language to define the structure of XML documents. XML Schemas can define document structure more finely than DTDs. They also avoid the need to learn yet another language and, being XML documents themselves, can be created and manipulated by the many, and increasing, XML tools.
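The difference between well formed and valid can be sketched in code. The DTD below is one possible description of the football example, written for illustration only. Python's standard parser reads DTD declarations but does not enforce them, so the function matches_dtd_rules is a hand-written stand-in that mirrors the DTD's main rules. It is not a real validator, and all names here are assumptions.

```python
import xml.etree.ElementTree as ET

# A possible DTD for the football example. The standard-library parser
# reads these declarations but does NOT check the document against them.
DTD = """<!DOCTYPE footballteams [
  <!ELEMENT footballteams (team+)>
  <!ELEMENT team (name, ground)>
  <!ATTLIST team manager CDATA #REQUIRED>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT ground (#PCDATA)>
]>
"""


def matches_dtd_rules(xml_body):
    """Hand-written stand-in for a validator, mirroring the DTD above.

    Checks only the element structure and the required attribute.
    """
    root = ET.fromstring(DTD + xml_body)  # raises ParseError if not well formed
    teams = list(root)
    # <footballteams> must contain one or more <team> elements.
    if root.tag != "footballteams" or not teams:
        return False
    for team in teams:
        child_tags = [child.tag for child in team]
        # Each <team> holds exactly a <name> followed by a <ground>.
        if team.tag != "team" or child_tags != ["name", "ground"]:
            return False
        # The manager attribute is required.
        if "manager" not in team.attrib:
            return False
    return True


valid_doc = ('<footballteams><team manager="Wenger">'
             '<name>Arsenal</name><ground>Highbury</ground>'
             '</team></footballteams>')

# Well formed, but <ground> and the manager attribute are both missing.
invalid_doc = '<footballteams><team><name>Arsenal</name></team></footballteams>'

valid_result = matches_dtd_rules(valid_doc)
invalid_result = matches_dtd_rules(invalid_doc)
print(valid_result, invalid_result)
```

Both documents parse without error, so both are well formed. Only the first has the structure the DTD describes. In practice a validating parser would do this check automatically from the DTD or Schema.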

Content and Presentation

XML is used to represent content. It is unconcerned with how that content should be presented. Some Web browsers will display XML documents as a kind of tree structure (XML documents may be thought of as a tree, with elements containing other elements, attributes, and textual content); however, the browser is free to display this tree in any way it chooses.

Cascading Style Sheets (CSS) may be applied to XML to describe how different elements should be displayed. However, XML also offers more powerful technologies for describing how XML documents should be presented: XSL (Extensible Stylesheet Language), XSLT (XSL Transformations), and XSL-FO (XSL Formatting Objects). For example, rather than merely presenting an XML document as is, XSLT allows elements to be selected and/or sorted according to detailed criteria.
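Python's standard library has no XSLT processor, so the sketch below is not XSLT. It uses ElementTree instead, purely to illustrate the kind of selecting and sorting that an XSLT stylesheet would express. The function names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<footballteams>
  <team manager="Benitez">
    <name>Liverpool</name>
    <ground>Anfield</ground>
  </team>
  <team manager="Mourinho">
    <name>Chelsea</name>
    <ground>Stamford Bridge</ground>
  </team>
  <team manager="Wenger">
    <name>Arsenal</name>
    <ground>Highbury</ground>
  </team>
</footballteams>"""


def grounds_sorted_by_team(xml_text):
    """Sorting: return (name, ground) pairs ordered by team name."""
    root = ET.fromstring(xml_text)
    teams = sorted(root.findall("team"), key=lambda t: t.findtext("name"))
    return [(t.findtext("name"), t.findtext("ground")) for t in teams]


def teams_managed_by(xml_text, manager):
    """Selecting: return the names of teams whose manager attribute matches."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a small subset of XPath, including attribute tests.
    matches = root.findall(f"team[@manager='{manager}']")
    return [t.findtext("name") for t in matches]


sorted_grounds = grounds_sorted_by_team(XML_TEXT)
wenger_teams = teams_managed_by(XML_TEXT, "Wenger")
print(sorted_grounds)
print(wenger_teams)
```

The XML document itself is untouched. The same content can be presented in any order, or only in part, depending on what is asked of it.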

Entities

Entities are parts of an XML document that represent something else. For example, &gt; represents the greater-than symbol (>). This particular entity is used to prevent confusion with XML's tags. Entities may be parsed, ie processed as part of the document, or unparsed, eg binary data such as images, audio or video, which cannot be directly included in an XML document. One example of the use of a parsed entity is to hold something that is repeated frequently, eg a company's contact details. Rather than add this to every document, it is defined once and referenced where needed. Should the details change, they need only be updated once.
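As a minimal sketch of parsed entities, the document below declares a hypothetical entity called contact in an internal DTD and references it with &contact;. The element names and the contact details are invented for illustration. Python's standard ElementTree parser replaces both the built-in &gt; entity and the declared one as it reads the document.

```python
import xml.etree.ElementTree as ET

# The contact entity is defined once, in the DOCTYPE, and referenced
# wherever it is needed in the document body.
XML_TEXT = """<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ENTITY contact "Example Ltd, 1 High Street, Anytown">
]>
<letter>
  <rule>price &gt; 100</rule>
  <footer>&contact;</footer>
</letter>"""

root = ET.fromstring(XML_TEXT)

rule_text = root.findtext("rule")      # &gt; has become a literal >
footer_text = root.findtext("footer")  # &contact; has become the full text

print(rule_text)
print(footer_text)
```

If the contact details change, only the ENTITY declaration needs editing. Every &contact; reference picks up the new text the next time the document is parsed.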

Namespaces

Namespaces are an optional method for identifying groups of elements and attributes.

To ensure uniqueness, namespaces are based on URIs (Uniform Resource Identifiers), eg:

<teams xmlns="http://web.twinisles.com/teams">

Namespaces do not actually reference the resource indicated by the URI, nor do they care whether any resource is actually located there. URIs are used only to ensure uniqueness.

Namespaces may be applied to an entire document and/or specific parts of the document. Where a namespace is applied to part of a document it overrides any others which have been applied at a higher level (eg to the root).
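The overriding behaviour can be seen in a short sketch. Here a default namespace is declared on the root with xmlns="..." and a different one on an inner element. The sponsor element and its http://example.com/sponsors URI are hypothetical, added only for illustration. Python's ElementTree reports each element's name as {namespace URI}name, which makes it easy to see which namespace applies where.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<teams xmlns="http://web.twinisles.com/teams">
  <team>
    <name>Arsenal</name>
    <sponsor xmlns="http://example.com/sponsors">
      <name>Example Sponsor</name>
    </sponsor>
  </team>
</teams>"""

root = ET.fromstring(XML_TEXT)

# Walk the tree in document order and collect the fully qualified names.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The two name elements end up in different namespaces: the first inherits the namespace declared on the root, while the second takes the one declared on sponsor. Note that the parser never tries to fetch either URI.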

Namespaces are particularly useful when documents use the same element names to describe different things, eg a document holding information on both football and cricket teams. In this case namespace declarations require a prefix, eg:

<teams xmlns:football="http://web.twinisles.com/football" 
xmlns:cricket="http://web.twinisles.com/cricket">

Elements then use a prefix to indicate which namespace they belong to, eg:

<football:team manager="Wenger"> 
  <football:name>Arsenal</football:name>          
  <football:ground>Highbury</football:ground> 
</football:team>
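A minimal sketch of reading such a document with Python's ElementTree follows. The cricket team is added here for illustration, and the short prefixes fb and cr in the lookup table are deliberately different from those in the document, to show that a prefix is only a local shorthand. It is the URI behind it that identifies the namespace.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<teams xmlns:football="http://web.twinisles.com/football"
       xmlns:cricket="http://web.twinisles.com/cricket">
  <football:team manager="Wenger">
    <football:name>Arsenal</football:name>
    <football:ground>Highbury</football:ground>
  </football:team>
  <cricket:team>
    <cricket:name>Surrey</cricket:name>
    <cricket:ground>The Oval</cricket:ground>
  </cricket:team>
</teams>"""

# Prefixes used for searching. They need not match the document's own
# prefixes, as long as they map to the same URIs.
NS = {
    "fb": "http://web.twinisles.com/football",
    "cr": "http://web.twinisles.com/cricket",
}

root = ET.fromstring(XML_TEXT)

football_names = [e.text for e in root.findall("fb:team/fb:name", NS)]
cricket_names = [e.text for e in root.findall("cr:team/cr:name", NS)]

print(football_names)
print(cricket_names)
```

Although both kinds of team use the element names team, name and ground, the two searches never pick up each other's elements.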

More information on XML

Extensible Markup Language (XML) from the W3C

XML Tutorial from W3Schools

XML.com Where the XML community shares XML development resources and solutions, features timely news, opinions, features, and tutorials; the Annotated XML specification created by Tim Bray; authoring tools, XML developer resources, interactive forums...

XML.org Applying XML and Web Services Standards in Industry

USENET newsgroups:
comp.text.xml
microsoft.public.xml


© web.twinisles.com Questions? Comments? Contact info@twinisles.com