The XML Family
of Technologies
may place on an institution's existing technical and staffing infrastructures. The technology
demands are unlikely to be overly onerous or too resource-intensive in either the
short or long term. The benefits that XML brings to content representation and management
will enable heritage institutions to make effective long-term and varied use of their
information assets. It provides their user communities, from curators to visitors, with
richer and more flexible mechanisms for accessing and using XML-encoded content.
Introduction to XML
Background: Markup from SGML to XML
The great bulk of Web pages are written (or 'encoded') in the Hypertext Markup
Language (HTML), a simple, effective and forgiving language. The Standard Generalised
Markup Language (SGML) is the 'parent' language of HTML and many other descriptive
tag-sets. SGML files are composed of plain ASCII text combined with tags enclosed in
angled brackets, e.g. <tag>, meaning that no special software is required in order to create
an SGML or an HTML file. For instance, text between the tags <bold> would appear in
a heavier typeface </bold> than text not so tagged. This plain-text characteristic facilitates the
accessibility and longevity of the materials, and makes these file types eminently suitable
for delivery across disparate networks.
The term markup historically referred to annotations or marks used within a text to
indicate layout and presentation to typists or printers. Contemporary usage of the word
has evolved to indicate any means for making an interpretation of a text explicit. SGML
markup enables users to create structured documents by tagging structural divisions (act,
scene, stanza, line, stage direction) or by conveying information about display elements such
as font changes, line breaks, or columns.
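
As a minimal sketch of what tagging structural divisions makes possible, the Python fragment below marks up a few lines of an invented play and then reads the structure back. It uses XML syntax (the SGML subset introduced in the next paragraph) only because Python's standard library includes an XML parser. The tag names play, act, scene, stage and line, the attribute n, and the sample text are all assumptions made for illustration; they are not drawn from any published tag-set.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names (play, act, scene, stage, line) chosen for this
# sketch only; they do not come from any published schema or DTD.
SAMPLE = """\
<play>
  <act n="1">
    <scene n="1">
      <stage>Enter two watchmen.</stage>
      <line>Who goes there?</line>
      <line>A friend to this house.</line>
    </scene>
  </act>
</play>
"""


def summarise(xml_text):
    """Return one dictionary per scene, listing its spoken lines and stage directions."""
    root = ET.fromstring(xml_text)
    summary = []
    for act in root.findall("act"):
        for scene in act.findall("scene"):
            summary.append({
                "act": act.get("n"),
                "scene": scene.get("n"),
                "lines": [el.text for el in scene.findall("line")],
                "stage_directions": [el.text for el in scene.findall("stage")],
            })
    return summary


result = summarise(SAMPLE)
```

Because the tags state what each piece of text is rather than how it should look, the same file could in principle drive a printed edition, a Web page or a search index without being re-keyed.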
First unveiled as a W3C Recommendation in 1998, the Extensible Markup Language
(XML) is a subset of SGML, intended to allow generic SGML 'to be served, received,
and processed on the Web in the way that is now possible with HTML'. SGML is more
customisable than XML, which makes it more flexible and more powerful, although it is
significantly more expensive to implement. Unlike SGML, XML was designed with
Internet delivery in mind. There are now relatively few new projects that start as SGML
applications, but many legacy applications are still in use, particularly in larger organisations
such as healthcare trusts. Some very specific applications such as modelling compound
document sets are more suited to SGML because of features that XML has
not inherited. For new cultural heritage applications, however, XML is likely to be by
some distance the more suitable of the two.
Principles of XML
A distinction should be drawn between two different kinds of markup: procedural and
descriptive. HTML is an example of a language that is used in a mainly procedural fashion,