mercial publications. To protect the
publishers' interests, access to these titles is
restricted during the period of commercial
viability nominated by the publisher.
During this period the titles are available
for consultation in the reading room of the
archiving partner only, and the PANDORA
digital archiving system manages this
process automatically.
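
The restriction rule described above can be illustrated with a minimal sketch. The function name, the `restricted_until` field and the two access labels are hypothetical and are not taken from the PANDORA digital archiving system. The sketch assumes only that each restricted title carries an end date for its publisher-nominated period of commercial viability, and that the title becomes generally available afterwards.

```python
from datetime import date
from typing import Optional


def access_level(restricted_until: Optional[date], today: date) -> str:
    """Return where a title may be consulted on the given day.

    restricted_until is a hypothetical field holding the last day of the
    publisher-nominated period of commercial viability. None means the
    title carries no restriction.
    """
    if restricted_until is not None and today <= restricted_until:
        # Still commercially viable: consultation on site only.
        return "reading-room-only"
    # Never restricted, or the restriction period has ended.
    return "public-web"
```

Because the decision depends only on a stored date, a system built this way could lift a restriction without manual intervention once the period ends, which matches the automatic handling described above.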
To achieve the greatest possible efficiency
in creating a collaborative, selective
archive, the National Library of Australia has
devoted considerable effort to developing a
digital archiving system to manage the
process. The original intention was to buy
a system off the shelf, as defined in the
Information Paper, issued as part of the
Digital Services Project.
The initiative did
not find any system that met PANDORA's
needs and the Library had no alternative
but to develop one itself. In June 2001 the
Library implemented the first version of the
PANDORA Digital Archiving System
(PANDAS) and version 2 followed in
August 2002. Subsequent enhancements
have brought us to version 2.1.3.
The PANDAS software supports the
following archiving functions:
managing the metadata about titles, both
those selected and those rejected for
inclusion in the archive;
initiating the gathering of titles selected
for archiving;
managing the quality checking and
problem fixing process;
preparing items for public display and
generating a title entry page;
managing access restrictions; and
providing management reports.
The software is Web-based and enables
all eight partners from their remote
locations to carry out all of the tasks necessary
to download and store titles in the
central archive located on the National
sion to archive and make publications
available via the Web has been negotiated
with the publishers.
The 'significant properties' of resources
within the archive can be analysed and
determined both for individual resources
and for classes of resources. This
enhances our knowledge of preservation
requirements and enables strategies for
preservation to be put into place.
Many of the titles in the Archive are
gathered on a regular basis to capture
new content, and each new gathering
is referred to as an 'instance'. The PANDORA
Archive is now over half a terabyte
in size and contains almost 5,000 titles and
over 9,000 'instances'.
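
The relationship between titles and instances can be pictured as a simple one-to-many data model. The following is a minimal sketch under stated assumptions: the class names, field names and the `archive_summary` helper are hypothetical and do not describe how PANDAS actually stores its data.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class Instance:
    """One gathering of a title on a particular date (hypothetical fields)."""
    gathered_on: date
    size_bytes: int


@dataclass
class Title:
    """A selected publication and every instance gathered for it."""
    name: str
    instances: List[Instance] = field(default_factory=list)

    def add_instance(self, gathered_on: date, size_bytes: int) -> Instance:
        # Each new gathering is appended; earlier instances are kept.
        instance = Instance(gathered_on, size_bytes)
        self.instances.append(instance)
        return instance


def archive_summary(titles: List[Title]) -> Dict[str, int]:
    """Count titles and instances and total the stored bytes."""
    return {
        "titles": len(titles),
        "instances": sum(len(t.instances) for t in titles),
        "bytes": sum(i.size_bytes for t in titles for i in t.instances),
    }
```

A title gathered twice contributes one title and two instances to the summary, which is why the instance count in the Archive exceeds the title count.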
With the permission of publishers,
titles are harvested using HTTrack,
a freely available Website offline browser
and mirroring tool.
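
As an illustration of a harvest with HTTrack, the sketch below builds a basic command line. It assumes only HTTrack's basic command-line form, `httrack <URL> -O <directory>`, where `-O` sets the output directory. The function name, the example URL and the example path are hypothetical, and the source does not describe how PANDAS itself invokes HTTrack.

```python
from typing import List


def build_httrack_command(url: str, output_dir: str) -> List[str]:
    """Build an argument list for a basic HTTrack mirror of one site.

    Uses only the basic form "httrack <URL> -O <directory>".
    Any further options would depend on the title being gathered.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError("expected an http or https URL")
    return ["httrack", url, "-O", output_dir]


# Hypothetical example values, not taken from the Archive.
command = build_httrack_command("http://www.example.org/", "/tmp/mirror/example")
```

On a machine with HTTrack installed, the resulting list could be passed to `subprocess.run(command, check=True)`. Giving each gathering its own output directory would keep successive gatherings of the same title separate.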
A small number of
titles are not available on the Web but are
distributed by e-mail, and these are
received directly from the publisher. Most
titles in the Archive are freely available on
the Web, but approximately 100 are com-
Publications of tertiary education
Titles referred by indexing and
abstracting agencies
Topical sites:
(a) sites in nominated subject areas
(defined in Appendix 2 of the
selection guidelines) that will be
collected on a rolling three-year
basis; and
(b) sites documenting key issues of
current social or political interest,
such as election sites, the Sydney
Olympics and the Bali bombing.
The selective approach to archiving
enables PANDORA partners to
realise some important objectives:
Each item in the Archive is quality
assessed and functional to the fullest
extent permitted by current technical
and resource capabilities.
Each item in the Archive can be fully
catalogued and therefore can be included
in the national bibliography.
Each item in the Archive can be made
accessible owing to the fact that permis-
PANDORA interface showing the results of a search for Ayers Rock.
32 Xavier Roche, HTTrack Website Copier: Open Source Offline Browser (
33 National Library of Australia, Digital Services Project: Information Paper (1998).