DigiCULT.Info 28
CONCLUSION
The business of building a digital archive, managing it in the long term,
and providing access to it is a complex and costly operation. A digital
archiving system, which manages workflows, stores the digital objects
themselves, as well as descriptive, administrative and preservation metadata,
and provides management reports, is essential to support it. This system must
develop hand in hand with the archive itself, growing to facilitate expanding
business requirements and enhancing productivity at every stage. The National
Library of Australia is about to embark on another development cycle for its
digital archiving system.

The 'deep Web' presents special challenges to national libraries that are
endeavouring to identify and archive a nation's output in this form, whether
they are taking the selective or comprehensive approach (both of which rely on
harvesting robots). The National Library of Australia has embarked on a
research project to find ways of taking in, managing and providing access to
publications and Websites structured as databases. It is also participating in
the Deep Web Archiving Group of the recently formed International Internet
Preservation Consortium[35] and will collaborate with other members in
developing solutions and sharing results.
and databases from the 'deep Web' obtained as a result of the deep Web
archiving initiative described below;
· the need for a collection manager interface to the preservation metadata,
  and mechanisms to support preservation processes; and
· the need for better and more flexible reports from the system.
ARCHIVING THE 'DEEP WEB'
To date, titles archived in PANDORA have mostly been gathered using a robot
that follows HTML links from the root URL submitted as the starting point for
gathering.

A very large part of the Web consists of sites that are not accessible via
HTML links but are stored in structured databases, which must be queried
before any information is presented. These queries are typically made by
selecting options from drop-down boxes or by entering terms into a search
engine. This requires intelligence that gathering robots and search engine
indexers do not have at this stage. The hidden nature of data stored and
accessed in this way has led to its being known as the 'deep Web'.
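The difference between link-following and query-driven access can be illustrated with a small sketch. It is not the PANDORA gathering robot, whose implementation is not described here; it is a minimal breadth-first gatherer, written under the assumption that pages are fetched through a hypothetical `fetch` callable and that the three-page site in `SITE` is an invented stand-in for a real Web server. The robot collects every page reachable through `<a href>` links from the root URL, but it never reaches the record that is served only in response to a form query.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href targets of <a> tags; forms and their controls are ignored."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def gather(root_url, fetch):
    """Breadth-first gathering from root_url, following html links only.

    `fetch` is a hypothetical callable that maps a URL to its HTML text,
    or returns None when nothing is served at that URL.
    """
    seen = {root_url}
    queue = deque([root_url])
    gathered = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        gathered.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return gathered


# Invented in-memory site (an assumption for illustration only).
SITE = {
    "http://example.org/": (
        '<a href="about.html">About</a>'
        '<form action="search"><select name="year">'
        "<option>2003</option></select></form>"
    ),
    "http://example.org/about.html": '<a href="/">Home</a>',
    # Served only in response to a form query; no html link points here.
    "http://example.org/search?year=2003": "<p>Record 1</p>",
}

gathered = gather("http://example.org/", SITE.get)
print(gathered)
```

The search result page exists on the server, but because the only route to it is the drop-down form, a robot of this kind has no way of discovering its URL.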
PANDAS EVALUATION SYSTEM
Because libraries and other collecting institutions around the world are beginning to
take on the responsibilities of collecting digital publications, and because as yet there
are few digital archiving systems available to assist them, the National Library of
Australia has received a number of enquiries about the use of the PANDAS
software. The Library has decided to make the software available on a cost-recovery
basis, although it will be limited in the number of agencies it can accommodate. To
enable interested libraries to assess whether PANDAS does in fact meet their
business needs, the PANDAS Evaluation System has been developed and has recently
been made available to the first evaluating library. A maximum of three agencies at
any one time can evaluate the software.
35 The International Internet Preservation Consortium was established in July 2003 under the leadership of the
Bibliothèque Nationale de France (http://www.bnf.fr/).
THE FIRENZE AGENDA (17 OCTOBER 2003)
In response to the challenges of preserving digital memory, a group of experts
have proposed an agenda with focussed objectives addressing creation,
preservation and access issues for both digitised and born-digital objects.
The Italian Presidency, the European Commission, ERPANET, and the MINERVA
project are the promoters of this initiative in the philosophy of eEurope and
linked to the National Representatives Group
(http://www.cordis.lu/ist/ka3/digicult/nrg.htm).

The agenda covers a short period (12-18 months) identifying concrete and
realistic actions which respect the interests of museum, libraries and
archives, and the differences between media formats. It is an open process
integrating ongoing actions and the voluntary efforts of participants. The
experts have identified some initial responsibilities for each of the actions,
and progress will be reviewed in one year. The Firenze agenda has been
submitted to the National Representatives Group (NRG) for endorsement and to
encourage each Member State to support the initiative.

Action Area 1 of the agenda addresses problems and risks. Probably the most
important task today is to create awareness about risks and problems among
decision-makers at all levels. Area 2 considers ongoing initiatives and
currently available technologies, whereas Action Area 3 examines the legal and
regulatory
Reprinted without amendment by DigiCULT as the Agenda itself requested.