data; automatic metadata generation; a redefinition of information retrieval through types of information other than descriptive ones; applications that allow users to browse different granularity levels of information; and content-based analysis.
However, Foulonneau identified current RTD limitations, which included: `few adaptations of full-text information retrieval tools to structured content; the same for content-based image retrieval'; `few attempts to promote multilingual access to cultural content' and, last but not least, a `lack of communication with other sectors such as education'. Foulonneau saw the clear `necessity to adapt technologies to cultural heritage applications', and expected that around 2009 novel tools could become available, partly through adaptation from other application areas, and that 2011 could see the development of a new generation of metadata-based tools.
With a view to the automated processing of massive, distributed biological and ecological data, Renata Arovelius, Head of Archives at the Swedish University of Living Natural Resources, stated that an adequate metadata model, as well as a proper technical solution, was missing. She thought that in 2005-2008 more `international cooperation: scientists/ICT/archive/library joint projects' would be necessary to achieve a `joint strategy/generic model and standards' beyond 2008.
Jacques Bogaarts (Nationaal Archief, The Netherlands) saw a major RTD requirement for `tools that automatically or semi-automatically (expert support) extract data from archival records (machine- or hand-written, maps, etc.). There should also be tools to convert these data to formats that are recognisable in a modern context, for instance to project old maps onto current ones, or to resolve complicated multi-layer indexes into direct access (archivists will know what I mean).' However, Bogaarts expected that the necessary software might become available only in 2020.
Martin Doerr (ICS-FORTH, Greece) stated: `The biggest obstacle is the separation of scientists and technology for automated and non-automated methods. Exactly as artificial intelligence has failed and Information Retrieval has stagnated for years, a purely automated approach is doomed to fail. The quality of purely automated methods is unacceptable to scholars. Similarly, multi-modal methods are superior to individual algorithms. There is virtually no research in quality assessment of the individual results of automated methods: Can an algorithm separate out which item is treated correctly by an automated method and which may not be, so that the investment in manual intervention can be directed to the "difficult cases", saving money without compromising quality?'
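Doerr's question, whether an algorithm can tell which of its own results are reliable, amounts to a call for per-item quality assessment, with low-confidence items routed to manual review. The Python sketch below illustrates only that triage idea; the names (Item, auto_extract, CONFIDENCE_THRESHOLD) and the toy confidence heuristic are hypothetical and do not come from any project mentioned here.

# Illustrative triage of automated results by per-item confidence.
# All names (Item, auto_extract, CONFIDENCE_THRESHOLD) are hypothetical.
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.9  # assumed trade-off between cost and quality

@dataclass
class Item:
    source: str        # the raw record handled by the automated method
    value: str         # the result the method produced
    confidence: float  # the method's self-assessed quality of that result

def auto_extract(raw: str) -> Item:
    """Stand-in for an automated method that also reports its certainty."""
    # Toy heuristic: short alphanumeric strings count as 'easy' cases.
    confidence = 0.95 if raw.isalnum() and len(raw) < 20 else 0.5
    return Item(source=raw, value=raw.strip().title(), confidence=confidence)

def triage(raw_records):
    """Accept confident results; send the 'difficult cases' to manual review."""
    accepted, needs_review = [], []
    for raw in raw_records:
        item = auto_extract(raw)
        (accepted if item.confidence >= CONFIDENCE_THRESHOLD else needs_review).append(item)
    return accepted, needs_review

accepted, needs_review = triage(["Rembrandt1642", "hand-written note, partly illegible"])
print(len(accepted), "accepted automatically;", len(needs_review), "sent to manual review")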
Doerr considered that the major RTD step to be taken was the development of `technologies that allow for a graceful interaction of manual and automated procedures. Automated procedures are needed to initialize material for manual processing, to refine manual processing without destroying good human decisions, and to learn from human decisions. Only multimodal techniques will be successful. Integration of automated learning, statistical methods and user behaviour evaluation.' If strategically addressed, major advances could be achieved in 2008-2010.
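The `graceful interaction' Doerr describes is, in essence, a propose-refine-learn cycle in which automated output never overrides a human decision and confirmed decisions feed the next automated pass. The following minimal sketch of such a loop uses function names, a dictionary-based `model' and sample records that are assumptions made purely for illustration.

# Illustrative human-in-the-loop cycle: automated initialisation, manual
# refinement that always wins, and learning from the human decisions.
# Every name and record below is a hypothetical placeholder.

def propose(record, model):
    """Automated step: initialise material for manual processing."""
    return model.get(record, "unknown")

def refine(record, proposal, human_decisions):
    """Manual step: an existing human decision overrides the proposal."""
    return human_decisions.get(record, proposal)

def learn(model, human_decisions):
    """Feed confirmed human decisions back into the automated model."""
    model.update(human_decisions)
    return model

model = {"map_1750": "city plan"}                # what the system already knows
human_decisions = {"deed_1802": "notarial act"}  # corrections made by archivists

for record in ["map_1750", "deed_1802", "photo_1930"]:
    print(record, "->", refine(record, propose(record, model), human_decisions))

model = learn(model, human_decisions)  # the next pass starts from the improved model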
Kazimierz Schmidt (Adviser, State Archives, Poland) provided DigiCULT with an extensive description of requirements for the consistent creation, management and integration of distributed heritage resources, of which we can mention only a few points. Schmidt envisaged that over the next 10-15 years `precise finding in large distributed resources, existing in the most varied forms and aggregated in the most varied institutions, will be possible if we agree on common, clear standards of classification and description'. He added that `it is not enough to accept a standard markup language (like the XML family) which lets us describe any structure (i.e. EAD); we need to agree on the structure of the metadata'.
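Schmidt's point can be illustrated with two records that are both well-formed XML yet structure the same information differently: the markup parses either way, but without an agreed element structure a single query cannot serve both. In the Python sketch below the element names and the mapping table are invented for illustration and are not taken from EAD, Dublin Core or any real schema.

# Two well-formed XML records describing the same kind of object with
# different, locally invented element names. Both parse without error;
# only an agreed common structure makes joint retrieval possible.
import xml.etree.ElementTree as ET

record_a = ET.fromstring("<item><maker>Atelier Nadar</maker><made>1886</made></item>")
record_b = ET.fromstring("<photo><creator>Atelier Nadar</creator><date>1886</date></photo>")

# Hypothetical agreed structure: one common field name per set of local names.
AGREED_FIELDS = {"creator": {"maker", "creator"}, "date": {"made", "date"}}

def normalise(record):
    """Map locally named elements onto the agreed metadata structure."""
    common = {}
    for field, local_names in AGREED_FIELDS.items():
        for child in record:
            if child.tag in local_names:
                common[field] = child.text
    return common

print(normalise(record_a))  # {'creator': 'Atelier Nadar', 'date': '1886'}
print(normalise(record_b))  # the same agreed structure for both records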
Schmidt pointed to the collaborative success of Dublin Core, `but, as with every solution at such a general level, this is not satisfactory information (in detail) for any institution professionally collecting records'. Detailed descriptive models, in his view, should be defined according to types of documents and used across the different domains of heritage organisations (e.g. for photographs, as prepared by the cross-domain SEPIA project⁷⁹). A further important requirement would be to agree on how to construct stable, unique ID numbers, and on the process of linking such a unique ID to subsequent versions of the same document; for example, using an `original' ID as part of the identifiers of subsequent versions of a document, which would allow future users to find not only individual documents but also the `family of its versions'.
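Schmidt's idea of carrying the `original' ID into every later version can be sketched in a few lines; the identifier format (original ID plus a /vN suffix) and the sample catalogue below are assumptions for illustration, not a scheme proposed in the interview.

# Illustrative version identifiers that embed the original ID, so the
# whole 'family of versions' of a document remains findable.
# The ORIGINAL/vN format and the sample IDs are assumptions only.

def version_id(original_id: str, version: int) -> str:
    """Derive a stable identifier for a later version of the same document."""
    return f"{original_id}/v{version}"

def family_of(identifier: str) -> str:
    """Recover the original ID shared by every version in the family."""
    return identifier.split("/v")[0]

catalogue = [
    "DOC-0001",                   # the original record
    version_id("DOC-0001", 2),    # a corrected transcription
    version_id("DOC-0001", 3),    # a re-digitised image
    "DOC-0002",                   # an unrelated document
]

# Finding any single version is enough to retrieve the whole family.
wanted = family_of("DOC-0001/v3")
print([doc_id for doc_id in catalogue if family_of(doc_id) == wanted])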
Schmidt also addressed the need to change some attitudes towards electronic resources. This included the comment that `digital records are very often treated as documentation separate from the traditional record files, photographic, audio-visual, sound material, etc.; and colleges (universities) teach "the electronic documentation" as a separate subject'. Furthermore, Schmidt urged that not every institution should build `their own' digital collection; rather, data centres should be established to which smaller local institutions could also be enabled to connect.
79 SEPIA: Safeguarding European Photographic Images for Access, http://www.knaw.nl/ecpa/sepia/; see also the papers from the SEPIA conference "Changing Images: the role of photographic collections in a digital age" (18-20 September 2003), which form a rich resource on issues in (digital) photographic heritage, http://www.knaw.nl/ecpa/sepia/conference.html