The most pressing digitisation challenges: Volume and scalability
"It is necessary to understand that the nature of things change as the scale grows."
Mark Jones, Victoria and Albert Museum, UK, DigiCULT Interview, August 9-10, 2001
One of the most pressing issues related to digitisation is the volume of European cultural
heritage material. Although the cost related to digitising those treasures has, from a
pragmatic point of view, solved some of the problems, the amount of material is still too
large to pursue a universal approach to digitisation.
At present, the sheer volume of information causes serious scalability problems. As
Edmund Lee, Data Standards Supervisor at the National Monuments Records Center, UK,
put it: "We have the technology to scan an aerial photo, rectify it and place
it on a GIS, but how do you do that with seven million aerial photos?" (DigiCULT ERT,
Stockholm, June 14, 2001)
"The main issue is to reduce the reliance on human labour and intervention exclusively.
It is essential to build automated tools that can augment human processes and procedures.
Workflow management systems that incorporate automated tools and human labour can
address scale issues." (Sayeed Choudhury, Johns Hopkins University, USA; DigiCULT
Delphi, June 1, 2001)
Automated processes and routines could ease the performance problems that arise with
large record sets, as well as problems with version management. This includes the
possibility of attaching metadata right at the point of digitisation. Most current software
tools do not support this function, with the effect that digitised images cannot be searched
Metadata integration at the point of digitisation
The Royal Library, National Library of Sweden regularly digitises images at high resolution
on demand. As the software currently in use does not support the immediate integration of
metadata at the point of digitisation, the images would have to be transferred to a
librarian who is able to enter the metadata, which is a time- and resource-consuming
undertaking. As a particular digitised image cannot be discovered at a later point
because the metadata is missing, the National Library made it a policy to discard the digitised
images instead of archiving them. For the library, digitising the image a second time when
requested is more economical than trying to locate the image in the maze of other digitised
Another requirement for managing the increasingly large amounts of data is the availability
of cheap mass storage as well as access to a broadband infrastructure to transport
large amounts of data.
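
To illustrate what attaching metadata right at the point of digitisation could look like in practice, the following is a minimal Python sketch, not a description of any library's actual system. It assumes that each scan is registered with a small descriptive record in a plain-text index the moment the image file is written, so that the image can be found again later. All names (ScanRecord, register_scan, find_scans, the index file and the metadata fields) are hypothetical and chosen for illustration only.

```python
import json
import tempfile
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class ScanRecord:
    """Descriptive metadata captured together with a scan (hypothetical fields)."""
    image_file: str
    title: str
    creator: str
    keywords: list[str]


def register_scan(index_path: Path, record: ScanRecord) -> None:
    """Append one metadata record to a JSON Lines index at scan time."""
    with index_path.open("a", encoding="utf-8") as index:
        index.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")


def find_scans(index_path: Path, term: str) -> list[str]:
    """Return the image files whose title or keywords contain the search term."""
    term = term.lower()
    matches = []
    with index_path.open(encoding="utf-8") as index:
        for line in index:
            entry = json.loads(line)
            searchable = [entry["title"], *entry["keywords"]]
            if any(term in text.lower() for text in searchable):
                matches.append(entry["image_file"])
    return matches


if __name__ == "__main__":
    # Illustrative data only; a real workflow would call register_scan
    # from the scanning software as each image is saved.
    index_file = Path(tempfile.mkdtemp()) / "index.jsonl"
    register_scan(index_file, ScanRecord(
        image_file="scan_0001.tif",
        title="Map of Stockholm",
        creator="unknown",
        keywords=["map", "Stockholm"],
    ))
    print(find_scans(index_file, "stockholm"))
```

Because the record is written in the same step as the image, no separate hand-over to a cataloguer is needed before the image becomes searchable. A production system would more likely use an established metadata schema and a database rather than a flat file.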