28 DigiCULT
A Case Study of a Faceted Approach to Knowledge Organisation and Retrieval in the Cultural Heritage Sector
By Douglas Tudhope and Ceri Binding
Introduction: Knowledge Organisation Systems in Digital Heritage
The trend within museums and digital heritage institutions to unlock the information in their collections involves opening up databases, previously the domain of the IT department, to a new range of users. These might, for instance, be members of the public searching a museum Website for information relating to an object which has been in the family for generations, or they might be curators looking to create a virtual exhibit[1] from the objects in the collections database. There is a need for tools to help formulate and refine searches and navigate through the information space of concepts that have been used to index the collection. When technical terms are involved, a 'controlled vocabulary' is generally used to index the collection: if both searchers and indexers draw on the same standard set of words, then the synonym mismatch problems common with Web search engines can be avoided. Controlled vocabularies provide a means to standardise the terms used to describe objects, by limiting the indexing vocabulary to a subset of natural language.
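As a minimal sketch of how a shared vocabulary avoids synonym mismatch, the Python below maps every entry term to a single preferred term and applies the same mapping at indexing time and at search time. The vocabulary, the record identifiers and the function names are hypothetical illustrations and are not taken from any real thesaurus or collection system.

```python
# Hypothetical controlled vocabulary: each entry term maps to one preferred term.
PREFERRED_TERM = {
    "locomotive": "locomotive",
    "railway engine": "locomotive",
    "loco": "locomotive",
    "narrowboat": "narrowboat",
    "canal boat": "narrowboat",
}


def normalise(term):
    """Return the preferred term for a word or phrase, or None if it is not in the vocabulary."""
    return PREFERRED_TERM.get(term.strip().lower())


def build_index(records):
    """Map each preferred term to the sorted ids of the records indexed with it."""
    index = {}
    for record_id, terms in records.items():
        for term in terms:
            preferred = normalise(term)
            if preferred is not None:
                index.setdefault(preferred, set()).add(record_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Normalise the query with the same vocabulary, then look it up in the index."""
    return index.get(normalise(query), [])


# Illustrative records, catalogued by different indexers using different words.
records = {
    "obj-001": ["Railway engine"],
    "obj-002": ["locomotive"],
    "obj-003": ["canal boat"],
}
index = build_index(records)
print(search(index, "loco"))
```

Because indexers and searchers pass through the same `normalise` step, a query for "loco" retrieves records catalogued as "Railway engine" and "locomotive" alike, while a term outside the vocabulary returns nothing rather than a partial match.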
These controlled vocabularies have long been part of standard cataloguing practice in libraries and museums and are now being applied to electronic repositories via thematic keywords in resource descriptors. Metadata sets for the Web, such as Dublin Core, typically include the more complex notion of the Subject of a resource in addition to elements for Title, Creator, Date, etc. However, controlled vocabularies can do more than simply supply a list of authorised terms. They play a significant role, particularly when used to provide a mediating interface between indexed collections and users who may be unfamiliar with native terminology. This applies to both existing collection databases and new collections of records, which may be 'born digital' but can be categorised and indexed using the same structures and techniques.
Knowledge is structured and organised so that a user can explore a network of related concepts to find the most appropriate one for a given situation. The different types of Knowledge Organisation System (KOS) include classifications, gazetteers, lexical databases, ontologies, taxonomies and thesauri.[2] There is a vast existing legacy of these intellectual knowledge structures (and indexed collections) to be found within cultural heritage institutions. A library might use the Dewey Decimal Classification (DDC), for example, while a museum might use the Art and Architecture Thesaurus (AAT). Other large, widely used KOS include AGROVOC,[3] CABI,[4] Library of Congress Subject Headings, MeSH,[5] and many others.[6] On the other hand, a large number of smaller KOS have also been designed to meet the needs of specialist applications or subject areas. In the UK, the mda (Museum Documentation Association)[7] has facilitated the development of several specialised thesauri, such as the Archaeological Objects Thesaurus, the Railways Object Names Thesaurus and the Waterways Object Names Thesaurus.
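The idea of exploring a network of related concepts can be sketched in Python as follows. This is an illustration only: the thesaurus fragment, its terms and its narrower and related links are invented for the example and do not reproduce any of the KOS named above. It assumes a thesaurus stored as narrower-term and related-term links, with broader terms derived from the narrower links.

```python
from collections import deque

# Hypothetical thesaurus fragment; the terms and links are illustrative only.
THESAURUS = {
    "vehicles": {"narrower": ["rail vehicles", "boats"], "related": []},
    "rail vehicles": {"narrower": ["locomotives", "wagons"], "related": ["railway track"]},
    "locomotives": {"narrower": [], "related": []},
    "wagons": {"narrower": [], "related": []},
    "boats": {"narrower": ["narrowboats"], "related": []},
    "narrowboats": {"narrower": [], "related": []},
    "railway track": {"narrower": [], "related": ["rail vehicles"]},
}


def broader_terms(term):
    """Derive broader terms by inverting the narrower-term links."""
    return sorted(t for t, entry in THESAURUS.items() if term in entry["narrower"])


def neighbours(term):
    """All concepts one link away: narrower, related and broader terms."""
    entry = THESAURUS[term]
    return sorted(set(entry["narrower"]) | set(entry["related"]) | set(broader_terms(term)))


def explore(start, max_steps):
    """Breadth-first walk returning each concept within max_steps links and its distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distances[current] == max_steps:
            continue  # do not expand beyond the requested radius
        for nxt in neighbours(current):
            if nxt not in distances:
                distances[nxt] = distances[current] + 1
                queue.append(nxt)
    return distances


print(explore("locomotives", 2))
```

Starting from "locomotives" with a radius of two links, the walk reaches the broader term "rail vehicles" and, through it, sibling and related concepts, which is one simple way a search interface could suggest alternative terms to a user.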
Networking KOS Services
The rich legacy of KOS makes it possible to offer search options that go beyond the current generation of Web search engines' minimal assumptions on user behaviour. However, this will require new thinking on the services that KOS can offer to the digital environment. Traditionally, attention has focused on methods for constructing KOS, with a view to their being used as reference material in print form. New possibilities have emerged
[1] e.g. the Science Museum's Exhiblets: http://www.sciencemuseum.org.uk/collections/exhiblets/index.asp
[2] Gail Hodge gives a useful summary: http://nkos.slis.kent.edu/KOS_taxonomy.htm
[3] Food and Agriculture Organization of the United Nations: AGROVOC Multilingual Thesaurus (Arabic, Chinese, English, Français, Español, Português), http://www.fao.org/agrovoc/
[4] CAB International: CAB Thesaurus, a controlled vocabulary resource [over 48,000 descriptive terms] for the applied life sciences, http://www.cabi-publishing.org/DatabaseSearchTools.asp?PID=277
[5] National Library of Medicine: Medical Subject Headings (MeSH), http://www.nlm.nih.gov/mesh/
[6] For indexes of KOS on the Web, see http://www.lub.lu.se/metadata/subject-help.html and http://www.w3.org/2001/sw/Europe/reports/thes/thes_links.html
[7] mda (Museum Documentation Association), http://www.mda.org.uk/index_rs.htm