DigiCULT 21
By Joost van Kasteren
`Most information retrieval systems are based on
poorly conceived notions of what users need to know
and how they should retrieve it. They do not reflect
real user needs and search behaviour.' According to
Pia Borlund we should adapt retrieval systems to the
user and not the other way around. This goes for sci-
entific libraries, which is the field she is working in,
and also for the cultural heritage sector with its col-
lections of objects and documents.
Borlund is an associate professor at the Royal
School of Library and Information Science in Den-
mark. She developed an alternative approach to the
evaluation of interactive information retrieval (IIR)
systems. It is an alternative to the still-dominant
evaluation approach, developed in the 1960s by Cyril
W. Cleverdon at the Cranfield Institute of Technolo-
gy, which focused on recall (= the fraction of relevant
documents retrieved) and precision (= the fraction of
retrieved documents that are relevant).
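The two Cranfield measures follow directly from those definitions. A minimal sketch (the identifiers are illustrative, not from the original systems):

```python
def recall_precision(retrieved, relevant):
    """Compute the two classic Cranfield measures.

    recall    = |retrieved & relevant| / |relevant|
    precision = |retrieved & relevant| / |retrieved|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# Example: 10 relevant documents exist; the system returns 8,
# of which 6 are actually relevant.
relevant = {f"d{i}" for i in range(10)}
retrieved = {"d0", "d1", "d2", "d3", "d4", "d5", "x1", "x2"}
r, p = recall_precision(retrieved, relevant)
# r == 0.6 (6 of 10 relevant found), p == 0.75 (6 of 8 returned are relevant)
```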
Borlund: `The Cranfield model treats information
need as static, i.e. entirely reflected by user request
and search statement, while in real life information
needs can be dynamic. If you start with a topic that is
new for you, you need to know what is going on, and
you need to develop the right jargon. While search-
ing, your information need "matures", so to speak; it
is dynamic in nature. It may eventually become static:
many researchers check every week which new articles
have appeared as preprints on the Web.'
Secondly, the Cranfield model promotes a rather
limited concept of relevance in the sense that it uses
only topical relevance. Borlund: `In my view relevance
is a multidimensional concept which is also dynam-
ic in nature. What is relevant changes over time.
So relevance is again about understanding user
needs and user behaviour. As a librarian you can never
know what is relevant to a user. You might know the
topic but not why the user is looking for informa-
tion on that topic, nor from what angle.'
Borlund developed a new method for evaluating
information retrieval systems taking into account the
dynamic nature of user needs and the multidimen-
sional and dynamic nature of relevance. It combines
a system-oriented approach like the Cranfield mod-
el with the user approach, which measures users' sat-
isfaction regardless of the actual outcome of the search.
The method involves 25 to 30 test persons who are
told a short `cover story' to trigger a simulated infor-
mation need. The cover story allows free user inter-
pretation of the situation and hence heterogeneous
information needs, to mirror a real world situation.
The cover story also functions as a platform for sit-
uation-dependent relevance. With the story in mind
people are asked to use the retrieval system to ful-
fil their simulated information need. Performance of
the system is then measured by assessing how well the
information needs of the test persons have been ful-
filled, both in objective terms (did they find all the
relevant information?) and in subjective terms (were
they satisfied with the result?). Relative Relevance and
Ranked Half Life are used as alternative performance
measures (alternative with respect to the aforemen-
tioned recall and precision measures) that can handle
non-binary degrees of relevance.
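One common reading of Ranked Half Life is the rank position by which half of the total graded relevance in the result list has been accumulated, so lower values mean relevant material sits nearer the top. The sketch below follows that reading with non-binary relevance scores; the published measure may differ in detail (e.g. interpolating within ranks), so treat this as illustrative:

```python
def ranked_half_life(scores):
    """Rank position at which half of the total graded relevance
    in a ranked result list has been accumulated.

    `scores` lists non-binary relevance judgements in rank order,
    e.g. 0 = not relevant, 1 = partially, 2 = highly relevant.
    """
    total = sum(scores)
    if total == 0:
        return None  # no relevant material in the list at all
    half, acc = total / 2, 0.0
    for rank, score in enumerate(scores, start=1):
        acc += score
        if acc >= half:
            return rank

# A run that front-loads relevance gets a lower (better) RHL
# than one that buries the relevant documents at the bottom.
front = ranked_half_life([2, 2, 1, 0, 0, 0])  # -> 2
back = ranked_half_life([0, 0, 0, 1, 2, 2])   # -> 5
```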
The evaluation method was used in the TAPIR (Text
Access Potentials for Interactive Information Retriev-
al) research project. The project investigated the use of
several cognitive representations of a document. One
representation might be the author's own perception
of the work expressed in title and full text. Another
might be derived from the indexing by a descriptor
or from citations given to the work by other authors.
The assumption is that the more cognitively differ-
ent the representations are that point towards a certain
document, the higher the probability that the docu-
ment is relevant to a given set of criteria. `These poly-
representation algorithms can be very useful in many
domains, including the cultural heritage sector.'
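The polyrepresentation idea described above can be sketched as ranking documents by how many cognitively different representations independently retrieve them. This is an illustrative simplification, not TAPIR's actual algorithm, and the variable names are assumptions:

```python
from collections import Counter

def polyrepresentation_rank(result_sets):
    """Rank documents by how many cognitively different
    representations (title/full text, descriptors, citations, ...)
    independently retrieved them: the larger the overlap, the
    higher the assumed probability of relevance.
    """
    counts = Counter()
    for docs in result_sets:  # one result set per representation
        counts.update(set(docs))
    return [doc for doc, _ in counts.most_common()]

# Illustrative query against three representations of a collection.
by_fulltext = {"d1", "d2", "d3"}
by_indexing = {"d2", "d3", "d4"}
by_citations = {"d3", "d5"}
ranking = polyrepresentation_rank([by_fulltext, by_indexing, by_citations])
# "d3" comes first: all three representations point to it.
```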
Borlund believes that an important guideline in
designing, developing and testing information retriev-
al systems is to start with the information need of
the user. That applies not only to unlocking scientific
literature, but to other domains where collections are
to be made accessible. `You need to change perspective',
she says. `Retrieval is not about cataloguing objects
or information, but about information needs.'