HLT. Some search engines can use a level of linguistic intelligence to retrieve results that do not exactly match the search parameters but include extensions of words or even synonyms. Computer translation is one of the most used (and most difficult) applications of natural language processing. Language 'pairs' are used to match vocabulary in two different languages; however, it is clear that understanding word context and grammatical idiosyncrasies is of the utmost importance in this area. One example of a cross-lingual application is a search engine that retrieves and translates pages that are not in the language of the search parameters. Cross-lingualism can be applied to speech technology as well as NLP, and there are now several prototype cross-lingual speech interfaces which allow speakers of different languages to understand one another.

Speech recognition is already used in telephone booking and transaction systems and works effectively in low-vocabulary contexts. Speech synthesis has been used for a number of years to ease communication to and from people with sensory impairments (e.g. voice synthesis for people who have difficulty speaking, text being 'spoken' by the computer for the partially or un-sighted) and is now being used in applications such as listening to e-mail.
Speaker recognition or verification has multiple applications, from assuring a person's identity in telephone banking systems or cash machines to increased security on locked workstations and PDAs.
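At its core, verification is a binary comparison between a voice sample and an enrolled template. A minimal sketch, assuming voices have already been reduced to numeric feature vectors (the "embeddings" and threshold below are invented stand-ins for what a real system would extract and tune):

```python
import math

# Speaker verification as a yes/no decision: compare a voice sample's
# feature vector against the template stored at enrolment, and accept
# only if the similarity clears a threshold. The vectors and threshold
# here are illustrative, not real acoustic features.

def cosine_similarity(a, b):
    """Similarity between two feature vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def verify(enrolled_template, sample, threshold=0.85):
    """Return True if the sample is close enough to the enrolled voice."""
    return cosine_similarity(enrolled_template, sample) >= threshold

enrolled = [0.9, 0.1, 0.3]        # template stored at enrolment
genuine  = [0.88, 0.12, 0.28]     # same speaker, slight variation
impostor = [0.1, 0.9, 0.2]        # different speaker

print(verify(enrolled, genuine))   # True
print(verify(enrolled, impostor))  # False
```

The threshold trades off false acceptances against false rejections; a banking application would set it more strictly than a convenience feature.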
As some of these examples demonstrate, the result of improvement in HLT is that the interface between human and computer will begin to blur. Instead of having to learn a programming language to tell a computer exactly what you want it to do, the computer may understand your language. Machines will become more transparent, allowing humans to interact with them in a more natural way.
For example, a travel booking system which uses NLP would probably only have to deal with a small variety of requests ("Give me information", "Book", "Change", "Cancel") and be able to recognise destinations. A system similar to this one is being developed by IBM (details can be viewed at: http://www.research. ) and can process travel bookings to 9000 destinations given verbal commands in either English or French.
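The small-vocabulary front end of such a system can be sketched very simply: match the utterance against cue phrases for each request type, then scan for a known destination. The cue phrases and city list below are invented for illustration and bear no relation to IBM's actual grammar:

```python
# Toy small-vocabulary interpreter for a travel booking dialogue.
# Intents and destinations are illustrative placeholders only.

INTENTS = {
    "information": ("information", "tell me", "give me"),
    "book": ("book", "reserve"),
    "change": ("change", "move"),
    "cancel": ("cancel",),
}
DESTINATIONS = {"paris", "london", "madrid"}  # toy subset of the 9000

def interpret(utterance):
    """Return the recognised intent and destination (None if absent)."""
    text = utterance.lower()
    intent = next((name for name, cues in INTENTS.items()
                   if any(cue in text for cue in cues)), None)
    destination = next((d for d in DESTINATIONS if d in text), None)
    return {"intent": intent, "destination": destination}

print(interpret("Please book a flight to Paris"))
# {'intent': 'book', 'destination': 'paris'}
```

Because the set of possible meanings is so small, even this crude matching achieves a usable success rate — which is exactly why limited domains were the first practical deployments.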
HLT requires certain components of language in order to function. A lexicon or corpus provides a large amount of raw linguistic data, a vocabulary; then grammatical rules or statistical analyses are applied to determine the likelihood of a variety of meanings of the piece of language. To build an accurate language model, computers use this huge amount of data both to recognise individual sounds and to predict the most likely version of the word being used. For example, based on a statistical analysis of samples, a computer will know that "There is a ..." occurs much more commonly than "Their is a ..." or "They're is a ..." and can use this information to identify which spelling is most likely to be correct. (In fact, as the previous sentence was typed, the word processing package automatically corrected the two 'wrong' sentence segments.)
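The frequency-based choice described above can be sketched in a few lines. The counts below are invented, not real corpus statistics, but the principle is the same: prefer the candidate phrase seen most often in training data.

```python
# Pick the most likely spelling by comparing how often each candidate
# phrase appears in a corpus. The counts here are illustrative only.

corpus_counts = {
    "there is a": 120_000,
    "their is a": 40,
    "they're is a": 15,
}

def most_likely(candidates):
    """Return the candidate with the highest corpus frequency."""
    return max(candidates, key=lambda phrase: corpus_counts.get(phrase, 0))

print(most_likely(["there is a", "their is a", "they're is a"]))
# there is a
```

Real language models smooth these counts and condition on longer histories, but the underlying decision — rank alternatives by observed frequency — is the one sketched here.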
One current method of natural language parsing is to produce a language model comprising each original 'natural' sentence linked to its corresponding meaning and parsing. This allows computers to identify correlations between more complex linguistic structures (word orders, irregular grammatical constructions) and intended meanings.
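A toy version of such a model is simply a table of (sentence template, meaning) training pairs that new input is matched against. Real systems learn statistical correlations over huge corpora; everything below — the templates, meaning labels, and slot notation — is invented for illustration:

```python
# Minimal "language model" built from training pairs that link a
# natural sentence template to its meaning. {city} marks a slot whose
# word is captured rather than matched literally.

training_pairs = [
    ("book a flight to {city}", "BOOK_FLIGHT"),
    ("what time is it", "ASK_TIME"),
]

def parse(sentence):
    """Map a sentence to (meaning, slot values), or UNKNOWN."""
    words = sentence.lower().split()
    for template, meaning in training_pairs:
        t_words = template.split()
        if len(t_words) != len(words):
            continue
        slots = {}
        matched = True
        for tw, w in zip(t_words, words):
            if tw.startswith("{"):        # slot captures the actual word
                slots[tw.strip("{}")] = w
            elif tw != w:
                matched = False
                break
        if matched:
            return meaning, slots
    return "UNKNOWN", {}

print(parse("Book a flight to Madrid"))
# ('BOOK_FLIGHT', {'city': 'madrid'})
```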
Ideally, HLT will eventually allow anyone to use a computer simply by talking and listening to it. There is huge potential for the applications made possible by reliable text-independent speaker recognition, where users could even speak in a different language and expect to be recognised.
Speaker identification: this technology overlaps with speaker verification, but instead of a binary state (the voice either is or isn't the person the computer is comparing it with), speaker identification can isolate from a group of enrolled speakers which one is currently speaking.
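Where verification asks "is this the claimed speaker?", identification asks "which enrolled speaker is this?" — a closest-match search rather than a threshold test. A sketch, again assuming voices are already reduced to illustrative feature vectors:

```python
import math

# Speaker identification: pick the enrolled speaker whose stored voice
# template is most similar to the incoming sample. The names and
# feature vectors below are invented stand-ins for real voice data.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

enrolled = {
    "alice": [0.9, 0.1, 0.2],
    "bob":   [0.1, 0.8, 0.3],
    "carol": [0.2, 0.2, 0.9],
}

def identify(sample):
    """Return the enrolled speaker closest to the sample."""
    return max(enrolled,
               key=lambda name: cosine_similarity(enrolled[name], sample))

print(identify([0.15, 0.75, 0.35]))  # bob
```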
Speaker classification: this is the ability to perform analyses of the voices of a group of unknown speakers and to perform generic tasks based on their voices. Tasks could include identifying similar-sounding speakers (e.g. regional accents), highlighting all speech segments made by the same person, or detecting when the speaker changes.
Speech processing also includes speech output and its many applications. Obviously, there are no issues of background noise when a computer converts digitally stored text to speech; however, pronunciation of heteronyms (words that are spelt the same but can have different pronunciations and meanings, such as bow or tear) becomes an issue.
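A synthesiser must therefore disambiguate a heteronym (typically by part of speech) before looking up its phonemes. A minimal sketch, using an informal pronunciation notation rather than a real phoneme set:

```python
# Heteronym handling in text-to-speech: the pronunciation depends on
# the word's part of speech, so the lookup key is (word, POS). The
# pronunciations below use an informal notation for illustration.

HETERONYMS = {
    ("bow", "noun"):  "BOH (a ribbon)",
    ("bow", "verb"):  "BOW (to bend forward)",
    ("tear", "noun"): "TEER (a drop from the eye)",
    ("tear", "verb"): "TAIR (to rip)",
}

def pronounce(word, part_of_speech):
    """Look up the pronunciation, falling back to the spelling itself."""
    return HETERONYMS.get((word.lower(), part_of_speech), word)

print(pronounce("tear", "verb"))  # TAIR (to rip)
print(pronounce("tear", "noun"))  # TEER (a drop from the eye)
```

The hard part in practice is not the lookup but deciding the part of speech from context, which is itself a natural language processing task.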
This technology will allow a computer to understand the meaning inherent in language, that is, not only to be able to convert spoken words into text but to convert the linguistic meaning of the sentence into a form it can understand and on which it can perform actions. This transformation of natural language into formal language is known as parsing and is one of the most technically challenging aspects of HLT.
Like speech recognition, the more limited the number of possible meanings, the easier it is to achieve a good success rate with a natural language processing system.