DigiCULT.Info 23
the collective annotations on 4000 images
(8000 pages). This demo is also available in
the reading rooms of the Archives
départementales de la Mayenne
(http://www.cg53.fr/Fr/Archives/) and the
Archives départementales d'Ille-et-Vilaine
(http://www.culture.gouv.fr/culture/nllefce/fr/rep_ress/ad_35700.htm).
REGISTER OF MILITARY FORMS
Automatic Geometric Annotations
We also worked on damaged military
enrolment forms from the nineteenth century.
Various problems make them difficult
to recognise: the size of the cells changes
from year to year, there are a lot of pasted
sheets of paper which hide the form structure,
stamps are stuck on, and ink has bled
through the paper. Therefore we defined
a specific EPF grammar which takes into
account these difficulties. From this specific
(and small) EPF grammar, we automatically
produced a new recognition system to
detect the form structure (see Figure 4). We
successfully tested this recognition system
on 88,725 forms from the Archives
départementales de la Mayenne and the Archives
départementales des Yvelines
(http://www.cg78.fr/archives/): 98.83% (87,692) of
forms showed correct detection of cell
positions, with no error. Each cell produces
an automatic annotation: a geometric annotation
(the polygon of the cell) and a textual
annotation (the name of the cell).
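The pairing of a geometric and a textual annotation per cell can be pictured with a small data structure. This is only an illustrative sketch in Python: the class name `CellAnnotation`, its fields, and the helper `cells_by_name` are assumptions made for the example and are not taken from the IMADOC system.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class CellAnnotation:
    """Hypothetical record for one detected form cell."""
    name: str             # textual annotation: the name of the cell
    polygon: List[Point]  # geometric annotation: the outline of the cell

    def bounding_box(self) -> Tuple[float, float, float, float]:
        """Axis-aligned box (min_x, min_y, max_x, max_y) around the polygon."""
        xs = [x for x, _ in self.polygon]
        ys = [y for _, y in self.polygon]
        return min(xs), min(ys), max(xs), max(ys)


def cells_by_name(cells: List[CellAnnotation]) -> Dict[str, CellAnnotation]:
    """Index the cells of one form by their textual annotation."""
    return {cell.name: cell for cell in cells}


# Example: a slightly skewed cell, as might be found on a scanned page.
last_name = CellAnnotation(
    name="last_name",
    polygon=[(10.0, 20.0), (110.0, 22.0), (108.0, 60.0), (9.0, 58.0)],
)
form = cells_by_name([last_name])
```

Storing the polygon rather than only a rectangle keeps the outline of skewed or warped cells; the bounding box can still be derived when a simple crop or zoom region is needed.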
Automatic Annotations on
Handwritten Text
We worked on the automatic indexing of
last names in those military forms with the
help of cell locations (some examples of
names are given in Figure 5). Dictionaries
were not used as they cannot be exhaustive.
Using the method described earlier, a
user can enter a textual query and the
system selects the closest matches. Of 350
different last names, around 200 are
returned in first position when they are
used as the query, and 80% (280 names) occur
in the first ten results. This provides
automatic document retrieval by handwritten
last name (see Figure 5). Searching for a
name across 5000 images takes less than
one second.
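The retrieval figures above are rank-based: each query produces an ordered list of candidate cells, and the evaluation counts how often the correct name appears in first position or within the first ten. The sketch below shows that bookkeeping in Python. It is not the authors' matching method: the feature vectors, the `squared_distance` measure, and the function names are placeholders assumed for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def squared_distance(a: Vector, b: Vector) -> float:
    """Placeholder dissimilarity between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_cells(query: Vector, cells: List[Tuple[str, Vector]]) -> List[str]:
    """Return cell identifiers ordered from closest to farthest match."""
    ordered = sorted(cells, key=lambda cell: squared_distance(query, cell[1]))
    return [cell_id for cell_id, _ in ordered]


def hit_rate_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct answer is ranked k or better.

    `ranks` holds, for each query, the 1-based position of the correct name.
    """
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Toy data: three candidate cells and one query vector.
cells = [("c1", (0.0, 1.0)), ("c2", (0.9, 0.1)), ("c3", (0.5, 0.5))]
ranking = rank_cells((1.0, 0.0), cells)

# Toy evaluation: positions of the correct name for four queries.
ranks = [1, 3, 12, 1]
top1 = hit_rate_at_k(ranks, 1)
top10 = hit_rate_at_k(ranks, 10)
```

Read this way, the reported results correspond to a first-position rate of roughly 57% (around 200 of 350) and a top-ten rate of 80% (280 of 350).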
Collective Annotations
By changing the configuration file on
the platform, it is possible to specify the
allowed annotations on these military
forms. For example, the cell containing
birth information or the cell containing a
physical description of the person could
be collectively annotated. The user must
select the cell (an automatic annotation) to
zoom in on it and attach textual
annotations to it. All these annotations
could then be used for a future query by
another reader.
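As a rough illustration of how a configuration could restrict which cells readers may annotate, and how stored annotations could later be queried, consider the following Python sketch. The format of the platform's configuration file is not described in this article, so the `ALLOWED_ANNOTATIONS` mapping, the field names, and the `AnnotationStore` class are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical configuration: cell name -> annotation fields readers may add.
ALLOWED_ANNOTATIONS: Dict[str, Set[str]] = {
    "birth_information": {"birth_date", "birth_place"},
    "physical_description": {"height", "hair_colour"},
}


class AnnotationStore:
    """Keeps reader annotations attached to automatically detected cells."""

    def __init__(self, allowed: Dict[str, Set[str]]) -> None:
        self.allowed = allowed
        self._notes: Dict[Tuple[str, str], List[Tuple[str, str]]] = defaultdict(list)

    def annotate(self, image_id: str, cell_name: str, field: str, text: str) -> None:
        """Attach a textual annotation to a cell, if the configuration permits it."""
        if field not in self.allowed.get(cell_name, set()):
            raise ValueError(f"{field!r} is not allowed on cell {cell_name!r}")
        self._notes[(image_id, cell_name)].append((field, text))

    def search(self, field: str, text: str) -> List[Tuple[str, str]]:
        """Return (image_id, cell_name) pairs whose annotation contains `text`."""
        needle = text.casefold()
        return sorted(
            key
            for key, notes in self._notes.items()
            if any(f == field and needle in t.casefold() for f, t in notes)
        )


store = AnnotationStore(ALLOWED_ANNOTATIONS)
store.annotate("img_001", "birth_information", "birth_place", "Laval")
matches = store.search("birth_place", "laval")
```

Changing the mapping is enough to open or close a cell for collective annotation, which mirrors the idea of adjusting the platform through its configuration rather than its code.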
Applications
The 60,000 images (automatically cropped
to remove protected information by using
the geometric annotations produced
automatically) are publicly available on the
Archives départementales de la Mayenne Web
site (http://www.cg53.fr/Fr/Archives/;
follow Archives en ligne then Conscrits de la
Figure 4: Some examples of last names on which automatic
annotations are made.
Figure 5: Access by handwritten names in military forms. The request is
'lefevre' from 503 pages. Answers are presented on the left (four 'lefevre'
are automatically selected). The reader can annotate cells if he wishes.
© IMADOC, 2004