3. Layer 3. Wrappers: Every Wrapper is associated with a particular database. Its tasks are to
query that database and retrieve the answers. For these purposes, this layer uses the Mapping Tree of
the database. Although all the Wrappers perform similar tasks, each must be adapted to
its specific associated database.
4. Layer 4. Document Database: The databases that can be integrated in our system are pre-existing
and independent of it. Moreover, in our case, two of the three databases are
currently available through the Internet with a particular user interface that allows them
to be used as Digital Libraries. Federating these Digital Libraries in our system
does not prevent them from answering other queries coming from their own interfaces.
Likewise, we must note that if a database has text retrieval capabilities, our system is
capable of exploiting them, but any required preprocessing, such as indexing, must already have been
performed, and the text retrieval techniques must already be implemented. That is to
say, managing the databases is not a task of our system.
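As an illustration of the Wrapper layer described above, the following minimal Python sketch shows a common interface that each Wrapper adapts to its own database. It is not the implementation used in the system: the class names, the `query` method, the dictionary standing in for a Mapping Tree, and the SQLite table with its sample row are all assumptions made for the example.

```python
import sqlite3
from abc import ABC, abstractmethod


class Wrapper(ABC):
    """Hypothetical common interface; each subclass adapts it to one database."""

    def __init__(self, mapping):
        # mapping: concept name -> access expression for this database.
        # A plain dict is used here as a stand-in for the Mapping Tree.
        self.mapping = mapping

    @abstractmethod
    def query(self, concept, value):
        """Return the records whose `concept` equals `value`."""


class SQLiteWrapper(Wrapper):
    """Wrapper for a relational database reached through sqlite3."""

    def __init__(self, connection, mapping):
        super().__init__(mapping)
        self.connection = connection

    def query(self, concept, value):
        # The table and column names come from the mapping, never from the user;
        # the user-supplied value is passed as a bound parameter.
        table, column = self.mapping[concept]
        sql = f'SELECT * FROM "{table}" WHERE "{column}" = ?'
        return self.connection.execute(sql, (value,)).fetchall()


# Toy database with made-up content, used only to exercise the sketch.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE editions (work TEXT, publisher TEXT)")
connection.execute(
    "INSERT INTO editions VALUES ('Sample Work', 'Sample Publisher')"
)

wrapper = SQLiteWrapper(connection, {"publisher": ("editions", "publisher")})
rows = wrapper.query("publisher", "Sample Publisher")
print(rows)
```

A Wrapper for a database with a different DBMS would keep the same `query` signature and change only how the mapping entry is turned into an access expression.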
Trees
The execution of every module in the federation is guided by the information stored in a set
of XML files, denoted here as Trees because of their hierarchical representation. All Trees are composed
of nodes that represent searchable concepts existing in the databases. That is, not all concepts
of a database, but only those that can be used to perform searches, are included. These
concepts are described by properties or attributes, and arranged in a tree in order to represent the
relationships among them. For our domain of interest, digital libraries, we have considered only
the following two types of relationships to be relevant:
Generalization/Specification: It represents the typical "is a" relationship. For instance, an
"Emblem Book" is a "Work."
Description: It can be represented as a "has" relationship. It is used to represent that a concept
is described by another (sub)concept. For example, a Work "has" an Edition, which, in
turn, is described by the attributes "year of edition," "publisher," and "promoter."
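The two relationship types can be sketched with a small XML fragment and a Python traversal. The XML layout below is hypothetical: the element names `concept`, `attribute`, `is_a`, and `has` are assumptions chosen for the example and are not the schema of the Trees used in the system. Only the concepts and attributes named in the text are included.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: concepts nested under <is_a> are specializations of the
# enclosing concept; concepts nested under <has> describe the enclosing concept.
TREE_XML = """
<concept name="Work">
  <is_a>
    <concept name="Emblem Book"/>
  </is_a>
  <has>
    <concept name="Edition">
      <attribute name="year of edition"/>
      <attribute name="publisher"/>
      <attribute name="promoter"/>
    </concept>
  </has>
</concept>
"""


def walk(concept, relation="root", depth=0):
    """Yield (depth, relation to parent, concept name, attribute names)."""
    attributes = [a.get("name") for a in concept.findall("attribute")]
    yield depth, relation, concept.get("name"), attributes
    for kind in ("is_a", "has"):
        for group in concept.findall(kind):
            for child in group.findall("concept"):
                yield from walk(child, kind, depth + 1)


root = ET.fromstring(TREE_XML.strip())
nodes = list(walk(root))
for depth, relation, name, attributes in nodes:
    print("  " * depth + f"{relation}: {name} {attributes}")
```

Keeping the relationship type as an explicit element makes it possible to tell, for any child concept, whether it specializes its parent or describes it.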
In our system we deal with two types of trees, designed for two different purposes: Concept
Trees, which are placed in the Mediator, and Mapping Trees, which are placed in the Wrappers
(see ill. 1):
Concept Trees: They are abstractions of the schemas of all component databases. The root
concept of a Concept Tree is the object (concept) that can be retrieved by a query. Thus,
there will be as many Concept Trees as there are different concepts that can be retrieved. When users
express a query, they must decide which concept they want to retrieve by selecting the Concept
Tree that will be used to express the query. At present, we have only one Concept Tree
in our federated system (see ill. 2), which allows works to be retrieved. Concept Trees are
used to generate the user interface, allowing the users to navigate through all the concepts
in the tree, or to establish query constraints for any of them. A later section shows
how these Concept Trees are used to build the user interface where queries are expressed.
Mapping Trees: A Mapping Tree is defined for each document database, and it describes
only the concepts of its associated database. Every concept and attribute in a Mapping Tree
is associated with the expression necessary to access the corresponding data in the associated
database, which is completely dependent on the DBMS. For example, a concept represented
in a Mapping Tree for a relational database can have the relation and attribute names
where the concept is stored, or a more complex expression, such as a complete SQL SELECT
statement. If the database is capable of using some kind of Text Retrieval technique, the
information associated with a concept like "content" or "topic" will be the directions to