One of the most important recent developments in
computer science research has been the convergence of XML and database
technologies. XML has become a standard format for data interchange
over the Internet, especially in electronic commerce. The beauty
of XML lies in its flexibility: it offers new opportunities
for improved information sharing between business partners regardless
of platform or legacy application. The relational model, on which
most current database management systems are based, provides
a solid theoretical framework within which a variety of important
problems can be attacked in a scientific manner. The potential
advantages of building on the mature technology of Relational
Database Management Systems (RDBMS) are well understood. Storing and
indexing XML data in a conventional RDBMS gives us the opportunity
to exploit the functionality of this mature technology
for effective and efficient processing of the data.
My recent research centres on structured
documents described by XML, and in particular on the integration of relational
or object-relational database technologies with XML. We are developing
a theoretical framework for partial retrieval of tree-structured
documents described by XML. While extensive studies have been carried
out on structure-based queries, relatively little work has been
done on a formal framework for keyword-based queries. Non-expert users
typically query documents with keywords, so it is important that
database-style formal approaches to keyword-based queries are taken
into consideration to ensure definitive results. We have made
significant progress in defining a novel fragmentary-join that
enables end users to specify keyword-based queries and retrieve the desired
XML fragments. The key idea is to enable users to retrieve all XML
fragments that can be potential answers to a query. In addition,
we have proposed an implementation framework that allows the
algebra to be translated into ordinary SQL expressions when the XML data
is encoded and stored in an existing object-relational
database.
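
To make the idea of answering keyword queries over relationally stored XML concrete, the following is a minimal sketch only. It is not the fragmentary-join or the algebra described above. It assumes a simple parent-pointer encoding (one row per element), and the table name `node`, the helper names `shred`, `load` and `keyword_fragments`, and the sample document are all hypothetical. It uses SQLite rather than an object-relational system purely so that the example is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML_DOC = """
<article>
  <title>XML and databases</title>
  <section>
    <title>Storage</title>
    <para>Relational storage of XML documents</para>
  </section>
  <section>
    <title>Queries</title>
    <para>Keyword search over XML fragments</para>
  </section>
</article>
"""


def shred(root):
    """Return (id, parent, tag, text) rows, numbered in document (preorder) order."""
    rows = []

    def visit(elem, parent_id):
        node_id = len(rows) + 1
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        for child in elem:
            visit(child, node_id)

    visit(root, None)
    return rows


def load(conn, xml_text):
    """Encode the XML tree as one relational table (assumed schema)."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?, ?)", shred(ET.fromstring(xml_text))
    )


def keyword_fragments(conn, keyword):
    """Return the nodes whose own text contains the keyword, plus all their ancestors."""
    sql = """
    WITH RECURSIVE hit(id, parent, tag) AS (
        SELECT id, parent, tag FROM node WHERE text LIKE '%' || ? || '%'
        UNION
        SELECT n.id, n.parent, n.tag FROM node n JOIN hit h ON n.id = h.parent
    )
    SELECT id, tag FROM hit ORDER BY id
    """
    return conn.execute(sql, (keyword,)).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, XML_DOC)
print(keyword_fragments(conn, "Keyword"))
# prints [(1, 'article'), (6, 'section'), (8, 'para')]
```

Here the keyword query is answered entirely by an ordinary SQL expression over the encoded tree: the matching element is returned together with the path that connects it to the document root, which is one simple notion of a candidate fragment.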