XML and Database Technology


One of the most important recent developments in computer science research has been the convergence of XML and database technologies. XML has become a standard format for data interchange over the Internet, especially in electronic commerce. The beauty of XML lies in its flexibility: it offers new opportunities for improved information sharing between business partners regardless of platform or legacy application. The relational model, on which most current database management systems are based, provides a solid theoretical framework within which a variety of important problems can be attacked in a scientific manner. The potential advantages of using the mature technology of relational database management systems (RDBMS) are well understood. Having XML data indexed and stored in a conventional RDBMS gives us the opportunity to exploit the functionality this technology already supports for effective and efficient processing of the data.

My recent research is centred on structured documents described by XML, and in particular on the integration of relational or object-relational database technologies with XML. We are developing a theoretical framework for partial retrieval of tree-structured documents described by XML. While extensive studies have been carried out on structure-based queries, relatively little work has been done on a formal framework for keyword-based queries. Naive users typically query documents with keywords, so it is important that database-style formal approaches to keyword-based queries are taken into consideration to ensure definitive results. We have made significant progress in defining a novel fragmentary-join that enables end-users to specify keyword-based queries and retrieve the desired XML fragments. The key idea is to enable users to retrieve all XML fragments that can be potential answers to a query. In addition, we have proposed an implementation framework that enables the proposed algebra to be transformed into ordinary SQL expressions when XML data is encoded and stored in an existing object-relational database.
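To make the idea of answering keyword queries over relationally stored XML more concrete, the following is a minimal sketch in Python with SQLite. It is not the fragmentary-join or the encoding used in this research. It assumes a generic interval (pre/post) numbering of elements, a hypothetical table named node, and hypothetical function names shred, load and candidate_fragments. Under these assumptions, an element is treated as a candidate fragment when its subtree contains both keywords, and containment is expressed as a plain SQL range condition.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Small illustrative document (made up for this sketch).
DOC = """<article>
  <section><title>XML storage</title><para>Relational engines index XML nodes.</para></section>
  <section><title>Queries</title><para>Keyword search returns fragments.</para></section>
</article>"""


def shred(root):
    """Number every element with a (pre, post) interval.

    A descendant's interval always lies strictly inside its ancestor's
    interval, so subtree containment becomes a range comparison.
    """
    rows = []
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += 1
        pre = counter
        for child in elem:
            visit(child)
        counter += 1
        # Only the element's leading text is stored; tail text is ignored here.
        rows.append((pre, counter, elem.tag, (elem.text or "").strip()))

    visit(root)
    return rows


def load(xml_text):
    """Store the shredded document in an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (pre INTEGER PRIMARY KEY, post INTEGER, tag TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?, ?)", shred(ET.fromstring(xml_text))
    )
    return conn


def candidate_fragments(conn, kw1, kw2):
    """Return (tag, pre, post) of every element whose subtree holds both keywords.

    Results are ordered from the smallest fragment to the largest.
    Keywords are matched with LIKE and are not escaped in this sketch.
    """
    sql = """
        SELECT a.tag, a.pre, a.post
        FROM node AS a
        WHERE EXISTS (SELECT 1 FROM node AS k
                      WHERE k.pre >= a.pre AND k.post <= a.post
                        AND k.text LIKE '%' || ? || '%')
          AND EXISTS (SELECT 1 FROM node AS k
                      WHERE k.pre >= a.pre AND k.post <= a.post
                        AND k.text LIKE '%' || ? || '%')
        ORDER BY a.post - a.pre, a.pre
    """
    return conn.execute(sql, (kw1, kw2)).fetchall()


if __name__ == "__main__":
    connection = load(DOC)
    for row in candidate_fragments(connection, "keyword", "fragments"):
        print(row)
```

In this sketch, a query for two keywords that occur in the same paragraph returns that paragraph first, followed by its enclosing section and then the whole article. A query whose keywords occur in different sections returns only the article element. A formal algebra would go further and define precisely which of these candidates count as answers and how fragments are joined.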