Aspects of Partial Information in Databases

Libkin, Leonid

Files

94_10.pdf (1.58 MB)

Permalink

https://repository.upenn.edu/handle/20.500.14332/37571

Author

Libkin, Leonid

Abstract

Information stored in databases is usually incomplete. Typical sources of partiality are missing information, conflicts that occur when databases are merged, and queries asked against several databases simultaneously. The field of partial information in databases has not received the attention that it deserves. Most work on partial information in databases asks which operations of standard languages, like relational algebra, can still be performed correctly in the presence of simple forms of partial information. We believe that the problem should be looked at from another point of view: the semantics of partiality must be clearly understood, and it should give us new design principles for languages for databases with partial information. The main goals of this thesis are to develop new analytical tools for studying partial information and its semantics, and to use the semantics of partiality as the basis for the design of query languages. Unlike typical research in artificial intelligence, we concentrate on general-purpose solutions that are effectively implementable in the context of database query languages and provide a flexible basis for future modeling challenges. We present a common semantic framework for various kinds of partial information, which can be applied in a context more general than the flat relational model. This semantics is based on the idea of ordering objects in terms of being more informative. Such ordered semantics cleanly integrates all kinds of partial information and serves as a tool to establish connections between them. By analyzing mathematical properties of partial data, it is possible to find operations naturally associated with it. Such operations, arising from the characterization of semantic domains of types as free algebras, can be turned into programming language constructs. We discuss languages for databases with partial information that arise from this semantics.
A language for sets and or-sets is introduced and a normalization theorem is proved. This makes it possible to incorporate the semantics into the language and to distinguish two levels of querying: structural and conceptual. This language has been implemented on top of Standard ML and has been shown to be useful in problems of querying independent and incomplete databases.
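
The two central ideas of the abstract, ordering objects by how informative they are and normalizing objects built from sets and or-sets, can be illustrated with a small sketch. It is written in Python purely for illustration; it is not the thesis's language (which was implemented on top of Standard ML), and the names `OrSet`, `normalize`, and `at_most_as_informative`, as well as the use of `None` for a missing value, are assumptions made for this example. An or-set is read here as "exactly one of these alternatives holds, but it is not known which", and normalization is read as listing every or-set-free object that a given object could denote.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class OrSet:
    """Hypothetical or-set: one of the alternatives holds, which one is unknown."""
    alternatives: frozenset


def normalize(obj):
    """Return the frozenset of or-set-free objects that obj could denote."""
    if isinstance(obj, OrSet):
        # An or-set denotes whatever any one of its alternatives denotes.
        return frozenset(w for alt in obj.alternatives for w in normalize(alt))
    if isinstance(obj, frozenset):
        # An ordinary set: pick one possibility for every element.
        choices = [normalize(x) for x in obj]
        return frozenset(frozenset(combo) for combo in product(*choices))
    if isinstance(obj, tuple):
        # A tuple: pick one possibility for every component.
        choices = [normalize(x) for x in obj]
        return frozenset(tuple(combo) for combo in product(*choices))
    # Atomic values denote only themselves.
    return frozenset([obj])


def at_most_as_informative(r1: dict, r2: dict) -> bool:
    """Assumed ordering on flat records: every known field of r1 agrees with r2."""
    return all(v is None or r2.get(k) == v for k, v in r1.items())


# Structural level: a set containing two or-sets.
db = frozenset({OrSet(frozenset({1, 2})), OrSet(frozenset({3}))})

# Conceptual level: the possible or-set-free sets, here {1, 3} and {2, 3}.
possible = normalize(db)

partial = {"name": "Ann", "phone": None}
complete = {"name": "Ann", "phone": "555-0100"}
# partial carries no information that complete lacks, but not the other way round.
ordered = at_most_as_informative(partial, complete)
```

In this reading, a query against `db` itself works at the structural level, while a query against `normalize(db)` works at the conceptual level, that is, over the objects the database might actually represent.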

Date of degree

1994-08-01

Comments

University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-94-10.

Collection

Dissertations and Theses