Querying Nested Collections

dc.contributor.author: Wong, Limsoon
dc.date: 2023-05-16T23:50:19.000
dc.date.accessioned: 2023-05-22T19:57:46Z
dc.date.available: 2023-05-22T19:57:46Z
dc.date.issued: 1994-05-01
dc.date.submitted: 2006-09-11T10:03:51-07:00
dc.description.abstract: This dissertation investigates a new approach to query languages inspired by structural recursion and by the categorical notion of a monad. A language based on these principles has been designed and studied. It is found to have the strength of several widely known relational languages but without their weaknesses. This language and its various extensions are shown to exhibit a conservative extension property, which indicates that the depth of nesting of collections in intermediate data has no effect on their expressive power. These languages also exhibit the finite-cofiniteness property on many classes of queries. These two properties provide easy answers to several hitherto unresolved conjectures on query languages that are more realistic than the flat relational algebra. A useful rewrite system has been derived from the equational theory of monads. It forms the core of a source-to-source optimizer capable of performing filter promotion, code motion, and loop fusion. Scanning routines and printing routines are considered as part of the optimization process. An operational semantics that blends eager evaluation and lazy evaluation is suggested in conjunction with these input-output routines. This strategy leads to a reduction in space consumption and a faster response time while preserving good total time performance. Additional optimization rules have been systematically introduced to cache and index small relations, to map monad operations to several classical join operators, to cache large intermediate relations, and to push monad operations to external servers. A query system Kleisli and a high-level query language CPL for it have been built on top of the functional language ML. Many of my theoretical and practical contributions have been physically realized in Kleisli and CPL. In addition, I have explored the idea of an open system in my implementation. Dynamic extension of the system with new primitives, cost functions, optimization rules, scanners, and writers is fully supported. As a consequence, my system can be easily connected to external data sources. In particular, it has been successfully applied to integrate several genetic data sources, including relational databases, structured files, and data generated by special application programs.
dc.description.comments: University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-94-09.
dc.identifier.uri: https://repository.upenn.edu/handle/20.500.14332/37570
dc.legacy.articleid: 1155
dc.legacy.fulltexturl: https://repository.upenn.edu/cgi/viewcontent.cgi?article=1155&context=ircs_reports&unstamped=1
dc.source.issue: 155
dc.source.journal: IRCS Technical Reports Series
dc.source.status: published
dc.title: Querying Nested Collections
dc.type: Dissertation/Thesis
digcom.contributor.author: Wong, Limsoon
digcom.identifier: ircs_reports/155
digcom.identifier.contextkey: 204162
digcom.identifier.submissionpath: ircs_reports/155
digcom.type: thesis
dspace.entity.type: Publication
upenn.schoolDepartmentCenter: IRCS Technical Reports Series
Files
Original bundle
Name: 94_09.pdf
Size: 1.16 MB
Format: Adobe Portable Document Format