Sharing Work in Keyword Search Over Databases

Jacob, Marie; Ives, Zachary G

Files

2011_sharing_work_in_keyword_search_over_databases.pdf (1.55 MB)

Penn collection

Departmental Papers (CIS)

Subject

Computer Sciences

Permalink

https://repository.upenn.edu/handle/20.500.14332/6683

Abstract

An important means of allowing non-expert end-users to pose ad hoc queries, whether over single databases or data integration systems, is through keyword search. Given a set of keywords, the query processor finds matches across different tuples and tables. It computes and executes a set of relational sub-queries whose results are combined to produce the k highest-ranking answers. Work on keyword search primarily focuses on single-database, single-query settings: each query is answered in isolation, despite possible overlap between queries posed by different users or at different times; and the number of relevant tables is assumed to be small, meaning that sub-queries can be processed without using cost-based methods to combine work. As we apply keyword search to support ad hoc data integration queries over scientific or other databases on the Web, we must reuse and combine computation. In this paper, we propose an architecture that continuously receives sets of ranked keyword queries, and seeks to reuse work across these queries. We extend multiple query optimization and continuous query techniques, and develop a new query plan scheduling module we call the ATC (based on its analogy to an air traffic controller). The ATC manages the flow of tuples among a multitude of pipelined operators, minimizing the work needed to return the top-k answers for all queries. We also develop techniques to manage the sharing and reuse of state as queries complete and input data streams are exhausted. We show the effectiveness of our techniques in handling queries over real and synthetic data sets.
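To make the idea of reusing work across overlapping keyword queries concrete, here is a minimal Python sketch. It is not the paper's ATC scheduler and does not model pipelined operators or cost-based planning; it only illustrates the simpler point that two keyword queries sharing a relational sub-query need to execute it once. The class name `SharedExecutor`, the sub-query strings, and the scored rows in `TOY_RESULTS` are all hypothetical and chosen for illustration.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Row = Tuple[float, str]  # (score, answer label); hypothetical shape


class SharedExecutor:
    """Runs each distinct sub-query at most once and reuses its rows."""

    def __init__(self, run_subquery: Callable[[str], List[Row]]) -> None:
        self._run = run_subquery
        self._cache: Dict[str, List[Row]] = {}
        self.executions = 0  # number of real (non-cached) sub-query runs

    def rows(self, subquery: str) -> List[Row]:
        # Execute only on a cache miss; later queries reuse the stored rows.
        if subquery not in self._cache:
            self._cache[subquery] = self._run(subquery)
            self.executions += 1
        return self._cache[subquery]

    def top_k(self, subqueries: List[str], k: int) -> List[Row]:
        # Combine the rows of all sub-queries and keep the k best by score.
        merged = [row for sq in subqueries for row in self.rows(sq)]
        return heapq.nlargest(k, merged)


# Hypothetical sub-query results standing in for real database access.
TOY_RESULTS: Dict[str, List[Row]] = {
    "gene JOIN protein": [(0.9, "g1-p1"), (0.4, "g2-p7")],
    "protein JOIN paper": [(0.8, "p1-d3"), (0.2, "p7-d9")],
    "gene JOIN paper": [(0.7, "g1-d3")],
}


def run_toy_subquery(subquery: str) -> List[Row]:
    return list(TOY_RESULTS[subquery])


executor = SharedExecutor(run_toy_subquery)
# Two keyword queries whose sub-query sets overlap on "protein JOIN paper".
q1 = executor.top_k(["gene JOIN protein", "protein JOIN paper"], k=2)
q2 = executor.top_k(["protein JOIN paper", "gene JOIN paper"], k=2)
print(q1, q2, executor.executions)
```

The two queries reference four sub-queries in total, but only three distinct ones are executed because the shared one is served from the cache. A real system along the lines the abstract describes would also have to decide when to stop reading inputs once the top-k answers are certain and when to discard shared state, which this sketch leaves out.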

Date of presentation

2011-01-01

Conference name

ACM SIGMOD International Conference on Management of Data

Conference dates

June 2011

Comments

Jacob, M., & Ives, Z., Sharing Work in Keyword Search Over Databases, ACM SIGMOD International Conference on Management of Data, June 2011, doi: http://doi.acm.org/10.1145/1989323.1989384

ACM COPYRIGHT NOTICE. Copyright © 2011 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept., ACM, Inc., fax +1 (212) 869-0481, or permissions@acm.org.

Collection

Presentations