Using semantic relations to improve information retrieval

Thomas Morton, University of Pennsylvania


Most approaches to information retrieval have focused primarily on word co-occurrence, or what is typically referred to as a bag-of-words model. While these models do not capture relationships between words other than co-occurrence, they have proved robust and efficient for a large variety of text. More complicated techniques from natural language processing have been largely ineffective when applied to document retrieval, but such techniques may have more to offer to information retrieval tasks other than document retrieval [Lewis and Sparck Jones, 1996; Voorhees, 1999]. This thesis examines passage retrieval for a question-answering task and demonstrates that natural language techniques for determining categorical and referential relationships improve passage retrieval. Specifically, it explores the identification of categorical and referential relationships in free text, as well as the determination of the categorical relationships that hold for the answers to natural language questions. It also examines the integration of these relationships into two passage retrieval systems. Each system is evaluated on a question-answering task, and performance improves markedly when categorical relationships are incorporated. Referential relationships improve the performance of one system significantly and produce a small but not significant improvement in the second. Other work has demonstrated improvements in passage retrieval with the addition of categorical relationships [Prager et al., 2000]. This thesis describes experiments that extend the results of [Morton, 1999] and [Morton, 2000], demonstrating through a large-scale evaluation that modeling referential relationships can further improve passage retrieval.

Subject Area

Computer science | Linguistics

Recommended Citation

Morton, Thomas, "Using semantic relations to improve information retrieval" (2005). Dissertations available from ProQuest. AAI3197718.