Measuring Importance and Query Relevance in Topic-Focused Multi-Document Summarization

The increasing complexity of summarization systems makes it difficult to analyze exactly which modules make a difference in performance. We carried out a principled comparison between the two most commonly used schemes for assigning importance to words in the context of query-focused multi-document summarization: raw frequency (word probability) and log-likelihood ratio (LLR). We demonstrate that the advantages of LLR come from its known distributional properties, which allow for the identification of a set of words that in its entirety defines the aboutness of the input. We also find that LLR is more suitable for query-focused summarization since, unlike raw frequency, it is more sensitive to the integration of the information need defined by the user.
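
The two word-importance schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy counts, and the 10.83 cutoff (the chi-square critical value at p < 0.001 with one degree of freedom, a common choice for this test) are assumptions made for illustration. The LLR shown is the standard binomial log-likelihood ratio statistic comparing a word's rate in the input with its rate in a background corpus.

```python
import math
from collections import Counter


def word_probability(tokens):
    """Raw-frequency importance: p(w) = count(w) / total tokens in the input."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


def _binomial_log_likelihood(k, n, p):
    """k*log(p) + (n-k)*log(1-p), skipping terms whose count is zero."""
    value = 0.0
    if k > 0:
        value += k * math.log(p)
    if n - k > 0:
        value += (n - k) * math.log(1.0 - p)
    return value


def log_likelihood_ratio(k1, n1, k2, n2):
    """-2 log(lambda) for a word seen k1 times in n1 input tokens
    and k2 times in n2 background tokens."""
    p1 = k1 / n1                   # rate in the input
    p2 = k2 / n2                   # rate in the background
    p = (k1 + k2) / (n1 + n2)      # pooled rate under the null hypothesis
    alternative = (_binomial_log_likelihood(k1, n1, p1)
                   + _binomial_log_likelihood(k2, n2, p2))
    null = (_binomial_log_likelihood(k1, n1, p)
            + _binomial_log_likelihood(k2, n2, p))
    return 2.0 * (alternative - null)


def descriptive_words(input_tokens, background_counts, background_total,
                      cutoff=10.83):
    """Words whose LLR exceeds the cutoff and that are relatively more
    frequent in the input than in the background (cutoff is an assumption)."""
    counts = Counter(input_tokens)
    n1 = len(input_tokens)
    selected = set()
    for word, k1 in counts.items():
        k2 = background_counts.get(word, 0)
        more_frequent_in_input = k1 / n1 > k2 / background_total
        score = log_likelihood_ratio(k1, n1, k2, background_total)
        if more_frequent_in_input and score > cutoff:
            selected.add(word)
    return selected
```

Under raw frequency every word receives a graded weight, whereas thresholding the LLR statistic yields a binary set of words that together characterize the input, which is the distributional property the abstract refers to.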

Date of presentation

2007-06-01

Conference name

Departmental Papers (CIS)

Conference dates

2023-05-17T07:17:25.000

Comments

Gupta, S., Nenkova, A., & Jurafsky, D., Measuring Importance and Query Relevance in Topic-Focused Multi-Document Summarization, 45th Annual Meeting of the Association for Computational Linguistics, June 2007, doi: http://aclweb.org/anthology-new/P/P07/P07-2049.pdf

Collection

Presentations