Information Status Distinctions and Referring Expressions: An Empirical Study of References to People in News Summaries

Penn collection
Departmental Papers (CIS)
Discipline
Subject
Computational Linguistics
Computer Sciences
Author
Siddharthan, Advaith
Nenkova, Ani
McKeown, Kathleen
Abstract

Although there has been much theoretical work on using various information status distinctions to explain the form of references in written text, there have been few studies that attempt to automatically learn these distinctions for generating references in the context of computer-regenerated text. In this article, we present a model for generating references to people in news summaries that incorporates insights from both theory and a corpus analysis of human-written summaries. In particular, our model captures how two properties of a person referred to in the summary (familiarity to the reader and global salience in the news story) affect the content and form of the initial reference to that person in a summary. We demonstrate that these two distinctions can be learned from a typical input for multi-document summarization and that they can be used to make regeneration decisions that improve the quality of extractive summaries.

Publication date
2011-03-25
Comments
Suggested Citation: Siddharthan, A., Nenkova, A., & McKeown, K. (2011). Information Status Distinctions and Referring Expressions: An Empirical Study of References to People in News Summaries. Computational Linguistics, 37(4), 811-842.