Unsupervised Models of Text Structure

Penn collection
Technical Reports (CIS)
Abstract

Models of text structure are necessary for applications that generate text. These models provide information about what content fits together and how to organize that content into coherent text. In some domains, such as newswire, biographies, and stories for children, texts tend to have similar content and structure. Such regularities have allowed the development of unsupervised methods that learn text structure from human-written examples in these domains. We survey some of the recently proposed approaches in this area and review their use in different text generation tasks. First, we consider approaches with a focus on computational semantics. We review work that aims to discover patterns of related events from news articles and children’s stories, and we consider one application of such knowledge: an automatic story-telling system. Next, we move to methods that focus on coherence and organization. We describe these in the context of two generation tasks: sentence ordering and the creation of long articles. For the sentence ordering problem, we survey approaches that learn properties of coherent transitions between adjacent sentences in texts. Then, we consider the generation of long biographical descriptions. Here we survey recent work on automatically generating such articles using higher-level patterns in text structure, such as subtopics and their organization.

Publication date
2011-01-01
Comments
University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-11-16.