Technical Reports (CIS)

Document Type

Technical Report

Date of this Version



University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-10-20.


Sentence compression is the task of producing a summary of a single sentence. The compressed sentence should be shorter than the original, retain its important content, and itself be grammatical. The three papers discussed here take different approaches to identifying important content, determining which sentences are grammatical, and jointly optimizing these objectives. One family of approaches is tree-based: these create a compressed sentence by making edits to the syntactic tree of the original sentence. A second family is sentence-based: these generate strings directly. Orthogonal to both is the question of whether sentences are treated in isolation or whether the surrounding discourse affects the compression. We compare a tree-based, a sentence-based, and a discourse-based approach and conclude with ideas for future work in this area.



Date Posted: 12 May 2010