Technical Reports (CIS)
Document Type
Technical Report
Date of this Version
June 2008
Abstract
We present a corpus study of local discourse relations based on the Penn Discourse Treebank, a large manually annotated corpus of explicitly or implicitly realized contingency, comparison, temporal and expansion relations. We show that while there is a large degree of ambiguity in temporal explicit discourse connectives, overall discourse connectives are mostly unambiguous and allow high-accuracy classification of discourse relations. We achieve 93.09% accuracy in classifying the explicit relations and 74.74% accuracy overall. In addition, we show that some pairs of relations occur together in text more often than expected by chance. This finding suggests that global sequence classification of the relations in text can lead to better results, especially for implicit relations.
Recommended Citation
Emily Pitler, Mridhula Raghupathy, Hena Mehta, Ani Nenkova, Alan Lee, and Aravind K. Joshi, "Easily Identifiable Discourse Relations". June 2008.
Date Posted: 16 June 2008
Comments
University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-08-24.