Revisiting Readability: A Unified Framework for Predicting Text Quality

Pitler, Emily; Nenkova, Ani

Files

2008_revisiting_readability__a_unified_framework_for_predicting_text_quality.pdf (141.17 KB)

Penn collection

Departmental Papers (CIS)

Subject

Computer Sciences

Permalink

https://repository.upenn.edu/handle/20.500.14332/6793

Abstract

We combine lexical, syntactic, and discourse features to produce a highly predictive model of human readers’ judgments of text readability. This is the first study to take into account such a variety of linguistic factors and the first to empirically demonstrate that discourse relations are strongly associated with the perceived quality of text. We show that various surface metrics generally expected to be related to readability are not very good predictors of readability judgments in our Wall Street Journal corpus. We also establish that readability predictors behave differently depending on the task: predicting text readability or ranking the readability. Our experiments indicate that discourse relations are the one class of features that exhibits robustness across these two tasks.

Date of presentation

2008-10-01

Conference name

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Conference dates

October 2008

Comments

Pitler, E. & Nenkova, A., "Revisiting Readability: A Unified Framework for Predicting Text Quality," Conference on Empirical Methods in Natural Language Processing (EMNLP), Oct. 2008. URL: http://www.aclweb.org/anthology/D08-1020

Collection

Presentations