Revisiting Readability: A Unified Framework for Predicting Text Quality

Penn collection
Departmental Papers (CIS)
Degree type
Discipline
Subject
Computer Sciences
Author
Pitler, Emily
Abstract

We combine lexical, syntactic, and discourse features to produce a highly predictive model of human readers’ judgments of text readability. This is the first study to take into account such a variety of linguistic factors and the first to empirically demonstrate that discourse relations are strongly associated with the perceived quality of text. We show that various surface metrics generally expected to be related to readability are not very good predictors of readability judgments in our Wall Street Journal corpus. We also establish that readability predictors behave differently depending on the task: predicting the readability of a text or ranking texts by readability. Our experiments indicate that discourse relations are the one class of features that exhibits robustness across these two tasks.

Date of presentation
2008-10-01
Conference name
Departmental Papers (CIS)
Conference dates
2023-05-17T07:17:10.000
Comments
Pitler, E. & Nenkova, A., Revisiting Readability: A Unified Framework for Predicting Text Quality, Conference on Empirical Methods in Natural Language Processing, Oct. 2008, doi: http://www.aclweb.org/anthology/D08-1020