Structural Features for Predicting the Linguistic Quality of Text: Applications to Machine Translation, Automatic Summarization and Human-Authored Text

Nenkova, Ani; Chae, Jieun; Louis, Annie; Pitler, Emily

dc.contributor.author	Nenkova, Ani
dc.contributor.author	Chae, Jieun
dc.contributor.author	Louis, Annie
dc.contributor.author	Pitler, Emily
dc.date	2023-05-17T07:16:54.000
dc.date.accessioned	2023-05-22T12:50:36Z
dc.date.available	2023-05-22T12:50:36Z
dc.date.issued	2010-01-01
dc.date.submitted	2012-07-30T12:09:55-07:00
dc.description.abstract	Sentence structure is considered to be an important component of the overall linguistic quality of text. Yet few empirical studies have sought to characterize how and to what extent structural features determine fluency and linguistic quality. We report the results of experiments on the predictive power of syntactic phrasing statistics and other structural features for these aspects of text. Manual assessments of sentence fluency for machine translation evaluation and of text quality for summarization evaluation are used as the gold standard. We find that many structural features related to phrase length are weakly but significantly correlated with fluency, and that classifiers based on the entire suite of structural features can achieve high accuracy in pairwise comparison of sentence fluency and in distinguishing machine translations from human translations. We also test the hypothesis that the learned models capture general fluency properties applicable to human-authored text. The results from our experiments do not support the hypothesis. At the same time, structural features and models based on them prove to be robust for automatic evaluation of the linguistic quality of multi-document summaries.
dc.description.comments	Nenkova, A., Chae, J., Louis, A., & Pitler, E., Structural Features for Predicting the Linguistic Quality of Text: Applications to Machine Translation, Automatic Summarization and Human-Authored Text, Empirical Methods in Natural Language Generation: Data Oriented Methods and Empirical Evaluation, 2010, doi: http://dx.doi.org/10.1007/978-3-642-15573-4_12
dc.identifier.uri	https://repository.upenn.edu/handle/20.500.14332/6782
dc.legacy.articleid	1755
dc.legacy.fulltexturl	https://repository.upenn.edu/cgi/viewcontent.cgi?article=1755&context=cis_papers&unstamped=1
dc.source.issue	715
dc.source.journal	Departmental Papers (CIS)
dc.source.status	published
dc.subject.other	Computer Sciences
dc.title	Structural Features for Predicting the Linguistic Quality of Text: Applications to Machine Translation, Automatic Summarization and Human-Authored Text
dc.type	Presentation
digcom.identifier	cis_papers/715
digcom.identifier.contextkey	3160101
digcom.identifier.submissionpath	cis_papers/715
digcom.type	conference
dspace.entity.type	Publication
relation.isAuthorOfPublication	ce994fec-4416-487d-9b37-34b89c8f432b
relation.isAuthorOfPublication.latestForDiscovery	ce994fec-4416-487d-9b37-34b89c8f432b
upenn.schoolDepartmentCenter	Departmental Papers (CIS)

Files

Original bundle

Name:: 2010_structural_features_for_predicting_the_linguistic_quality_of_text__applications_to_machine_translation__automatic_summarization_and_human_authored_text.pdf
Size:: 250.35 KB
Format:: Adobe Portable Document Format

Collection

Presentations
