
Departmental Papers (CIS)
Document Type
Conference Paper
Date of this Version
3-2009
Abstract
We address the task of automatically predicting if summarization system performance will be good or bad based on features derived directly from either single- or multi-document inputs. Our labelled corpus for the task is composed of data from large-scale evaluations completed over the span of several years. The variation of data between years allows for a comprehensive analysis of the robustness of features, but poses a challenge for building a combined corpus which can be used for training and testing. Still, we find that the problem can be mitigated by appropriately normalizing for differences within each year. We examine different formulations of the classification task which considerably influence performance. The best results are 84% prediction accuracy for single-document and 74% for multi-document summarization.
Date Posted: 30 July 2012

Comments
Louis, A. & Nenkova, A., Performance Confidence Estimation for Automatic Summarization, 12th Conference of the European Chapter of the Association for Computational Linguistics, March-April 2009, doi: anthology/E09-1062