An Empirical Comparison of Probability Models for Dependency Grammar
Abstract
This technical report is an appendix to Eisner (1996): it gives the superior experimental results that were reported only in the talk version of that paper, together with details of how the results were obtained. Eisner (1996) trained three probability models on a small set of about 4,000 conjunction-free dependency-grammar parses derived from the Wall Street Journal section of the Penn Treebank, and then evaluated the models on a held-out test set, using a novel O(n³) parsing algorithm. The present paper describes some details of the experiments and repeats them with a larger training set of 25,000 sentences. As reported at the talk, the more extensive training yields greatly improved performance, cutting the error rate of Eisner (1996) in half. Nearly half the sentences are parsed with no misattachments; two-thirds are parsed with at most one misattachment. Of the models described in the original paper, the best score is obtained with the generative "model C," which attaches 87–88% of all words to the correct parent. However, better models are also explored, in particular, two simple variants of the comprehension "model B." The better of these has an attachment accuracy of 90% and (unlike model C) tags words more accurately than the comparable trigram tagger. If tags are roughly known in advance, search error is all but eliminated and the new model attains an attachment accuracy of 93%. We find that the parser of Collins (1996), when combined with a highly trained tagger, also achieves 93% when trained and tested on the same sentences. We briefly discuss the similarities and differences between Collins's model and ours, pointing out the strengths of each and noting that these strengths could be combined, for either dependency parsing or phrase-structure parsing.
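The abstract reports two kinds of figures: per-word attachment accuracy (the fraction of words attached to the correct parent) and the fraction of sentences parsed with at most zero or one misattachments. The short Python sketch below, which is not from the paper, shows how both metrics can be computed; the representation of a parse as a list of parent indices (one per word, 0 for the root) is an assumption made for this illustration.

    # A minimal sketch of the evaluation metrics named in the abstract.
    # Each parse is assumed to be a list of parent indices, one per word
    # (0 = root); this representation is hypothetical, not the paper's.

    def attachment_accuracy(gold_parses, predicted_parses):
        """Fraction of words whose predicted parent matches the gold parent."""
        correct = 0
        total = 0
        for gold, pred in zip(gold_parses, predicted_parses):
            correct += sum(g == p for g, p in zip(gold, pred))
            total += len(gold)
        return correct / total

    def sentence_rate(gold_parses, predicted_parses, max_errors=0):
        """Fraction of sentences with at most `max_errors` misattachments."""
        ok = sum(
            sum(g != p for g, p in zip(gold, pred)) <= max_errors
            for gold, pred in zip(gold_parses, predicted_parses)
        )
        return ok / len(gold_parses)

    # Toy example with two sentences.
    gold = [[2, 0, 2], [0, 1]]
    pred = [[2, 0, 1], [0, 1]]
    print(attachment_accuracy(gold, pred))          # 4/5 words correct -> 0.8
    print(sentence_rate(gold, pred))                # 1/2 sentences perfect -> 0.5
    print(sentence_rate(gold, pred, max_errors=1))  # both within 1 error -> 1.0

Under these definitions, the paper's "nearly half" and "two-thirds" figures correspond to sentence_rate with max_errors of 0 and 1 respectively, and the 87–93% figures are values of attachment_accuracy on the held-out test set.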