Sparsity in Dependency Grammar Induction

Gillenwater, Jennifer; Ganchev, Kuzman; Graca, Joao V; Pereira, Fernando; Taskar, Ben

Files

acl10.pdf (282.21 KB)

Penn collection

Departmental Papers (CIS)

Subject

Computer Sciences

Permalink

https://repository.upenn.edu/handle/20.500.14332/6606

Author

Gillenwater, Jennifer

Ganchev, Kuzman

Graca, Joao V

Pereira, Fernando

Taskar, Ben

Abstract

A strong inductive bias is essential in unsupervised grammar induction. We explore a particular sparsity bias in dependency grammars that encourages a small number of unique dependency types. Specifically, we investigate sparsity-inducing penalties on the posterior distributions of parent-child POS tag pairs in the posterior regularization (PR) framework of Graça et al. (2007). In experiments with 12 languages, we achieve substantial gains over the standard expectation maximization (EM) baseline, with average improvement in attachment accuracy of 6.3%. Further, our method outperforms models based on a standard Bayesian sparsity-inducing prior by an average of 4.9%. On English in particular, we show that our approach improves on several other state-of-the-art techniques.
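
To make the idea of a sparsity penalty on parent-child POS tag pairs concrete, here is a minimal sketch. The abstract does not give the exact form of the penalty, so this assumes an L1/L-infinity style measure: for each parent-child tag pair, take the largest posterior probability assigned to any edge of that type, then sum over pairs. The function name, the input layout, and the posterior values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def sparsity_penalty(edge_posteriors):
    """Sum over (parent_tag, child_tag) pairs of the max edge posterior.

    edge_posteriors: iterable of ((parent_tag, child_tag), probability),
    one entry per candidate dependency edge in the corpus.
    This is an assumed L1/L-infinity style penalty, not taken from the abstract.
    """
    max_by_pair = defaultdict(float)
    for pair, prob in edge_posteriors:
        # Track the largest posterior seen for each tag pair.
        max_by_pair[pair] = max(max_by_pair[pair], prob)
    return sum(max_by_pair.values())

# Hypothetical posteriors for two nouns, each with two candidate parents.
spread = [
    (("VB", "NN"), 0.5), (("DT", "NN"), 0.5),  # sentence 1
    (("VB", "NN"), 0.5), (("IN", "NN"), 0.5),  # sentence 2
]
concentrated = [
    (("VB", "NN"), 1.0), (("DT", "NN"), 0.0),  # sentence 1
    (("VB", "NN"), 1.0), (("IN", "NN"), 0.0),  # sentence 2
]

spread_penalty = sparsity_penalty(spread)
concentrated_penalty = sparsity_penalty(concentrated)
print(spread_penalty, concentrated_penalty)
```

Under this assumed penalty, posteriors that reuse a single dependency type score lower than posteriors that spread mass over several types, which is the kind of bias toward few unique dependency types the abstract describes.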

Publication date

2010-07-01

Comments

Sparsity in Dependency Grammar Induction (http://www.cis.upenn.edu/%7Etaskar/pubs/acl10.pdf), J. Gillenwater (http://www.seas.upenn.edu/%7Ejengi), K. Ganchev (http://www.seas.upenn.edu/%7Ekuzman/), J. Graca (http://www.cis.upenn.edu/%7Egraca/), F. Pereira (http://www.cis.upenn.edu/%7Epereira), and B. Taskar. Association for Computational Linguistics (ACL) (http://acl2010.org/), Uppsala, Sweden, July 2010. © 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Collection

Reports