Sparsity in Dependency Grammar Induction

Penn collection
Departmental Papers (CIS)
Subject
Computer Sciences
Author
Gillenwater, Jennifer
Ganchev, Kuzman
Graca, Joao V
Pereira, Fernando
Abstract

A strong inductive bias is essential in unsupervised grammar induction. We explore a particular sparsity bias in dependency grammars that encourages a small number of unique dependency types. Specifically, we investigate sparsity-inducing penalties on the posterior distributions of parent-child POS tag pairs in the posterior regularization (PR) framework of Graça et al. (2007). In experiments with 12 languages, we achieve substantial gains over the standard expectation maximization (EM) baseline, with an average improvement in attachment accuracy of 6.3%. Further, our method outperforms models based on a standard Bayesian sparsity-inducing prior by an average of 4.9%. On English in particular, we show that our approach improves on several other state-of-the-art techniques.
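
To make the idea of a sparsity penalty on parent-child POS tag pairs more concrete, the following Python sketch computes one possible penalty of this kind. It is an illustration only: the abstract does not give the exact form of the penalty, so the choice made here (for each tag pair, take the largest posterior probability of any edge of that type, then sum over pairs) is an assumption, as are the function name `tag_pair_sparsity_penalty` and the input layout. A small value means few distinct dependency types are used with high probability.

```python
from collections import defaultdict


def tag_pair_sparsity_penalty(sentences):
    """Sum, over (parent tag, child tag) pairs, of the largest edge posterior.

    ASSUMPTION: this max-then-sum form is illustrative; the abstract only
    says the penalty acts on posteriors of parent-child POS tag pairs.

    `sentences` is a list of (tags, edge_posteriors) tuples, where `tags`
    is a list of POS tags and `edge_posteriors` maps
    (parent_index, child_index) to a posterior probability.
    """
    max_by_pair = defaultdict(float)
    for tags, edge_posteriors in sentences:
        for (parent_idx, child_idx), prob in edge_posteriors.items():
            pair = (tags[parent_idx], tags[child_idx])
            # Keep only the strongest evidence seen so far for this pair.
            max_by_pair[pair] = max(max_by_pair[pair], prob)
    return sum(max_by_pair.values())


# Hypothetical toy corpus: two sentences with made-up edge posteriors.
corpus = [
    (["DT", "NN", "VB"], {(2, 1): 0.9, (1, 0): 0.8, (2, 0): 0.1}),
    (["NN", "VB"], {(1, 0): 0.7}),
]
penalty = tag_pair_sparsity_penalty(corpus)
print(round(penalty, 6))
```

In this toy corpus three distinct tag pairs appear (VB→NN, NN→DT, VB→DT), and the second VB→NN edge adds nothing because its posterior is below the maximum already recorded for that pair. Under this assumed form, reusing an existing dependency type is free, while introducing a new one raises the penalty.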

Publication date
2010-07-01
Comments
Sparsity in Dependency Grammar Induction (http://www.cis.upenn.edu/%7Etaskar/pubs/acl10.pdf), J. Gillenwater (http://www.seas.upenn.edu/%7Ejengi), K. Ganchev (http://www.seas.upenn.edu/%7Ekuzman/), J. Graca (http://www.cis.upenn.edu/%7Egraca/), F. Pereira (http://www.cis.upenn.edu/%7Epereira), and B. Taskar. Association for Computational Linguistics (ACL) (http://acl2010.org/), Uppsala, Sweden, July 2010. © 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.