Word Alignment via Quadratic Assignment

Lacoste-Julien, Simon; Taskar, Ben; Klein, Dan; Jordan, Michael

dc.contributor.author	Lacoste-Julien, Simon
dc.contributor.author	Taskar, Ben
dc.contributor.author	Klein, Dan
dc.contributor.author	Jordan, Michael
dc.date	2023-05-17T07:09:31.000
dc.date.accessioned	2023-05-22T12:49:16Z
dc.date.available	2023-05-22T12:49:16Z
dc.date.issued	2006-01-01
dc.date.submitted	2012-07-16T09:52:12-07:00
dc.description.abstract	Recently, discriminative word alignment methods have achieved state-of-the-art accuracies by extending the range of information sources that can be easily incorporated into aligners. The chief advantage of a discriminative framework is the ability to score alignments based on arbitrary features of the matching word tokens, including orthographic form, predictions of other models, lexical context and so on. However, the proposed bipartite matching model of Taskar et al. (2005), despite being tractable and effective, has two important limitations. First, it is limited by the restriction that words have fertility of at most one. More importantly, first-order correlations between consecutive words cannot be directly captured by the model. In this work, we address these limitations by enriching the model form. We give estimation and inference algorithms for these enhancements. Our best model achieves a relative AER reduction of 25% over the basic matching formulation, outperforming intersected IBM Model 4 without using any overly compute-intensive features. By including predictions of other models as features, we achieve an AER of 3.8 on the standard Hansards dataset.
dc.description.comments	Simon Lacoste-Julien, Ben Taskar, Dan Klein, and Michael I. Jordan. 2006. Word alignment via quadratic assignment. In Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (HLT-NAACL '06). Association for Computational Linguistics, Stroudsburg, PA, USA, 112-119. DOI=10.3115/1220835.1220850 http://dx.doi.org/10.3115/1220835.1220850 © ACM, 2006. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (2006), http://doi.acm.org/10.3115/1220835.1220850. Email permissions@acm.org
dc.identifier.uri	https://repository.upenn.edu/handle/20.500.14332/6591
dc.legacy.articleid	1568
dc.legacy.fulltexturl	https://repository.upenn.edu/cgi/viewcontent.cgi?article=1568&context=cis_papers&unstamped=1
dc.source.issue	532
dc.source.journal	Departmental Papers (CIS)
dc.source.status	published
dc.subject.other	Computer Sciences
dc.title	Word Alignment via Quadratic Assignment
dc.type	Presentation
digcom.identifier	cis_papers/532
digcom.identifier.contextkey	3097439
digcom.identifier.submissionpath	cis_papers/532
digcom.type	conference
dspace.entity.type	Publication
relation.isAuthorOfPublication	48084f74-55a3-43da-96d7-8a01c512b3b9
relation.isAuthorOfPublication.latestForDiscovery	48084f74-55a3-43da-96d7-8a01c512b3b9
upenn.schoolDepartmentCenter	Departmental Papers (CIS)

Files

Original bundle

Name: naacl06_qap.pdf
Size: 127.98 KB
Format: Adobe Portable Document Format

Collection

Presentations
