Models of Co-occurrence

Melamed, I. Dan

Penn collection

IRCS Technical Reports Series

Permalink

https://repository.upenn.edu/handle/20.500.14332/37658

Author

Melamed, I. Dan

Abstract

A model of co-occurrence in bitext is a boolean predicate that indicates whether a given pair of word tokens co-occur in corresponding regions of the bitext space. Co-occurrence is a precondition for the possibility that two tokens might be mutual translations. Models of co-occurrence are the glue that binds methods for mapping bitext correspondence with methods for estimating translation models into an integrated system for exploiting parallel texts. Different models of co-occurrence are possible, depending on the kind of bitext map that is available, the language-specific information that is available, and the assumptions made about the nature of translational equivalence. Although most statistical translation models are based on models of co-occurrence, modeling co-occurrence correctly is more difficult than it may at first appear.
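To make the definition concrete, here is a minimal sketch of one possible co-occurrence model, written as a boolean predicate over token pairs. It assumes the simplest kind of bitext map, a one-to-one alignment of segments, and treats two tokens as co-occurring when they fall in aligned segments. The names `Token`, `cooccur`, and `candidate_pairs` are hypothetical and chosen for illustration; they are not taken from the report, which also considers other kinds of bitext maps.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    word: str
    segment: int  # index of the aligned segment containing this token (assumed 1-to-1 alignment)


def cooccur(u: Token, v: Token) -> bool:
    """Assumed segment-based model: tokens co-occur iff their segments are aligned."""
    return u.segment == v.segment


def candidate_pairs(source: List[Token], target: List[Token]) -> List[Tuple[Token, Token]]:
    """Token pairs that pass the co-occurrence precondition for being mutual translations."""
    return [(u, v) for u in source for v in target if cooccur(u, v)]


# Toy bitext with two aligned segments on each side.
source = [Token("the", 0), Token("house", 0), Token("is", 1), Token("red", 1)]
target = [Token("la", 0), Token("maison", 0), Token("est", 1), Token("rouge", 1)]

pairs = candidate_pairs(source, target)
```

In this toy setup, "house" and "maison" co-occur, while "house" and "rouge" do not, so only the former pair would be passed on to a translation model estimator. A different bitext map or different assumptions about translational equivalence would call for a different predicate in place of `cooccur`.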

Publication date

1998-02-01

Comments

University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-98-05.

Collection

Reports