Graphical Models for Primarily Unsupervised Sequence Labeling

Parikh, Neal; Dredze, Mark

Files

MS_CIS_07_18.pdf (270.63 KB)

Penn collection

Technical Reports (CIS)

Permalink

https://repository.upenn.edu/handle/20.500.14332/7584

Author

Parikh, Neal

Dredze, Mark

Abstract

Most models used in natural language processing must be trained on large corpora of labeled text. This tutorial explores a "primarily unsupervised" approach, based on graphical models, that augments a corpus of unlabeled text with some form of prior domain knowledge but does not require any fully labeled examples. We survey probabilistic graphical models for (supervised) classification and sequence labeling, then present in detail the prototype-driven approach of Haghighi and Klein (2006) to sequence labeling, including a discussion of the theory and implementation of both conditional random fields and prototype learning. We show experimental results for English part-of-speech tagging.

Publication date

2007-01-01

Comments

University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-07-18.

Collection

Reports