Ganchev, Kuzman

Email Address
Research Projects
Organizational Units
Research Interests

Search Results

Now showing 1 - 5 of 5
  • Publication
    Posterior Regularization for Learning with Side Information and Weak Supervision
    (2010-12-22) Ganchev, Kuzman
    Supervised machine learning techniques have been very successful for a variety of tasks and domains including natural language processing, computer vision, and computational biology. Unfortunately, their use often requires creation of large problem-specific training corpora that can make these methods prohibitively expensive. At the same time, we often have access to external problem-specific information that we cannot alway easily incorporate. We might know how to solve the problem in another domain (e.g. for a different language); we might have access to cheap but noisy training data; or a domain expert might be available who would be able to guide a human learner much more efficiently than by simply creating an IID training corpus. A key challenge for weakly supervised learning is then how to incorporate such kinds of auxiliary information arising from indirect supervision. In this thesis, we present Posterior Regularization, a probabilistic framework for structured, weakly supervised learning. Posterior Regularization is applicable to probabilistic models with latent variables and exports a language for specifying constraints or preferences about posterior distributions of latent variables. We show that this language is powerful enough to specify realistic prior knowledge for a variety applications in natural language processing. Additionally, because Posterior Regularization separates model complexity from the complexity of structural constraints, it can be used for structured problems with relatively little computational overhead. We apply Posterior Regularization to several problems in natural language processing including word alignment for machine translation, transfer of linguistic resources across languages and grammar induction. Additionally, we find that we can apply Posterior Regularization to the problem of multi-view learning, achieving particularly good results for transfer learning. We also explore the theoretical relationship between Posterior Regularization and other proposed frameworks for encoding this kind of prior knowledge, and show a close relationship to Constraint Driven Learning as well as to Generalized Expectation Constraints.
  • Publication
    Better Alignments = Better Translations?
    (2008-06-16) Ganchev, Kuzman; Taskar, Ben; Graca, Joao V
    Automatic word alignment is a key step in training statistical machine translation systems. Despite much recent work on word alignment methods, alignment accuracy increases often produce little or no improvements in machine translation quality. In this work we analyze a recently proposed agreement-constrained EM algorithm for unsupervised alignment models. We attempt to tease apart the effects that this simple but effective modification has on alignment precision and recall trade-offs, and how rare and common words are affected across several language pairs. We propose and extensively evaluate a simple method for using alignment models to produce alignments better-suited for phrase-based MT systems, and show significant gains (as measured by BLEU score) in end-to-end translation systems for six languages pairs used in recent MT competitions.
  • Publication
    Multi-View Learning over Structured and Non-Identical Outputs
    (2008-07-09) Ganchev, Kuzman; Graca, Joao V; Taskar, Ben; Blitzer, John
    In many machine learning problems, labeled training data is limited but unlabeled data is ample. Some of these problems have instances that can be factored into multiple views, each of which is nearly sufficient in determining the correct labels. In this paper we present a new algorithm for probabilistic multi-view learning which uses the idea of stochastic agreement between views as regularization. Our algorithm works on structured and unstructured problems and easily generalizes to partial agreement scenarios. For the full agreement case, our algorithm minimizes the Bhattacharyya distance between the models of each view, and performs better than CoBoosting and two-view Perceptron on several at and structured classification problems.
  • Publication
    Learning Tractable Word Alignment Models with Complex Constraints
    (2010-03-10) Ganchev, Kuzman; Graça, João V.; Taskar, Ben
    Word-level alignment of bilingual text is a critical resource for a growing variety of tasks. Probabilistic models for word alignment present a fundamental trade-off between richness of captured constraints and correlations versus efficiency and tractability of inference. In this article, we use the Posterior Regularization framework (Graça, Ganchev, and Taskar 2007) to incorporate complex constraints into probabilistic models during learning without changing the efficiency of the underlying model. We focus on the simple and tractable hidden Markov model, and present an efficient learning algorithm for incorporating approximate bijectivity and symmetry constraints. Models estimated with these constraints produce a significant boost in performance as measured by both precision and recall of manually annotated alignments for six language pairs. We also report experiments on two different tasks where word alignments are required: phrase-based machine translation and syntax transfer, and show promising improvements over standard methods.
  • Publication
    Dependency Grammar Induction via Bitext Projection Constraints
    (2009-08-02) Ganchev, Kuzman; Gillenwater, Jennifer; Taskar, Ben
    Broad-coverage annotated treebanks necessary to train parsers do not exist for many resource-poor languages. The wide availability of parallel text and accurate parsers in English has opened up the possibility of grammar induction through partial transfer across bitext. We consider generative and discriminative models for dependency grammar induction that use word-level alignments and a source language parser (English) to constrain the space of possible target trees. Unlike previous approaches, our framework does not require full projected parses, allowing partial, approximate transfer through linear expectation constraints on the space of distributions over trees. We consider several types of constraints that range from generic dependency conservation to language-specific annotation rules for auxiliary verb analysis. We evaluate our approach on Bulgarian and Spanish CoNLL shared task data and show that we consistently outperform unsupervised methods and can outperform supervised learning for limited training data.