Taskar, Ben
Publication: Movie/Script: Alignment and Parsing of Video and Text Transcription (2008-10-01)
Cour, Timothee; Jordan, Chris; Miltsakaki, Eleni; Taskar, Ben
Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales "in the wild". Harvesting automatically labeled sequences of actions from video would enable the creation of large-scale and highly varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object tracking and action retrieval. We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels, and shots are reordered into a sequence of long continuous tracks or threads, which allows for more accurate tracking of people, actions and objects. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model, and we present a novel hierarchical dynamic programming algorithm that handles alignment and jump-limited reorderings in linear time. We present quantitative and qualitative results on movie alignment and parsing, and use the recovered structure to improve character naming and retrieval of common actions in several episodes of popular TV series.

Publication: Multi-View Learning over Structured and Non-Identical Outputs (2008-07-09)
Ganchev, Kuzman; Graca, Joao V; Taskar, Ben; Blitzer, John
In many machine learning problems, labeled training data is limited but unlabeled data is ample. Some of these problems have instances that can be factored into multiple views, each of which is nearly sufficient in determining the correct labels. In this paper we present a new algorithm for probabilistic multi-view learning which uses the idea of stochastic agreement between views as regularization. Our algorithm works on structured and unstructured problems and easily generalizes to partial agreement scenarios. For the full agreement case, our algorithm minimizes the Bhattacharyya distance between the models of each view, and performs better than CoBoosting and two-view Perceptron on several flat and structured classification problems.

Publication: Better Alignments = Better Translations? (2008-06-16)
Ganchev, Kuzman; Taskar, Ben; Graca, Joao V
Automatic word alignment is a key step in training statistical machine translation systems. Despite much recent work on word alignment methods, alignment accuracy increases often produce little or no improvement in machine translation quality. In this work we analyze a recently proposed agreement-constrained EM algorithm for unsupervised alignment models. We attempt to tease apart the effects that this simple but effective modification has on alignment precision and recall trade-offs, and how rare and common words are affected across several language pairs. We propose and extensively evaluate a simple method for using alignment models to produce alignments better suited for phrase-based MT systems, and show significant gains (as measured by BLEU score) in end-to-end translation systems for six language pairs used in recent MT competitions.

Publication: Joint covariate selection and joint subspace selection for multiple classification problems (2009-01-14)
Obozinski, Guillaume; Taskar, Ben; Jordan, Michael I
We address the problem of recovering a common set of covariates that are relevant simultaneously to several classification problems. Penalizing the sum of ℓ2-norms of the blocks of coefficients associated with each covariate across the different classification problems encourages similar sparsity patterns in all models. To take computational advantage of the sparsity of solutions at high regularization levels, we propose a blockwise path-following scheme that approximately traces the regularization path. As the regularization coefficient decreases, the algorithm maintains and updates concurrently a growing set of covariates that are simultaneously active for all problems. We also show how to use random projections to extend this approach to the problem of joint subspace selection, where multiple predictors are found in a common low-dimensional subspace. We present theoretical results showing that this random projection approach converges to the solution yielded by trace-norm regularization. Finally, we present a variety of experimental results exploring joint covariate selection and joint subspace selection, comparing the path-following approach to competing algorithms in terms of prediction accuracy and running time.
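The penalty described in the entry above, the sum of per-covariate ℓ2-norms across problems (an ℓ1/ℓ2-style group penalty), is easy to illustrate numerically. The NumPy sketch below shows the penalty value and a group soft-thresholding step that drives entire rows of coefficients to zero jointly; it is only a minimal illustration of the penalty's shared-sparsity effect, not the paper's blockwise path-following algorithm, and names such as joint_l1_l2_penalty and B are made up for this example.

    import numpy as np

    def joint_l1_l2_penalty(B):
        """Sum of l2 norms of the rows of B, where B[j, k] is the weight of
        covariate j in classification problem k. Each covariate's row is
        penalized as a block, so it is either jointly zero or jointly active
        across all problems."""
        return np.sum(np.linalg.norm(B, axis=1))

    def group_soft_threshold(B, lam):
        """Group soft-thresholding for the penalty above: shrink each
        covariate's row toward zero and drop it entirely once its l2 norm
        falls below lam."""
        norms = np.linalg.norm(B, axis=1, keepdims=True)
        scale = np.clip(1.0 - lam / np.maximum(norms, 1e-12), 0.0, None)
        return B * scale

    # Toy example: 5 covariates shared across 3 classification problems.
    rng = np.random.default_rng(0)
    B = rng.normal(size=(5, 3))
    print(joint_l1_l2_penalty(B))
    print(group_soft_threshold(B, lam=1.0))  # small-norm rows become exactly zero

Rows whose ℓ2-norm falls below the regularization level are zeroed across all problems at once, which is the shared-sparsity behavior the abstract describes.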
Publication: Online, self-supervised terrain classification via discriminatively trained submodular Markov random fields (2008-05-19)
Vernaza, Paul; Taskar, Ben; Lee, Daniel D
We present a novel approach to the task of autonomous terrain classification based on structured prediction. We consider the problem of learning a classifier that will accurately segment an image into "obstacle" and "ground" patches based on supervised input. Previous approaches to this problem have focused mostly on local appearance; typically, a classifier is trained and evaluated on a pixel-by-pixel basis, making an implicit assumption of independence in local pixel neighborhoods. We relax this assumption by modeling correlations between pixels in the submodular MRF framework. We show how both the learning and inference tasks can be implemented simply and efficiently: exact inference via an efficient max-flow computation, and learning via an averaged-subgradient method. Unlike most comparable MRF-based approaches, our method is suitable for real-time implementation on a robot. Experimental results demonstrate a marked increase in classification accuracy over standard methods, in addition to real-time performance.

Publication: Dependency Grammar Induction via Bitext Projection Constraints (2009-08-02)
Ganchev, Kuzman; Gillenwater, Jennifer; Taskar, Ben
Broad-coverage annotated treebanks necessary to train parsers do not exist for many resource-poor languages. The wide availability of parallel text and accurate parsers in English has opened up the possibility of grammar induction through partial transfer across bitext. We consider generative and discriminative models for dependency grammar induction that use word-level alignments and a source-language parser (English) to constrain the space of possible target trees. Unlike previous approaches, our framework does not require full projected parses, allowing partial, approximate transfer through linear expectation constraints on the space of distributions over trees. We consider several types of constraints, ranging from generic dependency conservation to language-specific annotation rules for auxiliary verb analysis. We evaluate our approach on Bulgarian and Spanish CoNLL shared task data and show that we consistently outperform unsupervised methods and can outperform supervised learning with limited training data.
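The "dependency conservation" constraint mentioned in the last abstract rests on projecting source-side dependency edges through word alignments. The sketch below (plain Python; all names such as src_heads and conservation_score are hypothetical) projects edges through a one-to-one alignment and scores how many of them a candidate target parse keeps. It illustrates the projection idea only, not the paper's linear expectation constraints over distributions of trees.

    def conserved_dependencies(src_heads, alignment, tgt_len):
        """Project source dependency edges through a one-to-one word alignment
        and return the induced set of target-side (head, child) edges.
        src_heads[i] is the head index of source word i (-1 for the root);
        alignment maps source word indices to target word indices."""
        projected = set()
        for child, head in enumerate(src_heads):
            if head < 0:
                continue  # skip the root attachment
            if child in alignment and head in alignment:
                t_child, t_head = alignment[child], alignment[head]
                if 0 <= t_child < tgt_len and 0 <= t_head < tgt_len and t_child != t_head:
                    projected.add((t_head, t_child))
        return projected

    def conservation_score(tgt_heads, projected):
        """Fraction of projected edges that appear in a candidate target parse."""
        tgt_edges = {(h, c) for c, h in enumerate(tgt_heads) if h >= 0}
        return len(tgt_edges & projected) / max(len(projected), 1)

    # Toy example: a 3-word source sentence aligned word-for-word to a 3-word target.
    src_heads = [1, -1, 1]            # words 0 and 2 attach to word 1 (the root)
    alignment = {0: 0, 1: 1, 2: 2}    # hypothetical one-to-one alignment
    proj = conserved_dependencies(src_heads, alignment, tgt_len=3)
    print(conservation_score([1, -1, 1], proj))  # 1.0: every projected edge is conserved

A target parse that conserves many projected edges is rewarded; because the score is computed over whichever edges happen to project, no full projected parse is required, matching the partial-transfer setting the abstract describes.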