Learning from Multiple Sources

We consider the problem of learning accurate models from multiple sources of "nearby" data. Given distinct samples from multiple data sources and estimates of the dissimilarities between these sources, we provide a general theory of which samples should be used to learn models for each source. This theory is applicable in a broad decision-theoretic learning framework, and yields general results for classification and regression. A key component of our approach is the development of approximate triangle inequalities for expected loss, which may be of independent interest. We discuss the related problem of learning parameters of a distribution from multiple data sources. Finally, we illustrate our theory through a series of synthetic simulations.
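As an illustration of the sample-selection question the abstract raises, the following is a minimal sketch, not the paper's actual bound or algorithm. It assumes a hypothetical bound of the form (size-weighted average dissimilarity) + complexity / sqrt(total sample size), and picks the set of nearest sources that minimizes it. The function name `choose_sources`, the `complexity` constant, and the form of the bound are all assumptions made for this example.

```python
import math


def choose_sources(sizes, dissims, complexity=1.0):
    """Pick which sources to pool when learning a model for a target source.

    sizes[i]   -- number of samples from source i (assumed positive)
    dissims[i] -- estimated dissimilarity of source i to the target
                  (0.0 for the target itself)
    complexity -- hypothetical constant standing in for a model-complexity term

    Assumed, illustrative bound for pooling the k nearest sources:
        weighted average dissimilarity + complexity / sqrt(total samples)
    """
    # Consider sources from most to least similar to the target.
    order = sorted(range(len(sizes)), key=lambda i: dissims[i])

    best_k, best_bound = 0, math.inf
    total_n = 0
    weighted_eps = 0.0
    for k, i in enumerate(order, start=1):
        total_n += sizes[i]
        weighted_eps += sizes[i] * dissims[i]
        # Bias term grows as dissimilar data is added; variance term shrinks.
        bound = weighted_eps / total_n + complexity / math.sqrt(total_n)
        if bound < best_bound:
            best_k, best_bound = k, bound

    return order[:best_k], best_bound


# Example: the target (index 0), one close source, one distant source.
chosen, bound = choose_sources(sizes=[100, 100, 100],
                               dissims=[0.0, 0.01, 0.5])
print(chosen, round(bound, 4))
```

In this toy setting, adding the close source lowers the assumed bound because the larger sample outweighs the small dissimilarity, while adding the distant source raises it.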

Publication date

2008-06-01

Comments

Copyright 2008 MIT Press. Crammer, K., Kearns, M., and Wortman, J. 2008. Learning from Multiple Sources. J. Mach. Learn. Res. 9 (Jun. 2008), 1757-1774. Publisher URL: http://jmlr.csail.mit.edu/papers/v9/crammer08a.html

Collection

Articles