Metric learning with convex optimization

Kilian Quirin Weinberger, University of Pennsylvania


Many machine learning algorithms rely heavily on the existence of a good measure of (dis-)similarity between input vectors. One of the most commonly used measures of dissimilarity is the Euclidean distance in input space. This is often suboptimal in many ways. The Euclidean distance metric does not incorporate any side-information that might be available and it does not take advantage of the data structure or specifics of the machine learning goals. Ideally a metric should be learned for each specific task. Recent advances in numerical optimization provide us with a powerful tool for metric learning (and machine learning in general): Convex optimization. I will investigate two approaches to metric learning based on convex optimization for two different data scenarios: The first algorithm, Large Margin Nearest Neighbor (LMNN), operates in a supervised scenario. LMNN learns a metric specifically to improve k-nearest neighbors classification. This is achieved through a linear transformation of the input data that moves similarly labeled inputs close together and separates differently labeled inputs by a large margin. LMNN can be written as a semidefinite program that could be applied to large data sets with up to 60000 training samples. The second algorithm, Maximum Variance Unfolding (MVU), is designed for an unsupervised scenario. The algorithm finds a low dimensional Euclidean embedding of the data that preserves local distances while globally maximizing the variance. Similar to LMNN, MVU can also be phrased as a semidefinite program. This formulation gives local guarantees and distinguishes the algorithm from prior work.

Subject Area

Computer science

Recommended Citation

Weinberger, Kilian Quirin, "Metric learning with convex optimization" (2007). Dissertations available from ProQuest. AAI3271830.