Multi-View Dimensionality Reduction via Canonical Correlation Analysis

Loading...
Thumbnail Image
Penn collection
Statistics Papers
Degree type
Discipline
Subject
Statistics and Probability
Funder
Grant number
License
Copyright date
Distributor
Related resources
Author
Foster, Dean P
Kakade, Sham M
Zhang, Tong
Contributor
Abstract

We analyze the multi-view regression problem where we have two views X = (X(1),X(2)) of the input data and a target variable Y of interest. We provide sufficient conditions under which we can reduce the dimensionality of X (via a projection) without loosing predictive power of Y . Crucially, this projection can be computed via a Canonical Correlation Analysis only on the unlabeled data. The algorithmic template is as follows: with unlabeled data, perform CCA and construct a certain projection; with the labeled data, do least squares regression in this lower dimensional space. We show how, under certain natural assumptions, the number of labeled samples could be significantly reduced (in comparison to the single view setting) — in particular, we show how this dimensionality reduction does not loose predictive power of Y (thus it only introduces little bias but could drastically reduce the variance). We explore two separate assumptions under which this is possible and show how, under either assumption alone, dimensionality reduction could reduce the labeled sample complexity. The two assumptions we consider are a conditional independence assumption and a redundancy assumption. The typical conditional independence assumption is that conditioned on Y the views X(1) and X(2) are independent — we relax this assumption to be conditioned on some hidden state H the views X(1) and X(2) are independent. Under the redundancy assumption, we have that the best predictor from each view is roughly as good as the best predictor using both views.

Advisor
Date Range for Data Collection (Start Date)
Date Range for Data Collection (End Date)
Digital Object Identifier
Series name and number
Publication date
2008-01-01
Volume number
Issue number
Publisher
Publisher DOI
Journal Issue
Comments
Toyota Technical Institute-Chicago, Technical Report TTI-TR-2008-4 At the time of publication, author Sham M. Kakade was affiliated with Toyota Technological Institute at Chicago. Currently (August 2016), he is a faculty member at the Statistics Department at the University of Pennsylvania.
Recommended citation
Collection