Sparse Principal Component Analysis and Iterative Thresholding

Penn collection
Statistics Papers
Subject
dimension reduction
high-dimensional statistics
principal component analysis
principal subspace
sparsity
spiked covariance model
thresholding
Statistics and Probability
Author
Ma, Zongming
Abstract

Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features p is comparable to, or even much larger than, the sample size n. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
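The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way an iterative thresholding scheme for sparse PCA could look. It assumes orthogonal (power) iteration on the sample covariance matrix with an entrywise hard-thresholding step before each re-orthonormalization. The initialization from high-variance coordinates, the function names, the threshold `t = 1.5`, and the simulated single-spike data are all illustrative assumptions and should not be read as the paper's exact procedure or tuning.

```python
import numpy as np


def hard_threshold(M, t):
    """Set entries with absolute value at most t to zero (illustrative rule)."""
    return np.where(np.abs(M) > t, M, 0.0)


def initial_basis(S, k, m):
    """Assumed initialization: PCA restricted to the m highest-variance coordinates."""
    p = S.shape[0]
    idx = np.argsort(np.diag(S))[-m:]           # coordinates with largest sample variance
    _, vecs = np.linalg.eigh(S[np.ix_(idx, idx)])  # eigenvalues come back in ascending order
    Q0 = np.zeros((p, k))
    Q0[idx, :] = vecs[:, -k:]                   # embed the top-k eigenvectors in R^p
    return Q0


def thresholded_orthogonal_iteration(S, Q0, t, n_iter=30):
    """Orthogonal iteration on S with hard thresholding at each step."""
    Q = Q0
    for _ in range(n_iter):
        T = hard_threshold(S @ Q, t)  # multiply, then kill small entries
        Q, _ = np.linalg.qr(T)        # re-orthonormalize the columns
    return Q


# Simulated single-spike covariance model (all values are hypothetical):
# x_i = sqrt(lam) * u_i * v + z_i, with a sparse unit vector v.
rng = np.random.default_rng(0)
n, p, s, lam = 200, 400, 10, 10.0
v = np.zeros(p)
v[:s] = 1.0 / np.sqrt(s)                        # sparse leading eigenvector
scores = rng.standard_normal(n)
X = np.sqrt(lam) * np.outer(scores, v) + rng.standard_normal((n, p))
S = X.T @ X / n                                 # sample covariance (mean known to be zero)

Q0 = initial_basis(S, k=1, m=2 * s)
Q_hat = thresholded_orthogonal_iteration(S, Q0, t=1.5)

# |<q, v>| close to 1 means the estimated direction matches the true one.
alignment = abs(float(Q_hat[:, 0] @ v))
print(f"alignment with true eigenvector: {alignment:.3f}")
```

The thresholding step is what separates this from ordinary orthogonal iteration: by discarding small entries of `S @ Q`, the iterate stays concentrated on a few coordinates, which is the mechanism by which sparsity can help when p is comparable to or larger than n. In practice the threshold would need to be chosen from the data rather than fixed by hand as it is here.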

Publication date
2013-01-01
Journal title
Annals of Statistics