Statistics Papers

Document Type

Journal Article

Date of this Version

2013

Publication Source

Annals of Statistics

Volume

41

Issue

2

Start Page

772

Last Page

801

DOI

10.1214/13-AOS1097

Abstract

Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features p is comparable to, or even much larger than, the sample size n. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
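
As a rough illustration of the idea, the following Python sketch alternates a power-type multiplication by the sample covariance matrix, entrywise soft thresholding, and re-orthonormalization by QR. It is a generic sketch under stated assumptions, not the algorithm as specified in the paper: the abstract does not give the thresholding rule, the initialization, or the stopping criterion, so the function names, the fixed threshold of 1.5, the plain-PCA initializer, and the toy single-spike data are all hypothetical choices made for this example.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink every entry of x toward zero by t; entries with |x| <= t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def thresholded_orthogonal_iteration(S, Q0, threshold, n_iter=20):
    """Hypothetical sketch: multiply by S, threshold, then re-orthonormalize.

    S         : (p, p) sample covariance matrix
    Q0        : (p, k) initial orthonormal basis (assumed to be supplied)
    threshold : fixed soft-threshold level (an assumption of this sketch)
    """
    Q = Q0
    for _ in range(n_iter):
        T = S @ Q                          # power-type multiplication step
        T = soft_threshold(T, threshold)   # zero out small entries
        Q, _ = np.linalg.qr(T)             # orthonormal basis of the result
    return Q


# Toy data from a single-spike covariance I + 10 * v v^T with a sparse v.
rng = np.random.default_rng(0)
p, n, s = 200, 100, 10
v = np.zeros(p)
v[:s] = 1.0 / np.sqrt(s)                   # sparse leading eigenvector
u = rng.standard_normal((n, 1))            # latent factor scores
X = rng.standard_normal((n, p)) + np.sqrt(10.0) * u * v
S = X.T @ X / n                            # sample covariance (mean is zero)

# Initialize with ordinary PCA: the leading eigenvector of S.
_, eigvecs = np.linalg.eigh(S)             # eigenvalues in ascending order
Q0 = eigvecs[:, [-1]]

Q_hat = thresholded_orthogonal_iteration(S, Q0, threshold=1.5)
alignment = abs(float(v @ Q_hat[:, 0]))    # |cosine| between truth and estimate
support_size = int(np.sum(np.abs(Q_hat[:, 0]) > 1e-8))
print(f"alignment = {alignment:.3f}, nonzero entries = {support_size}")
```

In this toy setting the thresholding step suppresses most of the off-support coordinates that ordinary PCA leaves noisy, so the estimate is sparse; a fixed threshold is used here purely for simplicity.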

Keywords

dimension reduction, high-dimensional statistics, principal component analysis, principal subspace, sparsity, spiked covariance model, thresholding

Date Posted: 27 November 2017

This document has been peer reviewed.