Minimax Rate-Optimal Estimation of High-Dimensional Covariance Matrices with Incomplete Data

Penn collection
Statistics Papers
Discipline
Subject
adaptive thresholding
bandable covariance matrix
generalized sample covariance matrix
missing data
optimal rate of convergence
sparse covariance matrix
thresholding
Business
Business Analytics
Management Sciences and Quantitative Methods
Mathematics
Statistics and Probability
Author
Cai, T. Tony
Zhang, Anru
Abstract

Missing data occur frequently in a wide range of applications. In this paper, we consider estimation of high-dimensional covariance matrices in the presence of missing observations under a general missing-completely-at-random model, in which the missingness does not depend on the values of the data. Estimators for bandable and sparse covariance matrices based on incomplete data are proposed, and their theoretical and numerical properties are investigated. Minimax rates of convergence are established under the spectral norm loss, and the proposed estimators are shown to be rate-optimal under mild regularity conditions. Simulation studies demonstrate that the estimators perform well numerically. The methods are also illustrated through an application to data from four ovarian cancer studies. The key technical tools developed in this paper are of independent interest and potentially useful for a range of related problems in high-dimensional statistical inference with missing data.
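
The abstract does not state the estimators' formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a covariance matrix can be formed from incomplete data by averaging each entry over the samples in which both variables are observed, and it uses plain banding and hard thresholding as illustrative stand-ins for the bandable and sparse estimators. The function names `pairwise_covariance`, `band`, and `hard_threshold`, the NaN encoding of missing entries, and the parameters `k` and `lam` are all hypothetical choices made for this example.

```python
import numpy as np

def pairwise_covariance(X):
    """Covariance from pairwise-complete observations.

    X is an (n, p) array in which NaN marks a missing entry. Assumes every
    pair of variables is observed together in at least one row.
    """
    observed = ~np.isnan(X)                    # True where an entry is observed
    S = observed.astype(float)
    X0 = np.where(observed, X, 0.0)            # zero-fill so that sums skip missing entries
    n_i = S.sum(axis=0)                        # number of observations per variable
    means = X0.sum(axis=0) / n_i               # per-variable mean over observed entries
    centered = np.where(observed, X - means, 0.0)
    n_ij = S.T @ S                             # joint observation counts per pair
    return (centered.T @ centered) / n_ij

def band(sigma, k):
    """Keep entries within k of the diagonal and zero out the rest."""
    idx = np.arange(sigma.shape[0])
    mask = np.abs(idx[:, None] - idx[None, :]) <= k
    return np.where(mask, sigma, 0.0)

def hard_threshold(sigma, lam):
    """Zero out off-diagonal entries whose magnitude is below lam."""
    keep = (np.abs(sigma) >= lam) | np.eye(sigma.shape[0], dtype=bool)
    return np.where(keep, sigma, 0.0)

# Usage: simulate data, delete entries independently of their values, estimate.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
X[rng.random(X.shape) < 0.2] = np.nan          # about 20% of entries missing at random
sigma_hat = pairwise_covariance(X)
sigma_banded = band(sigma_hat, k=2)
sigma_sparse = hard_threshold(sigma_hat, lam=0.2)
```

With no missing entries, `pairwise_covariance` reduces to the usual sample covariance with divisor n. The bandwidth `k` and threshold `lam` are fixed by hand here; the keywords suggest the paper selects thresholds adaptively, which this sketch does not attempt.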

Publication date
2016-09-01