
Statistics Papers
Document Type
Technical Report
Date of this Version
9-2016
Publication Source
Journal of Multivariate Analysis
Volume
150
Start Page
55
Last Page
74
DOI
10.1016/j.jmva.2016.05.002
Abstract
Missing data occur frequently in a wide range of applications. In this paper, we consider estimation of high-dimensional covariance matrices in the presence of missing observations under a general missing completely at random model in the sense that the missingness is not dependent on the values of the data. Based on incomplete data, estimators for bandable and sparse covariance matrices are proposed and their theoretical and numerical properties are investigated.
Minimax rates of convergence are established under the spectral norm loss and the proposed estimators are shown to be rate-optimal under mild regularity conditions. Simulation studies demonstrate that the estimators perform well numerically. The methods are also illustrated through an application to data from four ovarian cancer studies. The key technical tools developed in this paper are of independent interest and potentially useful for a range of related problems in high-dimensional statistical inference with missing data.
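As a rough illustration of the setting described in the abstract, the sketch below computes a covariance estimate from incomplete data using only the jointly observed entries of each pair of variables, then applies simple banding and hard thresholding. It is a minimal sketch under stated assumptions, not the authors' estimators: the function names (`pairwise_covariance`, `band`, `hard_threshold`), the use of `None` to mark missing entries, the fixed bandwidth `k`, and the fixed threshold `level` are all hypothetical choices made for illustration. The exact definitions, weights, and tuning parameters should be taken from the paper itself.

```python
def pairwise_covariance(rows):
    """Covariance estimate from rows that use None for missing entries.

    Assumption: each variable is observed at least once and the
    missingness does not depend on the data values.
    """
    p = len(rows[0])
    # Mean of each variable over its observed entries only.
    means = []
    for j in range(p):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            raise ValueError(f"variable {j} is never observed")
        means.append(sum(observed) / len(observed))

    cov = [[0.0] * p for _ in range(p)]
    counts = [[0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            # Use only observations where both variables are present.
            pairs = [(r[i], r[j]) for r in rows
                     if r[i] is not None and r[j] is not None]
            counts[i][j] = len(pairs)
            if pairs:
                total = sum((a - means[i]) * (b - means[j]) for a, b in pairs)
                cov[i][j] = total / len(pairs)
    return cov, counts


def band(cov, k):
    """Keep entries within k of the diagonal (illustrative 0/1 weights)."""
    p = len(cov)
    return [[cov[i][j] if abs(i - j) <= k else 0.0 for j in range(p)]
            for i in range(p)]


def hard_threshold(cov, level):
    """Zero out off-diagonal entries with magnitude below a fixed level."""
    p = len(cov)
    return [[cov[i][j] if i == j or abs(cov[i][j]) >= level else 0.0
             for j in range(p)] for i in range(p)]


if __name__ == "__main__":
    data = [
        [1.0, 2.0, None],
        [2.0, None, 1.0],
        [3.0, 6.0, 5.0],
        [None, 4.0, 3.0],
    ]
    cov, counts = pairwise_covariance(data)
    print(cov)
    print(band(cov, 1))
    print(hard_threshold(cov, 1.5))
```

Banding suits variables with a natural ordering, where covariances decay away from the diagonal. Thresholding suits sparse matrices with no such ordering. The `counts` matrix records how many observations contribute to each entry, which is the quantity that replaces the usual sample size when data are missing.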
Copyright/Permission Statement
© 2016. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Keywords
adaptive thresholding, bandable covariance matrix, generalized sample covariance matrix, missing data, optimal rate of convergence, sparse covariance matrix, thresholding
Recommended Citation
Cai, T., & Zhang, A. (2016). Minimax Rate-Optimal Estimation of High-Dimensional Covariance Matrices with Incomplete Data. Journal of Multivariate Analysis, 150, 55-74. http://dx.doi.org/10.1016/j.jmva.2016.05.002
Included in
Business Analytics Commons, Management Sciences and Quantitative Methods Commons, Mathematics Commons, Statistics and Probability Commons
Date Posted: 25 October 2018
This document has been peer reviewed.