Journal of Multivariate Analysis
Missing data occur frequently in a wide range of applications. In this paper, we consider estimation of high-dimensional covariance matrices in the presence of missing observations under a general missing completely at random (MCAR) model, in which the missingness does not depend on the values of the data. Based on incomplete data, estimators for bandable and sparse covariance matrices are proposed, and their theoretical and numerical properties are investigated.
Minimax rates of convergence are established under the spectral norm loss and the proposed estimators are shown to be rate-optimal under mild regularity conditions. Simulation studies demonstrate that the estimators perform well numerically. The methods are also illustrated through an application to data from four ovarian cancer studies. The key technical tools developed in this paper are of independent interest and potentially useful for a range of related problems in high-dimensional statistical inference with missing data.
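To make the setting concrete, the following is a minimal sketch of how a covariance matrix might be estimated from incomplete data and then regularized. It is not the authors' implementation. It assumes missing entries are encoded as NaN, computes each entry from the rows where both coordinates are observed (in the spirit of the generalized sample covariance matrix named in the keywords), and then applies plain banding or universal hard thresholding as simple stand-ins for the bandable and sparse estimators. The function names, the bandwidth `k`, and the threshold `lam` are illustrative assumptions, and the exact weights and tuning rules used in the paper may differ.

```python
import numpy as np


def pairwise_covariance(X):
    """Entrywise covariance from an (n, p) array with np.nan for missing values.

    Assumption: every column is observed at least once, and every pair of
    columns is jointly observed at least once.
    Returns (cov, counts), where counts[j, k] is the number of rows in which
    both column j and column k are observed.
    """
    observed = ~np.isnan(X)
    S = observed.astype(float)
    X0 = np.where(observed, X, 0.0)

    # Column means over the observed entries only.
    means = X0.sum(axis=0) / S.sum(axis=0)

    # Center observed entries; missing entries contribute zero to the sums.
    centered = np.where(observed, X - means, 0.0)

    counts = S.T @ S
    cov = (centered.T @ centered) / counts
    return cov, counts


def band(cov, k):
    """Keep entries within k of the diagonal, zero out the rest (plain banding)."""
    idx = np.arange(cov.shape[0])
    mask = np.abs(idx[:, None] - idx[None, :]) <= k
    return np.where(mask, cov, 0.0)


def hard_threshold(cov, lam):
    """Zero out off-diagonal entries with magnitude below lam (universal threshold)."""
    out = np.where(np.abs(cov) >= lam, cov, 0.0)
    np.fill_diagonal(out, np.diag(cov))
    return out
```

With complete data, `pairwise_covariance` reduces to the usual sample covariance with a 1/n normalization. With missing entries, each element is averaged over a different number of rows, so the resulting matrix need not be positive semidefinite.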
© 2016. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
adaptive thresholding, bandable covariance matrix, generalized sample covariance matrix, missing data, optimal rate of convergence, sparse covariance matrix, thresholding
Cai, T., & Zhang, A. (2016). Minimax Rate-Optimal Estimation of High-Dimensional Covariance Matrices with Incomplete Data. Journal of Multivariate Analysis, 150, 55-74. http://dx.doi.org/10.1016/j.jmva.2016.05.002
Date Posted: 25 October 2018
This document has been peer reviewed.