Structured Matrix Completion with Applications to Genomic Data Integration

Penn collection
Statistics Papers
Degree type
Discipline
Subject
constrained minimization
genomic data integration
low-rank matrix
matrix completion
singular value decomposition
structured matrix completion
Applied Statistics
Business
Genetics and Genomics
Statistics and Probability
Author
Cai, Tianxi
Cai, T. Tony
Zhang, Anru
Abstract

Matrix completion has attracted significant recent attention in many fields, including statistics, applied mathematics, and electrical engineering. The current literature on matrix completion focuses primarily on independent sampling models, under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix is observed. We provide theoretical justification for the proposed SMC method and derive a lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite samples under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extents of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival. Supplementary materials for this article are available online.
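
The structured-missingness setting described in the abstract can be illustrated with a small numerical sketch. It is not the authors' SMC procedure. It only demonstrates a standard linear-algebra identity for the idealized, exactly low-rank case: if the matrix is partitioned into blocks A11, A12, A21 (observed) and A22 (missing), and A11 has the same rank as the full matrix, then A22 = A21 A11^+ A12, where A11^+ is the pseudoinverse. The function name `complete_missing_block`, the block sizes, the random test matrix, and the assumption that the rank is known are all hypothetical choices made for illustration.

```python
import numpy as np


def complete_missing_block(a11, a12, a21, rank):
    """Estimate the unobserved block A22 as A21 @ pinv_r(A11) @ A12.

    pinv_r(A11) is the pseudoinverse of the best rank-`rank` approximation
    of A11. `rank` is assumed known here (an illustrative simplification).
    """
    u, s, vt = np.linalg.svd(a11, full_matrices=False)
    # Keep only the leading `rank` singular triplets of A11.
    u_r, s_r, vt_r = u[:, :rank], s[:rank], vt[:rank, :]
    a11_pinv = vt_r.T @ np.diag(1.0 / s_r) @ u_r.T
    return a21 @ a11_pinv @ a12


# Build an exactly rank-2 matrix and hide its lower-right block.
rng = np.random.default_rng(0)
r, m1, m2, n1, n2 = 2, 6, 4, 5, 3
A = rng.standard_normal((m1 + m2, r)) @ rng.standard_normal((r, n1 + n2))

a11, a12 = A[:m1, :n1], A[:m1, n1:]   # fully observed rows
a21, a22 = A[m1:, :n1], A[m1:, n1:]   # a22 plays the role of the missing block

a22_hat = complete_missing_block(a11, a12, a21, rank=r)
max_abs_error = float(np.max(np.abs(a22_hat - a22)))
print("max abs error:", max_abs_error)
```

In this noise-free case the recovery is exact up to floating-point error. For approximately low-rank matrices the rank is not known and must be chosen from the data, which is the situation the abstract's theoretical results address.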

Publication date
2016-08-01