False Discovery Rate Control for High Dimensional Dependent Data with an Application to Large-Scale Genetic Association Studies

Penn collection
Statistics Papers
Subject
Physical Sciences and Mathematics
Author
Xie, Jichun
Cai, Tony
Maris, John
Li, Hongzhe
Abstract

Large-scale genetic association studies are increasingly used to identify novel genetic susceptibility variants for complex traits, but there is little consensus on how such data should be analyzed. The most commonly used methods are single-SNP analysis and haplotype analysis, each with Bonferroni correction for multiple comparisons. Because the SNPs in a typical GWAS are often in linkage disequilibrium (LD), at least locally, Bonferroni correction often leads to conservative error control and therefore lower statistical power. Motivated by the analysis of data from genetic association studies, we consider the problem of false discovery rate (FDR) control under the high dimensional multivariate normal model. Using the compound decision rule framework, we develop an optimal joint oracle procedure and propose a marginal procedure to approximate it. We show that the marginal plug-in procedure is asymptotically optimal under mild conditions. Our results indicate that the multiple testing procedure developed under the independence model is not only valid but also asymptotically optimal for high dimensional multivariate normal data under some weak dependence. We evaluate various procedures in simulation studies and demonstrate the application of the proposed procedure to a genome-wide association study of neuroblastoma (NB). The proposed procedure identified a few more genetic variants potentially associated with NB than the standard p-value-based FDR controlling procedure.
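
The abstract contrasts a marginal procedure with the standard p-value-based FDR controlling procedure but gives no formulas. The sketch below is not the authors' implementation. It illustrates one common form of each rule under assumptions that are not in the abstract: a two-group normal mixture for the z-scores with known parameters (`p0`, `alt_mean`, `alt_sd`), a step-up rule on the sorted local false discovery rates, and the Benjamini-Hochberg step-up rule as the p-value-based comparison. All function names are hypothetical. A plug-in version would replace the known parameters with estimates from the data.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, p0, alt_mean, alt_sd):
    """Marginal local FDR under an assumed two-group normal mixture.

    p0 is the assumed proportion of nulls; nulls are N(0, 1) and
    non-nulls are N(alt_mean, alt_sd^2). These are illustrative inputs.
    """
    null_part = p0 * normal_pdf(z)
    alt_part = (1.0 - p0) * normal_pdf(z, alt_mean, alt_sd)
    return null_part / (null_part + alt_part)


def lfdr_step_up(lfdr, alpha):
    """Reject the k smallest local FDR values, where k is the largest
    count whose running average stays at or below alpha."""
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    running_sum = 0.0
    k = 0
    for rank, i in enumerate(order, start=1):
        running_sum += lfdr[i]
        if running_sum / rank <= alpha:
            k = rank
    return sorted(order[:k])


def two_sided_pvalue(z):
    """Two-sided p-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvalues, alpha):
    """Benjamini-Hochberg step-up rule: reject the k smallest p-values,
    where k is the largest rank with p_(k) <= alpha * k / m."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Toy z-scores, made up for illustration only.
z_scores = [0.1, -0.4, 3.5, 4.2, 0.8, 2.9, -0.2, 5.0]
lfdr_values = [local_fdr(z, p0=0.8, alt_mean=3.0, alt_sd=1.0) for z in z_scores]
lfdr_rejections = lfdr_step_up(lfdr_values, alpha=0.1)
bh_rejections = benjamini_hochberg([two_sided_pvalue(z) for z in z_scores], alpha=0.1)
```

On a toy input this small, with a clear gap between large and small z-scores, the two rules select the same hypotheses. Differences of the kind reported in the abstract would be expected only on realistic data, where ranking by local FDR and ranking by p-value can disagree.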

Publication date
2010-01-01