Optimal Screening and Discovery of Sparse Signals with Applications to Multistage High-throughput Studies

Penn collection
Statistics Papers
Subject
adaptive design
classification
data screening
false discovery rate
false negative rate
phase transition
Business
Statistics and Probability
Author
Cai, Tony
Sun, Wenguang
Abstract

A common feature of large-scale scientific studies is that signals are sparse, so it is desirable to narrow the focus sequentially to a much smaller subset. In this paper, we consider two related data screening problems: one is to find the smallest subset that contains virtually all signals, and the other is to find the largest subset that contains essentially only signals. These screening problems are closely connected to, but distinct from, the more conventional signal detection and multiple testing problems. We develop data-driven screening procedures that control the error rates with near-optimality properties, and we study how to design experiments efficiently to achieve the goals of data screening. A class of new phase diagrams is developed to characterize the fundamental limitations in simultaneous inference. An application to multistage high-throughput studies is given to illustrate the merits of the proposed screening methods.
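The two screening goals in the abstract can be illustrated with a small sketch. This is not the procedure developed in the paper. It is a toy example that assumes a two-group Gaussian mixture with a known signal proportion `p` and a known signal mean `mu` (oracle quantities that a data-driven method would have to estimate), and it ranks observations by their local false discovery rate. All function names and parameter values are hypothetical.

```python
import math
import random


def normal_pdf(x, mean=0.0):
    """Density of a normal distribution with unit variance."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def local_fdr(x, p, mu):
    """Posterior probability that x is a null, under the ASSUMED mixture
    (1 - p) * N(0, 1) + p * N(mu, 1) with p and mu known."""
    null = (1.0 - p) * normal_pdf(x)
    alt = p * normal_pdf(x, mu)
    return null / (null + alt)


def screen_only_signals(lfdr, alpha):
    """Largest prefix of the lfdr ranking whose average lfdr
    (an estimate of the false discovery rate) is at most alpha."""
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    running, k = 0.0, 0
    for j, i in enumerate(order, start=1):
        running += lfdr[i]
        if running / j <= alpha:
            k = j
    return order[:k]


def screen_all_signals(lfdr, beta):
    """Smallest prefix of the lfdr ranking whose estimated share of
    missed signals is at most beta."""
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    total = sum(1.0 - v for v in lfdr)  # estimated number of signals
    captured = 0.0
    for j, i in enumerate(order, start=1):
        captured += 1.0 - lfdr[i]
        if total - captured <= beta * total:
            return order[:j]
    return order


# Simulated sparse data; the values of n, p, and mu are illustrative only.
rng = random.Random(0)
n, p, mu = 5000, 0.05, 3.0
is_signal = [rng.random() < p for _ in range(n)]
x = [rng.gauss(mu if s else 0.0, 1.0) for s in is_signal]

lfdr = [local_fdr(v, p, mu) for v in x]
keep = screen_only_signals(lfdr, alpha=0.10)   # "essentially only signals"
cover = screen_all_signals(lfdr, beta=0.10)    # "virtually all signals"

print(len(keep), len(cover))
```

Both sets are prefixes of the same ranking, so the sketch makes the trade-off visible: tightening `alpha` shrinks the first set, while tightening `beta` enlarges the second.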

Publication date
2017-01-01
Journal title
Journal of the Royal Statistical Society: Series B (Statistical Methodology)