Statistics Papers


The aim of statistical modeling is to empower effective decision making, and the unique contribution of the field is its ability to incorporate multiple levels of uncertainty in the framing of wise decisions. Over the last few years, the development of new computational tools and the unprecedented evolution of “big data” have propelled statistical modeling to new levels. Today statistical modeling and machine learning have reached a level of impact that no large organization can afford to ignore. The information landscape is changing as it has never changed before.

At Wharton, the Department of Statistics is proud to have had a leadership role in this development. It participates in a wide range of university consortia that span the fields of computer science, neuroscience, medicine, public policy, and finance. Moreover, our faculty members have won singular international recognition for their contributions to many parts of statistical science, including observational studies, statistical algorithms, game theory, high-dimensional inference, information theory, nonparametric function estimation, model selection, time series analysis, machine learning, and probability theory.

Papers from 2017

PDF

The Discrete Voronoi Game in ℝ², Aritra Banik, Bhaswar B. Bhattacharya, Sandip Das, and Satyaki Mukherjee

PDF

Weighted False Discovery Rate Control in Large-Scale Multiple Testing, Pallavi Basu, Tony Cai, Kiranmoy Das, and Wenguang Sun

PDF

Universal Limit Theorems in Graph Coloring Problems With Connections to Extremal Combinatorics, Bhaswar B. Bhattacharya, Persi Diaconis, and Sumit Mukherjee

PDF

Degree Sequence of Random Permutation Graphs, Bhaswar B. Bhattacharya and Sumit Mukherjee

PDF

Adaptive Estimation of Planar Convex Sets, Tony Cai, Adityanand Guntuboyina, and Yuting Wei

PDF

Confidence Intervals for High-Dimensional Linear Regression: Minimax Rates and Adaptivity, Tony Cai and Zijian Guo

PDF

Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix, Tony Cai, Tengyuan Liang, and Alexander Rakhlin

PDF

Optimal Screening and Discovery of Sparse Signals with Applications to Multistage High-throughput Studies, Tony Cai and Wenguang Sun

PDF

Stationary Gaussian Markov Processes as Limits of Stationary Autoregressive Time Series, Philip A. Ernst, Lawrence D. Brown, Larry Shepp, and Robert L. Wolpert

PDF

Mortality Rate Estimation and Standardization for Public Reporting: Medicare's Hospital Compare, Edward I. George, Veronika Ročková, Paul R. Rosenbaum, Ville A. Satopää, and Jeffrey H. Silber

PDF

Mediation Analysis for Count and Zero-Inflated Count Data without Sequential Ignorability and Its Application in Dental Studies, Zijian Guo, Dylan S. Small, Stuart A. Gansky, and Jing Cheng

PDF

An Exact Test of Fit for the Gaussian Linear Model using Optimal Nonbipartite Matching, Samuel D. Pimentel, Dylan S. Small, and Paul R. Rosenbaum

PDF

Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments, Eric M. Schwartz, Eric T. Bradlow, and Peter S. Fader

PDF

Explaining Normal Quantile-Quantile Plots Through Animation: The Water-Filling Analogy, Robert A. Stine

Papers from 2016

PDF

A Central Limit Theorem for Temporally Non-Homogenous Markov Chains with Applications to Dynamic Programming, Alessandro Arlotto and J. Michael Steele

PDF

Beardwood-Halton-Hammersley Theorem for Stationary Ergodic Sequences: A Counterexample, Alessandro Arlotto and J. Michael Steele

PDF

Almost Empty Monochromatic Triangles in Planar Point Sets, Deepan Basu, Kinjal Basu, Bhaswar B. Bhattacharya, and Sandip Das

PDF

Collision Times in Multicolor Urn Models and Sequential Graph Coloring With Applications to Discrete Logarithms, Bhaswar B. Bhattacharya

PDF

Disjoint Empty Convex Pentagons in Planar Point Sets, Bhaswar B. Bhattacharya and Sandip Das

PDF

High-Temperature Asymptotics of Orthogonal Mean-Field Spin Glasses, Bhaswar B. Bhattacharya and Subhabrata Sen

PDF

A Semiparametric Multivariate Partially Linear Model: A Difference Approach, Lawrence D. Brown, Michael Levine, and Lie Wang

PDF

Reflections on the Occasion of the 100th Anniversary of the Monthly Labor Review, Lawrence D. Brown, Lisa M. Lynch, and Constance F. Citro

PDF

Structured Matrix Completion with Applications to Genomic Data Integration, Tianxi Cai, T. Tony Cai, and Anru Zhang

PDF

Global Testing Against Sparse Alternatives in Time-Frequency Analysis, Tony Cai, Yonina C. Eldar, and Xiaodong Li

PDF

Accuracy Assessment for High-Dimensional Linear Regression, Tony Cai and Zijian Guo

PDF

Optimal Large-Scale Quantum State Tomography With Pauli Measurements, Tony Cai, Donggyu Kim, Yazhen Wang, Ming Yuan, and Harrison H. Zhou

PDF

Geometric Inference for General High-Dimensional Linear Inverse Problems, Tony Cai, Tengyuan Liang, and Alexander Rakhlin

PDF

Estimating Sparse Precision Matrix: Optimal Rates of Convergence and Adaptive Estimation, Tony Cai, Weidong Liu, and Harrison H. Zhou

PDF

Optimal Rates of Convergence for Noisy Sparse Phase Retrieval via Thresholded Wirtinger Flow, Tony Cai, Xiaodong Li, and Zongming Ma

PDF

Estimating Structured High-Dimensional Covariance and Precision Matrices: Optimal Rates and Adaptive Estimation, Tony Cai, Zhao Ren, and Harrison H. Zhou

PDF

Matrix Completion via Max-Norm Constrained Optimization, Tony Cai and Wen-Xin Zhou

PDF

Large-Scale Multiple Testing of Correlations, T. Tony Cai and Weidong Liu

PDF

Minimax and Adaptive Estimation of Covariance Operator for Random Variables Observed on a Lattice Graph, T. Tony Cai and Ming Yuan

PDF

Inference for High-Dimensional Differential Correlation Matrices, T. Tony Cai and Anru Zhang

PDF

Minimax Rate-Optimal Estimation of High-Dimensional Covariance Matrices with Incomplete Data, T. Tony Cai and Anru Zhang

PDF

Estimating an NBA Player’s Impact on His Team’s Chances of Winning, Sameer K. Deshpande and Shane T. Jensen

PDF

Patterns of Adherence to Oral Hypoglycemic Agents and Glucose Control among Primary Care Patients with Type 2 Diabetes, Heather F. de Vries McClintock, Knashawn H. Morales, Dylan S. Small, and Hillary R. Bogner

PDF

Sparse CCA: Adaptive Estimation and Computational Barriers, Chao Gao, Zongming Ma, and Harrison Zhou

PDF

Familywise Error Rate Control via Knockoffs, Lucas Janson and Weijie Su

PDF

Impartial Predictive Modeling: Ensuring Fairness in Arbitrary Models, Kory D. Johnson, Dean P. Foster, and Robert A. Stine

PDF

Instrumental Variables Estimation With Some Invalid Instruments and its Application to Mendelian Randomization, Hyunseung Kang, Anru Zhang, Tony Cai, and Dylan Small

PDF

Nonparametric Methods for Doubly Robust Estimation of Continuous Treatment Effects, Edward H. Kennedy, Zongming Ma, Matthew D. McHugh, and Dylan S. Small

PDF

Mouse Label-Retaining Cells are Molecularly and Functionally Distinct From Reserve Intestinal Stem Cells, Ning Li, Angela Nakauka-Ddamba, John Tobias, Shane T. Jensen, and Christopher J. Lengner

PDF

Heterogeneity in Readouts of Canonical Wnt Pathway Activity within Intestinal Crypts, Ning Li, Maryam Yousefi, Angela Nakauka-Ddamba, John W. Tobias, Shane T. Jensen, Edward E. Morrisey, and Christopher J. Lengner

PDF

Power Weighted Densities for Time Series Data, Daniel McCarthy and Shane T. Jensen

PDF

Efficient Empirical Bayes Prediction Under Check Loss Using Asymptotic Risk Estimates, Gourab Mukherjee, Lawrence D. Brown, and Paat Rusmevichientong

PDF

Sequential Selection of a Monotone Subsequence from a Random Permutation, Peichao Peng and J. Michael Steele

PDF

Constructed Second Control Groups and Attenuation of Unmeasured Biases, Samuel D. Pimentel, Dylan S. Small, and Paul R. Rosenbaum

PDF

Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity, Veronika Ročková and Edward I. George

PDF

The Spike-and-Slab LASSO, Veronika Ročková and Edward I. George

PDF

Bayes and Big Data: The Consensus Monte Carlo Algorithm, Steven L. Scott, Alexander W. Blocker, Fernando V. Bonassi, Hugh A. Chipman, Edward I. George, and Robert E. McCulloch

PDF

Comparison of the Value of Nursing Work Environments in Hospitals Across Different Levels of Patient Risk, Jeffrey H. Silber, Paul R. Rosenbaum, Matthew D. McHugh, Justin M. Ludwig, Herbert L. Smith, Bijan A. Niknam, Orit Even-Shoshan, Lee A. Fleisher, Rachel R. Kelz, and Linda H. Aiken

PDF

The Bruss-Robertson Inequality: Elaborations, Extensions, and Applications, J. Michael Steele

PDF

False Discoveries Occur Early on the Lasso Path, Weijie Su, Malgorzata Bogdan, and Emmanuel Candès

PDF

SLOPE is Adaptive to Unknown Sparsity and Asymptotically Minimax, Weijie Su and Emmanuel Candès

PDF

Nonparametric Multi-Level Clustering of Human Epilepsy Seizures, Drausin F. Wulsin, Shane T. Jensen, and Brian Litt

PDF

Optimal Shrinkage Estimation of Mean Parameters in Family of Distributions With Quadratic Variance, Xianchao Xie, Samuel C. Kou, and Lawrence D. Brown

PDF

Scanning a Poisson Random Field for Local Signals, Nancy R. Zhang, Benjamin Yakir, Charlie L. Xia, and David O. Siegmund

Papers from 2015

PDF

Potential Mechanisms for Cancer Resistance in Elephants and Comparative Cellular Response to DNA Damage in Humans, Lisa M. Abegglen, Aleah Fox Caulin, Ashley Chan, Kristy Lee, Rosann Robinson, Michael S. Campbell, Wendy K. Kiso, Dennis L. Schmitt, Peter J. Waddell, Srividya Bhaskara, Shane T. Jensen, Carlo C. Maley, and Joshua D. Schiffman

PDF

A Spectral Algorithm for Latent Dirichlet Allocation, Anima Anandkumar, Dean P. Foster, Daniel Hsu, Sham Kakade, and Yi-Kai Liu

PDF

OpenWAR: An Open Source System for Evaluating Overall Player Performance in Major League Baseball, Benjamin S. Baumer, Shane T. Jensen, and Gregory J. Matthews

PDF

Twitter Event Networks and the Superstar Model, Shankar Bhamidi, J. Michael Steele, and Tauhid Zaman

PDF

Exact and Asymptotic Results on Coarse Ricci Curvature of Graphs, Bhaswar B. Bhattacharya and Sumit Mukherjee

PDF

SLOPE – Adaptive Variable Selection via Convex Optimization, Malgorzata Bogdan, Ewout Van Den Berg, Chiara Sabatti, Weijie Su, and Emmanuel Candès

PDF

Models as Approximations - A Conspiracy of Random Regressors and Model Deviations Against Classical Inference in Regression, Andreas Buja, Richard A. Berk, Lawrence D. Brown, Edward I. George, Emil Pitkin, Mikhail Traskin, Linda Zhao, and Kai Zhang

PDF

Robust and Computationally Feasible Community Detection in the Presence of Arbitrary Outlier Nodes, Tony Cai and Xiaodong Li

PDF

Law of Log Determinant of Sample Covariance Matrix and Optimal Estimation of Differential Entropy for High-Dimensional Gaussian Distributions, T. Tony Cai, Tengyuan Liang, and Harrison H. Zhou

PDF

Optimal Estimation and Rank Detection for Sparse Spiked Covariance Matrices, T. Tony Cai, Zongming Ma, and Yihong Wu

PDF

Allele-Specific Copy Number Profiling by Next-Generation DNA Sequencing, Hao Chen, John M. Bell, Nicolas A. Zavala, Hanlee P. Ji, and Nancy R. Zhang

PDF

Graph-Based Change-Point Detection, Hao Chen and Nancy Zhang

PDF

Emergence of Hemagglutinin Mutations during the Course of Influenza Infection, Anna Cushing, Amanda Kamali, Mark Winters, Erik S. Hopmans, John M. Bell, Susan Grimes, Li C. Xia, Nancy R. Zhang, Ronald B. Moss, Mark Holodniy, and Hanlee P. Ji

PDF

Bayesian Integration of Genetics and Epigenetics Detects Causal Regulatory SNPs Underlying Expression Variability, Avinash Das, Michael Morley, Christine S. Moravec, W.H.W. Tang, Hakon Hakonarson, MAGNet Consortium, Kenneth B. Margulies, Thomas P. Cappola, Shane T. Jensen, and Sridhar Hannenhalli

PDF

A Brief Adherence Intervention that Improved Glycemic Control: Mediation by Patterns of Adherence, Heather F. de Vries McClintock, Knashawn H. Morales, Dylan S. Small, and Hillary R. Bogner

PDF

Neighborhood Social Environment and Patterns of Adherence to Oral Hypoglycemic Agents among Patients with Type 2 Diabetes Mellitus, Heather F. de Vries McClintock, Douglas J. Wiebe, Alison J. OʼDonnell, Knashawn H. Morales, Dylan S. Small, and Hillary R. Bogner

PDF

Disease Diagnosis From Immunoassays With Plate to Plate Variability: A Hierarchical Bayesian Approach, Oliver Entine, Dylan S. Small, Shane T. Jensen, Gerardo Sanchez Garcia, Milagros Bastos Mazuelos, Manuela R. Verastegui Pimentel, and Michael Z. Levy

PDF

Bayesian Hierarchical Regression on Clearance Rates in the Presence of "Lag" and "Tail" Phases with an Application to Malaria Parasites, Colin B. Fogarty, Michael P. Fay, Jennifer A. Flegg, Kasia Stepniewska, Rick M. Fairhurst, and Dylan S. Small

PDF

Risk Inflation of Sequential Tests Controlled by Alpha Investing, Dean P. Foster and Robert A. Stine

PDF

Supplement to "Minimax Estimation in Sparse Canonical Correlation Analysis", Chao Gao, Zongming Ma, Zhao Ren, and Harrison H. Zhou

PDF

Surrogate Markers for Time-Varying Treatments and Outcomes, Jesse Y. Hsu, Edward H. Kennedy, Jason A. Roy, Alisa J. Stephens-Shields, Dylan S. Small, and Marshall M. Joffe

PDF

Strong Control of the Familywise Error Rate in Observational Studies that Discover Effect Modification by Exploratory Methods, Jesse Y. Hsu, José R. Zubizarreta, Dylan S. Small, and Paul R. Rosenbaum

PDF

Medical Students in the Emergency Department and Patient Length of Stay, Kimon Ionnides, Mira Mamtani, Frances S. Shofer, Dylan S. Small, Sean Hennessey, Benjamin Abella, and Kevin Scott

PDF

CODEX: A Normalization and Copy Number Variation Detection Method for Whole Exome Sequencing, Yuchao Jiang, Derek A. Oldridge, Sharon J. Diskin, and Nancy R. Zhang

PDF

Optimal Restricted Estimation for More Efficient Longitudinal Causal Inference, Edward H. Kennedy, Marshall M. Joffe, and Dylan S. Small

PDF

Discussion of "Frequentist Coverage of Adaptive Nonparametric Bayesian Credible Sets", Mark G. Low and Zongming Ma

PDF

Computational Barriers in Minimax Submatrix Detection, Zongming Ma and Yihong Wu

PDF

Robust Dimension Free Isoperimetry in Gaussian Space, Elchanan Mossel and Joe Neeman

PDF

Robust Optimality of Gaussian Noise Stability, Elchanan Mossel and Joe Neeman

PDF

Empirical Bayes Prediction for the Multivariate Newsvendor Loss Function, Gourab Mukherjee, Lawrence D. Brown, and Paat Rusmevichientong

PDF

Hospital-Based Acute Care Use in Survivors of Septic Shock, Alexandra Ortego, David F. Gaieski, Barry D. Fuchs, Tiffanie Jones, Scott D. Halpern, Dylan S. Small, S. Cham Sante, Byron Drumheller, Jason D. Christie, and Mark E. Mikkelsen

PDF

Memory Acquisition and Retrieval Impact Different Epigenetic Processes that Regulate Gene Expression, Lucia L. Peixoto, Mathieu E. Wimmer, Shane G. Poplawski, Jennifer C. Tudor, Charles A. Kenworthy, Shichong Liu, Keiko Mizuno, Benjamin A. Garcia, Nancy R. Zhang, K. Peter Giese, and Ted Abel

PDF

Large, Sparse Optimal Matching with Refined Covariate Balance in an Observational Study of the Health Outcomes Produced by New Surgeons, Samuel D. Pimentel, Rachel R. Kelz, Jeffrey H. Silber, and Paul R. Rosenbaum

PDF

Sequential Complexities and Uniform Martingale Laws of Large Numbers, Alexander Rakhlin, Karthik Sridharan, and Ambuj Tewari

PDF

Some Counterclaims Undermine Themselves in Observational Studies, Paul R. Rosenbaum

PDF

Examining Causes of Racial Disparities in General Surgical Mortality: Hospital Quality Versus Patient Risk, Jeffrey H. Silber, Paul R. Rosenbaum, Rachel R. Kelz, Darrell J. Gaskin, Justin M. Ludwig, Richard N. Ross, Bijan A. Niknam, Alex Hill, Min Wang, Orit Even-Shoshan, and Lee A. Fleisher

PDF

Bias in Estimating the Causal Hazard Ratio When Using Two-Stage Instrumental Variable Methods, Fei Wan, Dylan S. Small, Justin E. Bekelman, and Nandita Mitra

PDF

Testing Differential Networks with Applications to Detection of Gene-Gene Interactions, Yin Xia, Tianxi Cai, and T. Tony Cai

PDF

Allelic Variation Contributes to Bacterial Host Specificity, Min Yue, Xiangan Han, Leon De Masi, Chunhong Zhu, Xun Ma, Junjie Zhang, Renwei Wu, Robert Schmieder, Radhey S. Kaushik, George P. Fraser, Shaohua Zhao, Patrick F. McDermott, François-Xavier Weill, Jacques G. Mainil, Cesar Arze, W. Florian Fricke, Robert A. Edwards, Dustin Brisson, Nancy R. Zhang, Shelley C. Rankin, and Dieter M. Schifferli

Papers from 2014

PDF

The Use of Bootstrapping when Using Propensity-Score Matching without Replacement: A Simulation Study, Peter C. Austin and Dylan S. Small

PDF

Minimum Enclosing Circle of a Set of Fixed Points and a Mobile Point, Aritra Banik, Bhaswar B. Bhattacharya, and Sandip Das

PDF

Misspecified Mean Function Regression: Making Good Use of Regression Models That Are Wrong, Richard A. Berk, Lawrence D. Brown, Andreas Buja, Edward I. George, Emil Pitkin, Kai Zhang, and Linda Zhao