Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)

Graduate Group

Epidemiology & Biostatistics

First Advisor

Dylan S. Small


Semiparametric doubly robust methods for causal inference help protect against bias due to model misspecification, while also reducing sensitivity to the curse of dimensionality (e.g., when high-dimensional covariate adjustment is necessary). However, doubly robust methods have not yet been developed in numerous important settings. In particular, standard semiparametric theory largely considers only independent and identically distributed samples and smooth parameters that can be estimated at classical root-n rates. In this dissertation we extend this theory and develop novel methodology for three settings outside these bounds: (1) matched cohort studies, (2) nonparametric dose-response estimation, and (3) complex high-dimensional effects with continuous instrumental variables. After giving an introduction in Chapter 1, we show in Chapter 2 that, for matched cohort studies, efficient and doubly robust estimators of effects on the treated are computationally equivalent to standard estimators that ignore the non-standard sampling. We also show that matched cohort studies are often more efficient than random sampling for estimating effects on the treated, and derive the optimal number of matches for given matching variables. We apply our methods in a study of the effect of hysterectomy on the risk of cardiovascular disease. In Chapter 3 we develop a novel approach for causal dose-response curve estimation that is doubly robust without requiring any parametric assumptions, and which naturally incorporates general off-the-shelf machine learning. We derive asymptotic properties for a kernel-based version of our approach and propose a data-driven method for bandwidth selection. The methods are used to study the effect of hospital nurse staffing on excess readmissions penalties.
In Chapter 4 we develop novel estimators of the local instrumental variable curve, which represents the treatment effect among compliers who would take treatment when the instrument passes some threshold. Our methods do not require parametric assumptions, allow for flexible data-adaptive estimation of effect modification, and are doubly robust. We derive asymptotic properties under weak conditions, and use the methods to study infant mortality effects of neonatal intensive care units with high versus low technical capacity, using travel time as an instrument.
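
For readers unfamiliar with the term, the double robustness property referred to above can be illustrated in the simplest textbook setting: a binary treatment and an average treatment effect estimated from independent and identically distributed data. The sketch below is not one of the dissertation's estimators. It is a standard augmented inverse-probability-weighted (AIPW) estimator applied to simulated data, and the data-generating process, the function names (`fit_logistic`, `fit_linear`, `aipw_ate`), and the sample size are all assumptions chosen for illustration. It suggests that the estimate stays close to the true effect when either the outcome model or the propensity model is misspecified, provided the other is correct.

```python
import numpy as np


def fit_logistic(X, a, n_iter=25):
    """Logistic regression by Newton-Raphson; returns the coefficient vector."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (a - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_linear(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def aipw_ate(y, a, pi_hat, mu1_hat, mu0_hat):
    """AIPW estimate of E[Y(1) - Y(0)] from fitted nuisance functions."""
    phi1 = mu1_hat + a * (y - mu1_hat) / pi_hat
    phi0 = mu0_hat + (1 - a) * (y - mu0_hat) / (1.0 - pi_hat)
    return float(np.mean(phi1 - phi0))


# Simulated data (assumed for illustration): one confounder x, true effect 2.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x)))
y = 2.0 * a + x + x**2 + rng.normal(size=n)

X_lin = np.column_stack([np.ones_like(x), x])
X_quad = np.column_stack([np.ones_like(x), x, x**2])

# Correctly specified nuisance models.
pi_ok = 1.0 / (1.0 + np.exp(-(X_lin @ fit_logistic(X_lin, a))))
mu1_ok = X_quad @ fit_linear(X_quad[a == 1], y[a == 1])
mu0_ok = X_quad @ fit_linear(X_quad[a == 0], y[a == 0])

# Misspecified nuisance models: both ignore the confounder entirely.
pi_bad = np.full(n, a.mean())
mu1_bad = np.full(n, y[a == 1].mean())
mu0_bad = np.full(n, y[a == 0].mean())

# Plug-in estimate from the wrong outcome model (a difference in means).
ate_naive = float(np.mean(mu1_bad - mu0_bad))
# AIPW with the outcome model wrong but the propensity model right.
ate_dr_bad_outcome = aipw_ate(y, a, pi_ok, mu1_bad, mu0_bad)
# AIPW with the propensity model wrong but the outcome model right.
ate_dr_bad_propensity = aipw_ate(y, a, pi_bad, mu1_ok, mu0_ok)

print("naive difference in means:   ", ate_naive)
print("AIPW, outcome model wrong:   ", ate_dr_bad_outcome)
print("AIPW, propensity model wrong:", ate_dr_bad_propensity)
```

In this simulation the naive difference in means is confounded by `x`, while both AIPW estimates should land near the true effect of 2. The settings treated in the dissertation (matched sampling, continuous treatments, and continuous instruments) fall outside this simple case and need the extended theory described in the abstract.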

Included in

Biostatistics Commons