Statistics Papers

Document Type

Journal Article

Publication Source

The Annals of Applied Statistics


Abstract

Understanding how effective high-level NICUs (neonatal intensive care units with the capacity for sustained mechanical assisted ventilation and high volume) are compared to low-level NICUs is important both for individual mothers and for public policy decisions. The goal of this paper is to estimate the effect on mortality of premature babies being delivered in a high-level NICU vs. a low-level NICU through an observational study in which there are unmeasured confounders as well as nonignorable missing covariates. We consider the use of excess travel time as an instrumental variable (IV) to control for unmeasured confounders. For an IV to be valid, we must condition on confounders of the IV-outcome relationship; for example, the month prenatal care started must be conditioned on for excess travel time to be a valid IV. However, the month prenatal care started is sometimes missing, and the missingness may be nonignorable because it is related to the mother’s/infant’s risk of complications, which is not fully measured. We develop a method to estimate the causal effect of a treatment using an IV when there are nonignorable missing covariates, as in our data, where we allow the missingness to depend on the fully observed outcome as well as the partially observed compliance class, which is a proxy for the unmeasured risk of complications. A simulation study shows that under our nonignorable missingness assumption, the commonly used estimation methods, complete-case analysis and multiple imputation by chained equations assuming missingness at random, provide biased estimates, while our method provides approximately unbiased estimates. We apply our method to the NICU study and find evidence that high-level NICUs significantly reduce deaths for babies of small gestational age, whereas for nearly mature babies, such as those born at 37 weeks, the level of the NICU makes little difference. A sensitivity analysis is conducted to assess the sensitivity of our conclusions to key assumptions about the missing covariates. The method we develop in this paper may be useful for many observational studies that, like ours, face unmeasured confounders and nonignorable missing data.
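
To illustrate why an instrument helps with unmeasured confounding, the following is a minimal sketch on synthetic data. It is not the method developed in the paper: it uses the textbook Wald (ratio) IV estimator with a binary instrument, and it ignores missing covariates and compliance classes entirely. All names (`simulate`, `naive_difference`, `wald_estimate`) and all numeric values (the assumed effect of -0.05 and the assignment probabilities) are hypothetical choices made for this illustration, not figures from the NICU study.

```python
import random


def simulate(n, true_effect=-0.05, seed=0):
    """Generate (z, d, y) triples from a toy model (all numbers are assumptions).

    z: binary instrument (think: short excess travel time)
    d: treatment (think: delivery at a high-level NICU)
    y: binary outcome (think: death)
    The confounder u (unmeasured risk) affects both d and y but is not returned.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)  # instrument, independent of u by construction
        u = int(rng.random() < 0.5)  # unmeasured risk of complications
        d = int(rng.random() < 0.2 + 0.4 * z + 0.3 * u)
        y = int(rng.random() < 0.10 + 0.10 * u + true_effect * d)
        rows.append((z, d, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Outcome rate among treated minus untreated, ignoring confounding."""
    treated = mean(y for _, d, y in rows if d == 1)
    untreated = mean(y for _, d, y in rows if d == 0)
    return treated - untreated


def wald_estimate(rows):
    """Wald IV estimator: effect of z on y divided by effect of z on d."""
    dy = mean(y for z, _, y in rows if z == 1) - mean(y for z, _, y in rows if z == 0)
    dd = mean(d for z, d, _ in rows if z == 1) - mean(d for z, d, _ in rows if z == 0)
    return dy / dd


if __name__ == "__main__":
    rows = simulate(200_000)
    print("naive difference:", naive_difference(rows))
    print("Wald IV estimate:", wald_estimate(rows))
```

In this toy model, higher-risk deliveries are more likely to be treated, so the naive contrast should be pulled toward zero relative to the assumed effect, while the Wald ratio should land close to it because the instrument is generated independently of the unmeasured risk. The paper's setting is harder: the instrument is valid only after conditioning on covariates, and some of those covariates are missing in a nonignorable way, which this sketch does not attempt to handle.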

Copyright/Permission Statement

The original published work is available at:


Keywords

instrumental variable, causal inference, sensitivity analysis, nonignorable missing data



Date Posted: 27 November 2017

This document has been peer reviewed.