Removing Strong Data Assumptions In Causal Inference Via Large-Scale Optimization

Degree type
Doctor of Philosophy (PhD)
Graduate group
Applied Mathematics
Discipline
Subject
Applied Mathematics
Biostatistics
Statistics and Probability
Copyright date
2022-10-05T20:22:00-07:00
Author
Heng, Siyu
Abstract

Many traditional and newly developed causal inference approaches require imposing strong data assumptions. If those assumptions are violated in practice, these approaches may be inapplicable, suffer from low statistical power, or lead to misleading causal conclusions. In this dissertation, we present three papers to show how large-scale optimization can sometimes aid in removing strong assumptions about the data generating process or the data collection procedure that are required by some existing causal inference approaches. The first and second papers show how large-scale optimization can sometimes help remove strong assumptions about the data generating process. In the first paper, a new adaptive approach is proposed to combine two test statistics in matched observational studies. The proposed adaptive approach asymptotically uniformly dominates both component test statistics in sensitivity analyses, regardless of the underlying data distribution. In the second paper, a model-free and finite-population-exact framework is proposed to analyze randomized experiments subject to outcome misclassification. This new framework is based on large-scale integer programming and can help researchers analyze such an experiment more comprehensively without imposing any additional assumptions on it. The third paper illustrates how large-scale optimization can help remove strong assumptions about the data collection procedure. Specifically, to study the effect of reducing malaria burden on the low birth weight rate in sub-Saharan Africa, a pair-of-pairs approach to a difference-in-differences study is proposed, which is built on optimal matching (a large-scale network flow problem) and cardinality matching (a large-scale integer programming problem). Unlike traditional difference-in-differences studies, this pair-of-pairs approach does not require either panel data or repeated cross-sectional data to be collected before the analysis stage.

Advisor
Dylan S. Small
Date of degree
2022-01-01