Improving Observational Causality Using Machine Learning
Discipline
Public Health
Statistics and Probability
Subject
Observational Causality
Quasi-experiments
Abstract
Causality is at the heart of many machine learning questions, whether we know it or not, and we need to incorporate causal reasoning explicitly to answer them effectively. By the same token, traditional causal inference methods can benefit from machine learning to adapt to more complex data domains. This thesis explores the interplay between observational causal inference and machine learning, focusing on improving different aspects of the causal inference study lifecycle. Namely, we develop methods that facilitate the discovery of new study opportunities, improve the feasibility of existing studies, and allow for better interpretation of the resulting causal estimates. As identifying causal inference opportunities is currently a manual process requiring human intuition, we first develop a scalable method for data-driven discovery of regression discontinuities, a class of observational causal inference methods. Next, we reframe observational study exclusion criteria as a well-posed machine learning task, increasing interpretability by characterizing the excluded units. Both our discovery and exclusion criteria methods explicitly maximize statistical power to increase study feasibility, and both are evaluated for their real-world efficacy on a medical claims dataset with over 60 million patients. Finally, we show the utility of incorporating machine learning into the causal study lifecycle through a large-scale study of the impact of civility in online social interactions. Through these works, we highlight not only how machine learning can improve causal inference in observational data settings but also the need to consider causality across traditional machine learning tasks.
Advisor
Kording, Konrad P.