Improving Observational Causality Using Machine Learning

Degree type
Doctor of Philosophy (PhD)
Graduate group
Computer and Information Science
Discipline
Computer Sciences
Public Health
Statistics and Probability
Subject
Machine Learning
Observational Causality
Quasi-experiments
Copyright date
01/01/2024
Author
Liu, Tong
Abstract

Causality is at the heart of many machine learning questions, whether or not we recognize it, and answering them effectively requires incorporating causal reasoning explicitly. By the same token, traditional causal inference methods can benefit from machine learning to adapt to more complex data domains. This thesis explores the interplay between observational causal inference and machine learning, focusing on improving different stages of the causal inference study lifecycle. Specifically, we develop methods that facilitate the discovery of new study opportunities, improve the feasibility of existing studies, and allow for better interpretation of the resulting causal estimates. Because identifying causal inference opportunities is currently a manual process that relies on human intuition, we first develop a scalable method for data-driven discovery of regression discontinuities, a class of observational causal inference methods. Next, we reframe the selection of observational study exclusion criteria as a well-posed machine learning task, increasing interpretability by characterizing the excluded units. Both our discovery and exclusion criteria methods explicitly aim to maximize statistical power to increase study feasibility, and both are evaluated for real-world efficacy on a medical claims dataset with over 60 million patients. Finally, we demonstrate the utility of incorporating machine learning into the causal study lifecycle through a large-scale study of the impact of civility in online social interactions. Through these works, we highlight not only how machine learning can improve causal inference in observational data settings but also the need to consider causality across traditional machine learning tasks.

Advisor
Ungar, Lyle H.
Kording, Konrad P.
Date of degree
2024