Improving Observational Causality Using Machine Learning

Liu, Tong

Files

Liu_upenngdas_0175C_16267.pdf (9.35 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Computer and Information Science

Discipline

Computer Sciences
Public Health
Statistics and Probability

Subject

Machine Learning
Observational Causality
Quasi-experiments

Copyright date

01/01/2024

Permalink

https://repository.upenn.edu/handle/20.500.14332/60083

Abstract

Causality is at the heart of many machine learning questions, whether we know it or not, and we need to explicitly incorporate causal reasoning in order to answer them effectively. By the same token, traditional causal inference methods can benefit from machine learning to adapt to more complex data domains. This thesis explores the interplay between observational causal inference and machine learning, focusing on improving different aspects of the causal inference study lifecycle. Namely, we develop methods that facilitate the discovery of new study opportunities, improve the feasibility of existing studies, and allow for better interpretation of the resulting causal estimates. As identifying causal inference opportunities is currently a manual process requiring human intuition, we first develop a scalable method for data-driven discovery of regression discontinuities, a class of observational causal inference methods. Next, we re-frame observational study exclusion criteria as a well-posed machine learning task, increasing interpretability by characterizing the excluded units. Both our discovery and exclusion criteria methods explicitly maximize statistical power to increase study feasibility, and both are evaluated for their real-world efficacy on a medical claims dataset with over 60 million patients. Finally, we show the utility of incorporating machine learning into the causal study lifecycle through a large-scale study of the impact of civility in online social interactions. Through these works, we highlight not only how machine learning can improve causal inference in observational data settings but also the need to consider causality across traditional machine learning tasks.

Advisor

Ungar, Lyle H.
Kording, Konrad P.

Date of degree

2024

Collection

Dissertations and Theses