Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)

Graduate Group


First Advisor

Amit Gandhi

Second Advisor

Xu Cheng


Recent advances in the machine learning literature provide a series of new algorithms that both address endogeneity and can be applied in high-dimensional environments; we call them MLIV. These algorithms are data-driven and exploit various forms of regularization to ameliorate the ill-posedness of the problem while maintaining functional form flexibility. In this thesis, we discuss how MLIV estimators can be used to answer economic questions.

In the first chapter, Causal Gradient Boosting: Boosted Instrumental Variables Regression, we propose an MLIV algorithm called boostIV that builds on the traditional gradient boosting algorithm and corrects for the endogeneity bias. The algorithm is intuitive and resembles an iterative version of the standard 2SLS estimator. The second chapter, Automatic Debiased Machine Learning in Presence of Endogeneity, introduces an approach for performing valid asymptotic inference on regular functionals of MLIV estimators. The approach is based on the construction of an orthogonal moment function that has a zero derivative with respect to the MLIV estimator. We develop a penalized GMM estimator of the bias correction term necessary to obtain asymptotically normal debiased estimates and derive its convergence rate. We also give conditions for root-n consistency and asymptotic normality of the debiased MLIV estimator of the functional of interest. Finally, in the third chapter, Flexible Demand Estimation using Machine Learning, we demonstrate how to estimate substitution patterns in the market for sodas using the debiasing procedure from the second chapter.

These three chapters are highly interconnected. The first chapter proposes a new MLIV algorithm for flexible estimation in the presence of endogenous regressors. However, it focuses on the underlying structural function, which in the majority of cases does not have a clear economic interpretation. The second chapter therefore develops a method to perform inference on functionals of MLIV estimators, which do have a clear economic interpretation and can be used to answer various economic questions of interest. Finally, the third chapter investigates an important applied question, the flexible estimation of demand for differentiated goods, which is a prime example of a high-dimensional problem with endogenous regressors. As a result, we get a full picture of the potential of MLIV methods in economics.
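
For readers unfamiliar with the standard 2SLS estimator that the abstract uses as a point of comparison for boostIV, the following is a minimal sketch of textbook linear 2SLS on simulated data. It is not the boostIV algorithm from the thesis. The function name `two_stage_least_squares`, the simulated data-generating process, and all variable names are assumptions made purely for illustration.

```python
import numpy as np


def two_stage_least_squares(y, X, Z):
    """Textbook linear 2SLS (illustrative helper, not from the thesis)."""
    # First stage: project the regressors X onto the instruments Z.
    first_stage_coef, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage_coef
    # Second stage: regress the outcome on the first-stage fitted values.
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta


# Simulated example (assumed design): u is an unobserved confounder that
# enters both the regressor x and the outcome y, so x is endogenous.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                  # instrument, independent of u
u = rng.normal(size=n)                  # unobserved confounder
x = z + u + rng.normal(size=n)          # endogenous regressor
y = 2.0 * x + u + rng.normal(size=n)    # true slope on x is 2.0

X = np.column_stack([np.ones(n), x])    # regressors with intercept
Z = np.column_stack([np.ones(n), z])    # instruments with intercept

beta_iv = two_stage_least_squares(y, X, Z)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print("OLS slope: ", beta_ols[1])
print("2SLS slope:", beta_iv[1])
```

In this simulated design the OLS slope is biased upward because `x` is correlated with the confounder, while the 2SLS slope should land close to the true value of 2.0.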

Included in

Economics Commons