Machine Learning Econometrics

Philippe Goulet Coulombe, University of Pennsylvania


Much of econometrics is based on a tight probabilistic approach to empirical modeling that dates back to Haavelmo (1944). This thesis explores a modern algorithmic view, and by doing so, finds solutions to classic problems while developing new avenues.

In the first chapter, Kalman-filter-based computations of random walk coefficients are replaced by a closed-form solution second only to least squares in the pantheon of simplicity.

In the second chapter, random walk “drifting” coefficients are themselves dismissed. Rather, evolving coefficients are modeled and forecast with a powerful machine learning algorithm. Conveniently, this generalization of time-varying parameters provides statistical efficiency and interpretability, which off-the-shelf machine learning algorithms cannot easily offer.

The third chapter is about the fundamental problem of detecting at which point a learner stops learning and starts imitating. It answers “why can’t Random Forest overfit?” The phenomenon is shown to be a surprising byproduct of randomized “greedy” algorithms – often deployed in the face of computational adversity. The insights are then used to develop new high-performing non-overfitting algorithms.