Date of Award

2018

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Graduate Group

Statistics

First Advisor

Abraham J. Wyner

Abstract

A random forest is a popular machine learning ensemble method that has proven successful in solving a wide range of classification problems. While other successful classifiers, such as boosting algorithms or neural networks, admit natural interpretations in terms of maximum likelihood estimation, a suitable statistical interpretation is much more elusive for a random forest. In the first part of this thesis, we demonstrate that a random forest is a fruitful framework in which to study AdaBoost and deep neural networks. We explore the concept and utility of interpolation, the ability of a classifier to perfectly fit its training data. In the second part of this thesis, we place a random forest on firmer statistical footing by framing it as kernel regression with the proximity kernel. We then analyze the parameters that control the bandwidth of this kernel and discuss useful generalizations.
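The proximity-kernel framing mentioned above can be illustrated with a short sketch. This is not the thesis's own code; it is a minimal illustration using scikit-learn (an assumption, since the abstract names no implementation): two points are "close" when many trees route them to the same leaf, and a Nadaraya-Watson-style weighted average with that kernel closely tracks the forest's prediction.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy 1-D regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def proximity(forest, X_query, X_train):
    """K[i, j] = fraction of trees placing X_query[i] and X_train[j] in the same leaf."""
    leaves_q = forest.apply(X_query)   # shape (n_query, n_trees): leaf index per tree
    leaves_t = forest.apply(X_train)   # shape (n_train, n_trees)
    return (leaves_q[:, None, :] == leaves_t[None, :, :]).mean(axis=2)

X_new = np.array([[0.5], [-1.0]])
K = proximity(forest, X_new, X)

# Kernel regression with the proximity kernel: proximity-weighted
# average of the training responses.
kernel_pred = (K @ y) / K.sum(axis=1)
forest_pred = forest.predict(X_new)
# kernel_pred and forest_pred are closely related; they differ slightly
# because the forest averages per-tree leaf means over bootstrap samples.
```

The bandwidth analogy the abstract alludes to: deeper trees produce smaller leaves, so fewer training points receive nonzero proximity weight, which plays the role of a narrower kernel bandwidth.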
