Essays On Random Forest Ensembles
Subject
deep learning
kernel methods
random forest
Artificial Intelligence and Robotics
Statistics and Probability
Abstract
A random forest is a popular machine learning ensemble method that has proven successful in solving a wide range of classification problems. While other successful classifiers, such as boosting algorithms or neural networks, admit natural interpretations as maximum likelihood procedures, a suitable statistical interpretation is much more elusive for a random forest. In the first part of this thesis, we demonstrate that a random forest is a fruitful framework in which to study AdaBoost and deep neural networks. We explore the concept and utility of interpolation, the ability of a classifier to perfectly fit its training data. In the second part of this thesis, we place a random forest on sounder statistical footing by framing it as kernel regression with the proximity kernel. We then analyze the parameters that control the bandwidth of this kernel and discuss useful generalizations.
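To make the kernel regression framing in the second part concrete, the following is a minimal sketch and not the thesis's own formulation. It assumes the common definition of the proximity kernel as the fraction of trees in which two points fall in the same leaf, and it uses a Nadaraya-Watson weighted average as the kernel regression estimate. The leaf assignments, targets, and the function names `proximity` and `kernel_regression` are hypothetical and chosen only for illustration.

```python
# Assumption: the proximity kernel is the fraction of trees in which
# two points land in the same leaf. The thesis may define it differently.

def proximity(leaves_a, leaves_b):
    """Fraction of trees in which two points share a leaf."""
    shared = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return shared / len(leaves_a)

def kernel_regression(query_leaves, train_leaves, train_targets):
    """Nadaraya-Watson estimate with proximity values as weights."""
    weights = [proximity(query_leaves, tl) for tl in train_leaves]
    total = sum(weights)
    if total == 0:
        raise ValueError("query shares no leaf with any training point")
    return sum(w * y for w, y in zip(weights, train_targets)) / total

# Made-up toy data: 3 trees, 4 training points.
# Each tuple lists the leaf index one point reaches in each tree.
train_leaves = [
    (0, 1, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 0, 0),
]
train_targets = [1.0, 1.0, 0.0, 0.0]
query_leaves = (0, 1, 1)

prediction = kernel_regression(query_leaves, train_leaves, train_targets)
print(prediction)
```

In this sketch, training points that share more leaves with the query receive more weight. Smaller leaves mean fewer points share a leaf with the query, which is one intuitive sense in which forest parameters can act like a kernel bandwidth.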