BART: Bayesian Additive Regression Trees

Penn collection
Statistics Papers
Degree type
Discipline
Subject
Bayesian backfitting
boosting
CART
classification
ensemble
MCMC
nonparametric regression
probit model
random basis
regularization
sum-of-trees model
variable selection
weak learner
Statistics and Probability
Author
Chipman, Hugh A
George, Edward I
McCulloch, Robert E
Abstract

We develop a Bayesian “sum-of-trees” model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian backfitting MCMC algorithm that generates samples from a posterior. Effectively, BART is a nonparametric Bayesian regression approach which uses dimensionally adaptive random basis elements. Motivated by ensemble methods in general, and boosting algorithms in particular, BART is defined by a statistical model: a prior and a likelihood. This approach enables full posterior inference including point and interval estimates of the unknown regression function as well as the marginal effects of potential predictors. By keeping track of predictor inclusion frequencies, BART can also be used for model-free variable selection. BART’s many features are illustrated with a bake-off against competing methods on 42 different data sets, with a simulation experiment and on a drug discovery classification problem.
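As a reading aid, the sum-of-trees model and one sweep of the backfitting sampler described in the abstract can be written out as follows. This is a sketch in commonly used notation, not text from this record: the symbols $m$ (number of trees), $T_j$ (tree structure), $M_j$ (terminal-node values), $g$ (the function a single tree represents), $\sigma$ (residual standard deviation), and $R_j$ (partial residual) are assumptions introduced here for illustration.

```latex
% Sum-of-trees model: the response is a sum of m small trees plus Gaussian noise.
\[
  Y = \sum_{j=1}^{m} g(x;\, T_j, M_j) + \varepsilon,
  \qquad \varepsilon \sim N(0, \sigma^2).
\]

% Bayesian backfitting: tree j is updated against the partial residual
% left over once the fits of all other trees are subtracted.
\[
  R_j = y - \sum_{k \neq j} g(x;\, T_k, M_k).
\]

% One MCMC sweep: draw each tree given its partial residual, then draw sigma.
\[
  (T_j, M_j) \mid R_j, \sigma \quad (j = 1, \dots, m),
  \qquad
  \sigma \mid T_1, M_1, \dots, T_m, M_m, y.
\]
```

Under this reading, the regularization prior keeps each $g(x; T_j, M_j)$ small, so no single tree dominates the fit, and averaging over the posterior draws gives the point and interval estimates the abstract mentions.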

Publication date
2010-01-01
Journal title
Annals of Applied Statistics