Statistics Papers

Document Type

Conference Paper

Date of this Version

2002

Publication Source

Proceedings of the Nineteenth International Conference on Machine Learning

Start Page

339

Last Page

346

Abstract

We investigate the explore/exploit trade-off in reinforcement learning using competitive analysis applied to an abstract model. We state and prove lower and upper bounds on the competitive ratio. The essential conclusion of our analysis is that optimizing the explore/exploit trade-off becomes much easier given a few extra pieces of knowledge, such as the stopping time or upper and lower bounds on the value of the optimal exploitation policy.
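As a toy illustration of the competitive-ratio notion in the abstract (not the paper's model), the sketch below compares a simple explore-then-exploit strategy on a deterministic two-armed bandit against an oracle that knows the best arm from the start; all names, payouts, and parameters here are hypothetical, and the point is only how knowing the stopping time (horizon) lets one size the exploration phase:

```python
# Toy sketch, not the paper's abstract model: the competitive ratio of a
# naive explore-then-exploit strategy on a deterministic two-armed bandit.
# All payouts and parameters are assumptions chosen for illustration.

def explore_then_exploit(payouts, horizon, explore_pulls):
    """Pull each arm `explore_pulls` times, then commit to the best-looking arm."""
    totals = [0.0] * len(payouts)
    reward = 0.0
    t = 0
    # Exploration phase: round-robin over all arms.
    for _ in range(explore_pulls):
        for arm, p in enumerate(payouts):
            totals[arm] += p
            reward += p
            t += 1
    # Exploitation phase: commit to the empirically best arm for the rest.
    best = max(range(len(payouts)), key=lambda a: totals[a])
    reward += (horizon - t) * payouts[best]
    return reward

payouts = [0.3, 0.8]   # deterministic per-pull rewards (assumed)
horizon = 100          # stopping time, assumed known to the learner

alg = explore_then_exploit(payouts, horizon, explore_pulls=5)
opt = horizon * max(payouts)   # oracle reward: always pull the best arm
ratio = opt / alg              # competitive ratio: OPT / ALG
print(f"ALG={alg:.1f}  OPT={opt:.1f}  ratio={ratio:.4f}")
```

Knowing the horizon is what makes a fixed exploration budget sensible here; with a longer horizon the same exploration cost is amortized further and the ratio approaches 1, which echoes the abstract's point that extra knowledge such as the stopping time simplifies the trade-off.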

Date Posted: 27 November 2017