
Statistics Papers
Document Type
Journal Article
Date of this Version
6-2002
Publication Source
Neural Networks
Volume
15
Issue
4-6
Start Page
549
Last Page
559
DOI
10.1016/S0893-6080(02)00048-5
Abstract
In the temporal difference model of primate dopamine neurons, their phasic activity reports a prediction error for future reward. This model is supported by a wealth of experimental data. However, in certain circumstances, the activity of the dopamine cells seems anomalous under the model, as they respond in particular ways to stimuli that are not obviously related to predictions of reward. In this paper, we address two important sets of anomalies, those having to do with generalization and novelty. Generalization responses are treated as the natural consequence of partial information; novelty responses are treated by the suggestion that dopamine cells multiplex information about reward bonuses, including exploration bonuses and shaping bonuses. We interpret this additional role for dopamine in terms of the mechanistic attentional and psychomotor effects of dopamine, having the computational role of guiding exploration.
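The abstract gives no equations, so the following is only a minimal sketch of the idea it describes: a temporal difference prediction error to which a bonus is added alongside the reward. The function names, the discount factor of 0.98, and the count-based form of the novelty bonus are illustrative assumptions, not taken from the paper. The shaping bonus is shown in the standard potential-based form, which may differ from the paper's exact formulation.

```python
def td_error(reward, value_now, value_next, gamma=0.98, bonus=0.0):
    """TD prediction error: reward (plus any bonus) + discounted next value - current value."""
    return reward + bonus + gamma * value_next - value_now


def novelty_bonus(visits, scale=1.0):
    """Hypothetical exploration bonus that decays as a stimulus becomes familiar.

    The 1 / (1 + visits) form is an assumption made for illustration only.
    """
    return scale / (1.0 + visits)


def shaping_bonus(phi_now, phi_next, gamma=0.98):
    """Potential-based shaping bonus: gamma * phi(next state) - phi(current state)."""
    return gamma * phi_next - phi_now


if __name__ == "__main__":
    # A stimulus that predicts no reward (all values zero) still yields a
    # positive prediction error while it is novel, and the error fades with exposure.
    for visits in (0, 1, 4, 9):
        delta = td_error(0.0, 0.0, 0.0, bonus=novelty_bonus(visits))
        print(f"visits={visits}: delta={delta:.3f}")
```

Under these assumptions, the error for an unrewarded stimulus equals the bonus itself, which is one way to read the abstract's claim that novelty responses need not reflect a prediction of reward.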
Copyright/Permission Statement
© 2002. This manuscript version is made available under the CC-BY-NC-ND 4.0 license.
Keywords
dopamine, reinforcement learning, exploration, temporal difference, generalization
Recommended Citation
Kakade, S., & Dayan, P. (2002). Dopamine: Generalization and Bonuses. Neural Networks, 15(4-6), 549-559. http://dx.doi.org/10.1016/S0893-6080(02)00048-5
Included in
Biochemistry, Biophysics, and Structural Biology Commons, Statistics and Probability Commons
Date Posted: 27 November 2017
This document has been peer reviewed.