
Management Papers
Title
Near-Term Liability of Exploitation: Exploration and Exploitation in Multistage Problems
Document Type
Journal Article
Date of this Version
5-2009
Publication Source
Organization Science
Volume
20
Issue
3
Start Page
538
Last Page
551
DOI
10.1287/orsc.1080.0376
Abstract
The classic trade-off between exploration and exploitation reflects the tension between gaining new information about alternatives to improve future returns and using the information currently available to improve present returns. By considering these issues in the context of a multistage, as opposed to a repeated, problem environment, we show that exploratory behavior has value quite apart from its role in revising beliefs. We show that even if current beliefs provide an unbiased characterization of the problem environment, maximizing with respect to these beliefs may lead to an inferior expected payoff relative to other mechanisms that make less aggressive use of the organization's beliefs. Search can lead to more robust actions in multistage decision problems than maximization, a benefit quite apart from its role in the updating of beliefs.
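As a rough illustration of the contrast the abstract draws between maximizing over current beliefs and a choice rule that uses those beliefs less aggressively, the following Python sketch compares a greedy (maximizing) choice with a softmax choice. It is not taken from the article: the function names, the temperature parameter, and the example values are assumptions made for illustration only, and the softmax form shown is the standard one from the reinforcement learning literature rather than a confirmed reproduction of the authors' model.

```python
import math
import random


def softmax_probabilities(values, temperature):
    """Turn believed action values into choice probabilities.

    `temperature` is a hypothetical tuning parameter: values near zero
    approach pure maximization, larger values approach uniform choice.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the largest value before exponentiating for numerical stability.
    peak = max(values)
    weights = [math.exp((v - peak) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def greedy_choice(values):
    """Pure exploitation: pick the index with the highest believed value."""
    return max(range(len(values)), key=lambda i: values[i])


def softmax_choice(values, temperature, rng=random):
    """Sample an action index in proportion to its softmax probability."""
    probs = softmax_probabilities(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    beliefs = [0.2, 0.5, 0.3]  # illustrative believed payoffs, not data from the article
    print("greedy pick:", greedy_choice(beliefs))
    print("softmax probabilities:", softmax_probabilities(beliefs, temperature=0.5))
```

With a very low temperature the softmax rule behaves almost like maximization, while a higher temperature spreads choices across alternatives; this is one common way to vary how aggressively beliefs are exploited.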
Keywords
exploration and exploitation, maximization, multistage problems, reinforcement learning, softmax choice rule
Recommended Citation
Fang, C., & Levinthal, D. A. (2009). Near-Term Liability of Exploitation: Exploration and Exploitation in Multistage Problems. Organization Science, 20(3), 538-551. http://dx.doi.org/10.1287/orsc.1080.0376
Date Posted: 27 November 2017
This document has been peer reviewed.