We consider a single-product revenue management problem where, given an initial inventory, the objective is to dynamically adjust prices over a finite sales horizon to maximize expected revenues. Realized demand is observed over time, but the underlying functional relationship between price and mean demand rate that governs these observations (otherwise known as the demand function or demand curve) is not known. We consider two instances of this problem: (i) a setting where the demand function is assumed to belong to a known parametric family with unknown parameter values; and (ii) a setting where the demand function is assumed to belong to a broad class of functions that need not admit any parametric representation. In each case we develop policies that learn the demand function “on the fly” and optimize prices based on what has been learned. The performance of these algorithms is measured in terms of the regret: the revenue loss relative to the maximal revenues that can be extracted when the demand function is known prior to the start of the selling season. We derive lower bounds on the regret that hold for any admissible pricing policy, and then show that our proposed algorithms achieve a regret that is “close” to this lower bound. The magnitude of the regret can be interpreted as the economic value of prior knowledge of the demand function, manifested as the revenue loss due to model uncertainty.
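To make the learn-then-optimize idea and the regret measure concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a simplified discrete-time model in which at most one unit sells per period with probability `demand_prob(price)`, a fixed grid of candidate prices, and a deterministic "fluid" benchmark restricted to that grid. The names `simulate_pricing`, `fluid_benchmark`, `explore_fraction`, and the linear demand curve are all hypothetical choices made for illustration.

```python
import random


def simulate_pricing(demand_prob, prices, inventory, horizon,
                     explore_fraction=0.2, seed=0):
    """Explore each candidate price briefly, then commit to the best estimate.

    Assumption: each period at most one unit sells, with probability
    demand_prob(price). Returns (realized revenue, price chosen to exploit).
    """
    rng = random.Random(seed)
    # Number of exploration periods spent on each candidate price.
    per_price = max(1, int(explore_fraction * horizon) // len(prices))
    sales = {p: 0 for p in prices}
    revenue, stock, t = 0.0, inventory, 0

    def sell_at(price):
        """Simulate one period at the given price; return units sold (0 or 1)."""
        nonlocal revenue, stock
        if stock > 0 and rng.random() < demand_prob(price):
            stock -= 1
            revenue += price
            return 1
        return 0

    # Exploration phase: observe demand at every price on the grid.
    for p in prices:
        for _ in range(per_price):
            if t >= horizon:
                break
            sales[p] += sell_at(p)
            t += 1

    # Exploitation phase: pick the price with the highest estimated revenue,
    # capping projected sales at the inventory that is left.
    remaining = horizon - t

    def estimated_revenue(p):
        rate = sales[p] / per_price
        return p * min(rate * remaining, stock)

    best = max(prices, key=estimated_revenue)
    for _ in range(remaining):
        sell_at(best)
    return revenue, best


def fluid_benchmark(demand_prob, prices, inventory, horizon):
    """Best fixed-price revenue on the grid when the demand curve is known."""
    return max(p * min(demand_prob(p) * horizon, inventory) for p in prices)


if __name__ == "__main__":
    demand = lambda p: max(0.0, 1.0 - p / 10.0)  # hypothetical linear demand
    grid = [2.0, 4.0, 6.0, 8.0]
    revenue, chosen = simulate_pricing(demand, grid, inventory=50, horizon=500)
    benchmark = fluid_benchmark(demand, grid, inventory=50, horizon=500)
    # Relative revenue loss versus the known-demand benchmark.
    print("chosen price:", chosen, "regret:", 1.0 - revenue / benchmark)
```

Because the sketch compares one random sample path against a deterministic benchmark, the printed regret can vary with the seed and is only a rough analogue of the expected-revenue regret described above.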
Keywords: revenue management, pricing, estimation, learning, exploration-exploitation, value of information, asymptotic analysis
Besbes, O., & Zeevi, A. (2009). Dynamic Pricing Without Knowing the Demand Function: Risk Bounds and Near-Optimal Algorithms. Operations Research, 57(6), 1407–1420. http://dx.doi.org/10.1287/opre.1080.0640
Date Posted: 27 November 2017
This document has been peer reviewed.