Operations, Information and Decisions Papers

Document Type

Journal Article

Date of this Version

7-2014

Publication Source

Operations Research

Volume

62

Issue

4

Start Page

864

Last Page

875

DOI

10.1287/opre.2014.1281

Abstract

We identify a rich class of finite-horizon Markov decision problems (MDPs) for which the variance of the optimal total reward can be bounded by a simple linear function of its expected value. The class is characterized by three natural properties: reward nonnegativity and boundedness, existence of a do-nothing action, and optimal action monotonicity. These properties are commonly present and typically easy to check. Implications of the class properties and of the variance bound are illustrated by examples of MDPs from operations research, operations management, financial engineering, and combinatorial optimization.

Keywords

Markov decision problems, variance bounds, optimal total reward

Date Posted: 27 November 2017

This document has been peer reviewed.