
Statistics Papers
Document Type
Journal Article
Date of this Version
11-2016
Publication Source
Mathematics of Operations Research
Volume
41
Issue
4
Start Page
1448
Last Page
1468
DOI
10.1287/moor.2016.0784
Abstract
We prove a central limit theorem for a class of additive processes that arise naturally in the theory of finite horizon Markov decision problems. The main theorem generalizes a classic result of Dobrushin (1956) for temporally non-homogeneous Markov chains, and the principal innovation is that here the summands are permitted to depend on both the current state and a bounded number of future states of the chain. We show through several examples that this added flexibility gives one a direct path to asymptotic normality of the optimal total reward of finite horizon Markov decision problems. The same examples also explain why such results are not easily obtained by alternative Markovian techniques such as enlargement of the state space.
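For orientation, the class of additive processes described in the abstract can be written schematically as below; the notation (the summands f_{n,k}, the window length m, and the triangular-array indexing) is assumed here for exposition and is not taken verbatim from the paper.

% Illustrative sketch of the additive functionals discussed in the abstract.
% Assumed notation: {X_k} is a temporally non-homogeneous Markov chain,
% m >= 0 is a fixed window length, and f_{n,k} are bounded reward functions.
\[
S_n \;=\; \sum_{k=1}^{n} f_{n,k}\bigl(X_k, X_{k+1}, \ldots, X_{k+m}\bigr),
\]
% Each summand may depend on the current state X_k together with a bounded
% number m of future states; Dobrushin's (1956) central limit theorem for
% non-homogeneous chains corresponds to the special case m = 0.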
Keywords
non-homogeneous Markov chain, central limit theorem, Markov decision problem, sequential decision, dynamic inventory management, alternating subsequence
Recommended Citation
Arlotto, A., & Steele, J. M. (2016). A Central Limit Theorem for Temporally Non-Homogenous Markov Chains with Applications to Dynamic Programming. Mathematics of Operations Research, 41 (4), 1448-1468. http://dx.doi.org/10.1287/moor.2016.0784
Date Posted: 27 November 2017
This document has been peer reviewed.