Abstract
Problem characteristics often have a significant influence on the difficulty of solving optimization problems. In this paper, we propose attributes for characterizing Markov Decision Processes (MDPs) and discuss how they affect the performance of reinforcement learning algorithms that use function approximation. The attributes primarily measure the amount of randomness in the environment; their values can be computed from the MDP model or estimated on-line. We show empirically that two of the proposed attributes have a statistically significant effect on the quality of learning. We also discuss how measurements of these attributes can be used to facilitate the design of reinforcement learning systems.
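To make the idea concrete, the following is a minimal sketch (not taken from the paper) of one way a randomness-measuring attribute could be computed: the average entropy of the next-state distribution, obtained either exactly from a known transition model or from on-line samples. The function names and the choice of entropy as the specific measure are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def transition_entropy(P):
    """Average entropy (in bits) of the next-state distribution over all
    state-action pairs of a finite MDP.

    P: array of shape (S, A, S) with P[s, a, s'] = Pr(s' | s, a).
    Higher values indicate a more stochastic environment.
    """
    eps = 1e-12                                   # avoid log(0)
    H = -(P * np.log2(P + eps)).sum(axis=-1)      # entropy per (s, a) pair
    return H.mean()

def estimated_transition_entropy(samples, n_states, n_actions):
    """On-line estimate of the same attribute from observed (s, a, s')
    transitions, using empirical transition counts."""
    counts = np.zeros((n_states, n_actions, n_states))
    for s, a, s_next in samples:
        counts[s, a, s_next] += 1
    totals = counts.sum(axis=-1, keepdims=True)
    # Unvisited (s, a) pairs fall back to a uniform next-state distribution.
    P_hat = np.divide(counts, totals,
                      out=np.full_like(counts, 1.0 / n_states),
                      where=totals > 0)
    return transition_entropy(P_hat)
```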
Keywords
- Optimal Policy
- Reinforcement Learning
- Markov Decision Process
- Reinforcement Learning Algorithm
- Reinforcement Learning Method
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ratitch, B., Precup, D. (2002). Characterizing Markov Decision Processes. In: Elomaa, T., Mannila, H., Toivonen, H. (eds) Machine Learning: ECML 2002. Lecture Notes in Computer Science, vol. 2430. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36755-1_33
DOI: https://doi.org/10.1007/3-540-36755-1_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44036-9
Online ISBN: 978-3-540-36755-0