Abstract
An option is a policy fragment that represents a solution to a frequently occurring subproblem in a domain. Options can be treated as temporally extended actions, which allows the solution to be reused when solving larger problems. In practice, however, the subproblems encountered are rarely exactly the same, and these differences, however small, must be accounted for in the reused policy. In this paper, the notion of options with exceptions is introduced to address such scenarios. It is inspired by the Ripple Down Rules approach used in the data mining and knowledge representation communities. The goal is to develop an option representation in which small changes to a subproblem's solution can be accommodated without losing the original solution. We empirically validate the proposed framework on a simulated game domain.
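To make the idea concrete, the following is a minimal, hypothetical sketch, not the representation developed in the paper: an option keeps its original base policy intact while Ripple-Down-Rules-style exceptions are layered on top of it and checked first. All class, method, and action names here are invented for illustration.

```python
# Illustrative sketch only (not the paper's algorithm): an option whose base
# policy is preserved, with an ordered list of exceptions consulted before it.

class OptionWithExceptions:
    """A temporally extended action: base policy plus an ordered exception list."""

    def __init__(self, initiation_set, base_policy, termination):
        self.initiation_set = initiation_set  # states where the option may be invoked
        self.base_policy = base_policy        # dict: state -> action (original solution)
        self.termination = termination        # callable: state -> bool
        self.exceptions = []                  # list of (condition, action), newest first

    def can_start(self, state):
        return state in self.initiation_set

    def add_exception(self, condition, action):
        """Patch the option for a slightly different subproblem without
        modifying the base policy (analogous to adding a ripple-down rule)."""
        self.exceptions.insert(0, (condition, action))

    def act(self, state):
        for condition, action in self.exceptions:  # exceptions take precedence
            if condition(state):
                return action
        return self.base_policy[state]             # otherwise, the original solution


# Toy usage: a 1-D corridor where the base policy always moves right,
# with one exception added for a newly introduced obstacle at state 2.
if __name__ == "__main__":
    corridor = range(5)
    option = OptionWithExceptions(
        initiation_set=set(corridor),
        base_policy={s: "right" for s in corridor},
        termination=lambda s: s == 4,
    )
    option.add_exception(lambda s: s == 2, "jump")
    print([option.act(s) for s in corridor])  # ['right', 'right', 'jump', 'right', 'right']
```

Because exceptions are consulted before the base policy rather than overwriting it, the original solution remains available unchanged, which is the property the abstract emphasizes.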
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sairamesh, M., Ravindran, B. (2012). Options with Exceptions. In: Sanner, S., Hutter, M. (eds.) Recent Advances in Reinforcement Learning. EWRL 2011. Lecture Notes in Computer Science, vol. 7188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29946-9_18
DOI: https://doi.org/10.1007/978-3-642-29946-9_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29945-2
Online ISBN: 978-3-642-29946-9