Abstract
We present an algorithmic approach to integrated learning and planning in predictive representations. The approach extends earlier work on predictive state representations to the setting of online exploration, allowing exploration of the domain to proceed in a goal-directed fashion and thus more efficiently. Our algorithm interleaves online learning of the model with estimation of the value function. The framework is applicable to a variety of important learning problems, including apprenticeship learning, model customization, and decision-making in non-stationary domains.
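Since only the abstract is available here, the following is a minimal, hypothetical sketch of the interleaved loop it describes: the agent learns a predictive model online while simultaneously estimating a value function, and uses the current value estimates to direct exploration toward the goal. All interface names (`env`, `model`, `goal_directed_online_learning`) are assumptions for illustration, not the authors' API, and a simple tabular Q-update stands in for whatever value-function estimator the paper actually uses.

```python
import random

def goal_directed_online_learning(env, model, num_episodes=100,
                                  epsilon=0.1, alpha=0.1, gamma=0.95):
    """Interleave online model learning with value-function estimation.

    Assumed (hypothetical) interfaces:
      env:   reset() -> obs, step(action) -> (obs, reward, done), .actions
      model: initial_state(obs) -> state, update(state, action, obs),
             next_state(state, action, obs) -> state
    States are assumed hashable (e.g., discretized predictive states)
    so they can index the tabular value estimates used in this toy sketch.
    """
    q = {}  # tabular Q-values over (predictive state, action) pairs
    for _ in range(num_episodes):
        obs = env.reset()
        state = model.initial_state(obs)
        done = False
        while not done:
            # Goal-directed exploration: mostly act greedily with respect to
            # the current value estimates, with occasional random actions.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q.get((state, a), 0.0))
            obs, reward, done = env.step(action)
            # Online model update from the newly observed transition.
            model.update(state, action, obs)
            nxt = model.next_state(state, action, obs)
            # Value-estimation step interleaved with the model update.
            best = 0.0 if done else max(q.get((nxt, a), 0.0)
                                        for a in env.actions)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best - old)
            state = nxt
    return q
```

The key point the sketch illustrates is the coupling: because actions are chosen greedily from the evolving value estimates, the data gathered for model learning concentrates on goal-relevant parts of the domain rather than on uniform exploration.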
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Ong, S.C.W., Grinberg, Y., Pineau, J. (2012). Goal-Directed Online Learning of Predictive Models. In: Sanner, S., Hutter, M. (eds.) Recent Advances in Reinforcement Learning. EWRL 2011. Lecture Notes in Computer Science, vol. 7188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29946-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29945-2
Online ISBN: 978-3-642-29946-9