Bayesian Reward Filtering

Geist, Matthieu; Pietquin, Olivier; Fricout, Gabriel

doi:10.1007/978-3-540-89722-4_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5323)

Included in the following conference series:

European Workshop on Reinforcement Learning

Abstract

A wide variety of function approximation schemes have been applied to reinforcement learning. However, Bayesian filtering approaches, which have been shown to be efficient in other fields such as neural network training, have received little attention. We propose a general Bayesian filtering framework for reinforcement learning, as well as a specific implementation based on sigma-point Kalman filtering and kernel machines. This allows us to derive an efficient off-policy, model-free approximate temporal-difference algorithm, which we demonstrate on two simple benchmarks.

Author information

Authors and Affiliations

Supélec, IMS Research Group, Metz, France
Matthieu Geist & Olivier Pietquin
MCE Department, ArcelorMittal Research, Maizières-lès-Metz, France
Matthieu Geist & Gabriel Fricout

Editor information

Editors and Affiliations

INRIA Lille-Nord Europe, 59650, Villeneuve d’Ascq, France
Sertan Girgin
INRIA, LIFL, CNRS, Université de Lille, Villeneuve d’Ascq, France
Manuel Loth, Rémi Munos, Philippe Preux & Daniil Ryabko

About this paper

Cite this paper

Geist, M., Pietquin, O., Fricout, G. (2008). Bayesian Reward Filtering. In: Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. (eds) Recent Advances in Reinforcement Learning. EWRL 2008. Lecture Notes in Computer Science, vol 5323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89722-4_8

DOI: https://doi.org/10.1007/978-3-540-89722-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89721-7
Online ISBN: 978-3-540-89722-4
eBook Packages: Computer Science, Computer Science (R0)
