Abstract
A wide variety of function approximation schemes have been applied to reinforcement learning. However, Bayesian filtering approaches, which have proven efficient in other fields such as neural network training, have received little attention. We propose a general Bayesian filtering framework for reinforcement learning, as well as a specific implementation based on sigma point Kalman filtering and kernel machines. This allows us to derive an efficient off-policy, model-free approximate temporal difference algorithm, which we demonstrate on two simple benchmarks.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Geist, M., Pietquin, O., Fricout, G. (2008). Bayesian Reward Filtering. In: Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. (eds) Recent Advances in Reinforcement Learning. EWRL 2008. Lecture Notes in Computer Science, vol 5323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89722-4_8
DOI: https://doi.org/10.1007/978-3-540-89722-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89721-7
Online ISBN: 978-3-540-89722-4
eBook Packages: Computer Science, Computer Science (R0)