Abstract
We consider linear fixed point equations and their approximations by projection on a low dimensional subspace. We derive new bounds on the approximation error of the solution, which are expressed in terms of low dimensional matrices and can be computed by simulation. When the fixed point mapping is a contraction, as is typically the case in Markovian decision processes (MDPs), one of our bounds is always sharper than the standard worst case bounds, and another one is often sharper. Our bounds also apply to the non-contraction case, including policy evaluation in MDPs with nonstandard projections that enhance exploration. To our knowledge, no error bounds are currently available for this case.
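As a rough illustration of the setting, the following NumPy sketch solves a projected equation on a small synthetic problem and checks the standard worst case bound that the abstract uses as its baseline, namely ‖x* − Φr‖ ≤ ‖x* − Πx*‖ / √(1 − α²) when ΠA has norm α < 1 in the weighted norm. It does not implement the paper's new bounds. The problem data (`P`, `g`, `Phi`, the discount `gamma`) and the helper names are assumptions made up for this example, and exact matrices are used where a simulation-based method would use estimates.

```python
import numpy as np

def projected_solution(A, b, Phi, xi):
    """Solve the projected equation Phi r = Pi (A Phi r + b) for r.

    Pi is the projection onto span(Phi) with respect to the weighted
    Euclidean norm ||x||_xi = sqrt(sum_i xi_i * x_i**2).
    """
    D = np.diag(xi)
    # Orthogonality condition Phi' D (Phi r - A Phi r - b) = 0,
    # which is a k-by-k linear system (k = number of basis vectors).
    C = Phi.T @ D @ (np.eye(len(b)) - A) @ Phi
    d = Phi.T @ D @ b
    return np.linalg.solve(C, d)

def weighted_norm(x, xi):
    return np.sqrt(np.sum(xi * x**2))

rng = np.random.default_rng(0)
n, k, gamma = 50, 3, 0.9  # hypothetical problem sizes and discount

# Hypothetical discounted policy evaluation problem: x = gamma * P x + g.
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)
g = rng.random(n)
A, b = gamma * P, g

# Use the invariant distribution of P as the projection weights.
evals, evecs = np.linalg.eig(P.T)
xi = np.real(evecs[:, np.argmax(np.real(evals))])
xi /= xi.sum()

Phi = rng.standard_normal((n, k))            # arbitrary basis (assumption)
x_star = np.linalg.solve(np.eye(n) - A, b)   # exact fixed point
x_bar = Phi @ projected_solution(A, b, Phi, xi)

# Explicit projection matrix, formed only to check the bound.
D = np.diag(xi)
Pi = Phi @ np.linalg.solve(Phi.T @ D @ Phi, Phi.T @ D)

# alpha = ||Pi A||_xi, the spectral norm of D^{1/2} (Pi A) D^{-1/2}.
s = np.sqrt(xi)
alpha = np.linalg.norm((s[:, None] * (Pi @ A)) / s[None, :], 2)

error = weighted_norm(x_star - x_bar, xi)
baseline = weighted_norm(x_star - Pi @ x_star, xi)
bound = baseline / np.sqrt(1.0 - alpha**2)
print(f"error={error:.4f}  baseline={baseline:.4f}  bound={bound:.4f}")
```

In this sketch the projection error `baseline` is a lower limit on `error` and `bound` is an upper limit; the gap between the two is what sharper bounds aim to narrow.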
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yu, H., Bertsekas, D.P. (2008). New Error Bounds for Approximations from Projected Linear Equations. In: Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. (eds) Recent Advances in Reinforcement Learning. EWRL 2008. Lecture Notes in Computer Science, vol 5323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89722-4_20
DOI: https://doi.org/10.1007/978-3-540-89722-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89721-7
Online ISBN: 978-3-540-89722-4
eBook Packages: Computer Science (R0)