Abstract
This paper compares single-agent reinforcement learning (RL) algorithms on the simple taxi problem domain and on an extended version of it, and multiagent RL algorithms on a multiagent extension of the simple taxi problem domain that we created. In particular, we extend the Policy Hill Climbing (PHC) and Win or Learn Fast-PHC (WoLF-PHC) algorithms by combining them with the MAXQ hierarchical decomposition, and we investigate their efficiency. The results for the multiagent domain are promising: they indicate that the two new algorithms are the most efficient among the algorithms we compared.
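For readers unfamiliar with the base algorithms named above, the following is a minimal tabular sketch of a WoLF-PHC update, written from the general description of the algorithm by Bowling and Veloso. It is not the authors' implementation, and it does not show the MAXQ combination, which the abstract does not describe in detail. The class name, method names, and default parameter values are all hypothetical choices made for illustration. Plain PHC corresponds to the special case where the two policy step sizes are equal.

```python
from collections import defaultdict


class WoLFPHC:
    """Tabular WoLF-PHC learner (illustrative sketch; all names are hypothetical)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.95,
                 delta_win=0.01, delta_lose=0.04):
        # Assumes n_actions >= 2 and delta_lose > delta_win ("learn fast" when losing).
        self.n = n_actions
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.pi = defaultdict(lambda: [1.0 / n_actions] * n_actions)
        self.avg_pi = defaultdict(lambda: [1.0 / n_actions] * n_actions)
        self.visits = defaultdict(int)

    def update(self, s, a, r, s_next):
        q, pi, avg = self.q[s], self.pi[s], self.avg_pi[s]

        # 1. Standard Q-learning update for the action that was taken.
        target = r + self.gamma * max(self.q[s_next])
        q[a] += self.alpha * (target - q[a])

        # 2. Incrementally track the average policy for this state.
        self.visits[s] += 1
        for i in range(self.n):
            avg[i] += (pi[i] - avg[i]) / self.visits[s]

        # 3. "Winning" if the current policy is worth more than the average one.
        v_pi = sum(p * x for p, x in zip(pi, q))
        v_avg = sum(p * x for p, x in zip(avg, q))
        delta = self.delta_win if v_pi > v_avg else self.delta_lose

        # 4. Hill-climb: shift probability mass toward the greedy action.
        best = max(range(self.n), key=q.__getitem__)
        moved = 0.0
        for i in range(self.n):
            if i != best:
                step = min(pi[i], delta / (self.n - 1))
                pi[i] -= step
                moved += step
        pi[best] += moved
```

Setting `delta_win` equal to `delta_lose` removes the win/lose distinction and leaves a plain PHC learner, so the same sketch covers both base algorithms mentioned in the abstract.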
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Lambrou, I., Vassiliades, V., Christodoulou, C. (2012). An Extension of a Hierarchical Reinforcement Learning Algorithm for Multiagent Settings. In: Sanner, S., Hutter, M. (eds) Recent Advances in Reinforcement Learning. EWRL 2011. Lecture Notes in Computer Science, vol 7188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29946-9_26
DOI: https://doi.org/10.1007/978-3-642-29946-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29945-2
Online ISBN: 978-3-642-29946-9
eBook Packages: Computer Science, Computer Science (R0)