Abstract
This paper presents a parallel reinforcement learning method that takes communication cost into account. In our method, each agent communicates only action sequences, at a constant episode interval. The longer the communication interval, the lower the communication cost, but also the lower the degree of parallelism. Implementing our method on a PC cluster, we investigate these trade-off characteristics. We show that the computation time required for learning can be reduced by properly adjusting the communication interval.
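The abstract describes the mechanism only at a high level; the following is a minimal Python sketch of what interval-based exchange of action sequences could look like in a profit-sharing setting. The toy environment, the credit-assignment function, and all names (run_episode, profit_sharing_update, COMM_INTERVAL) are illustrative assumptions, not the authors' implementation; a real PC-cluster version would replace the in-process loop with message passing between nodes.

```python
import random

# Hypothetical toy setup -- these constants and the chain environment are
# assumptions for illustration, not taken from the paper.
N_STATES, N_ACTIONS = 10, 4
GOAL = N_STATES - 1
GAMMA = 0.8            # geometric decay used for credit assignment
COMM_INTERVAL = 5      # exchange action sequences every 5 episodes (the tunable interval)
N_AGENTS = 4

def run_episode(weights, max_steps=50):
    """Run one epsilon-greedy episode in a toy chain task; return (sequence, reward)."""
    state, seq = 0, []
    for _ in range(max_steps):
        if random.random() < 0.2:
            action = random.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: weights[state][a])
        seq.append((state, action))
        state = min(state + 1, GOAL) if action == 0 else max(state - 1, 0)
        if state == GOAL:
            return seq, 1.0
    return seq, 0.0

def profit_sharing_update(weights, seq, reward):
    """Distribute the episode reward backwards along the action sequence."""
    credit = reward
    for state, action in reversed(seq):
        weights[state][action] += credit
        credit *= GAMMA

# Independent weight tables, one per (simulated) agent.
agents = [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(N_AGENTS)]
buffers = [[] for _ in range(N_AGENTS)]  # sequences accumulated since the last exchange

for episode in range(1, 101):
    for i, w in enumerate(agents):
        seq, r = run_episode(w)
        profit_sharing_update(w, seq, r)
        if r > 0:
            buffers[i].append((seq, r))
    # Communication step: only every COMM_INTERVAL episodes does each agent
    # replay the successful sequences collected by the other agents.
    if episode % COMM_INTERVAL == 0:
        for i, w in enumerate(agents):
            for j, buf in enumerate(buffers):
                if j != i:
                    for seq, r in buf:
                        profit_sharing_update(w, seq, r)
        buffers = [[] for _ in range(N_AGENTS)]
```

Increasing COMM_INTERVAL in this sketch reduces how often sequences cross agent (or node) boundaries, mirroring the trade-off studied in the paper: less communication, but slower propagation of good action sequences between learners.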
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fujishiro, T., Nakano, H., Miyauchi, A. (2006). Parallel Distributed Profit Sharing for PC Cluster. In: Kollias, S.D., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840817_84
DOI: https://doi.org/10.1007/11840817_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38625-4
Online ISBN: 978-3-540-38627-8
eBook Packages: Computer Science (R0)