Abstract
While many multi-robot coordination problems can be solved optimally by exact algorithms, solutions are often not scalable in the number of robots. Multi-Agent Reinforcement Learning (MARL) is gaining increasing attention in the robotics community as a promising solution to tackle such problems. Nevertheless, we still lack the tools that allow us to quickly and efficiently find solutions to large-scale collective learning tasks. In this work, we introduce the Vectorized Multi-Agent Simulator (VMAS). VMAS is an open-source framework designed for efficient MARL benchmarking. It comprises a vectorized 2D physics engine written in PyTorch and a set of twelve challenging multi-robot scenarios. Additional scenarios can be implemented through a simple and modular interface. We demonstrate how vectorization enables parallel simulation on accelerated hardware without added complexity. When comparing VMAS to OpenAI MPE, we show how MPE’s execution time increases linearly in the number of simulations, while VMAS is able to execute 30,000 parallel simulations in under 10 s, proving more than 100\(\times \) faster. Using VMAS’s RLlib interface, we benchmark our multi-robot scenarios using various Proximal Policy Optimization (PPO)-based MARL algorithms. VMAS’s scenarios prove challenging in orthogonal ways for state-of-the-art MARL algorithms. The VMAS framework is available at: https://github.com/proroklab/VectorizedMultiAgentSimulator. A video of VMAS scenarios and experiments is available at: https://youtu.be/aaDRYfiesAY
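The core idea behind the speedup claimed above is that a vectorized engine represents the state of many simulations as batched tensors, so a single tensor operation advances every simulation at once. The following is a minimal sketch of that principle (not VMAS’s actual engine code): a batched semi-implicit Euler integration step over 30,000 parallel 2D worlds, where the leading tensor dimension indexes the simulation.

```python
import torch

def step_worlds(pos, vel, force, mass, dt=0.1):
    """Semi-implicit Euler step for B parallel worlds with A agents each.

    pos, vel, force have shape [B, A, 2]; mass has shape [B, A, 1].
    One batched tensor expression advances all B simulations at once,
    with no Python loop over worlds or agents.
    """
    vel = vel + (force / mass) * dt  # integrate acceleration
    pos = pos + vel * dt             # integrate velocity
    return pos, vel

# 30,000 parallel worlds, 4 agents each (illustrative sizes).
B, A = 30_000, 4
pos = torch.zeros(B, A, 2)
vel = torch.zeros(B, A, 2)
force = torch.ones(B, A, 2)  # constant unit force on every agent
mass = torch.ones(B, A, 1)

pos, vel = step_worlds(pos, vel, force, mass)
```

Because the computation is expressed entirely in tensor operations, moving it to accelerated hardware is a matter of allocating the tensors on a GPU device; the per-step cost is then largely independent of the number of parallel simulations until the device saturates.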
Notes
1. Here we illustrate an on-policy training iteration, but simulation is a key component of any type of MARL algorithm.
2. The episode reward mean is the mean of the total rewards of the episodes contained in the training iteration.
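Concretely, the metric defined in note 2 sums each episode’s per-step rewards into an episode return and then averages those returns over the episodes collected in one training iteration. A small illustrative computation (with made-up reward values):

```python
import torch

# Per-step rewards for three episodes collected in one training
# iteration (hypothetical values; episodes may differ in length).
episode_rewards = [
    torch.tensor([1.0, 0.5, 2.0]),  # episode return = 3.5
    torch.tensor([0.0, 1.0]),       # episode return = 1.0
    torch.tensor([2.5]),            # episode return = 2.5
]

# Total reward (return) of each episode, then the mean across episodes.
episode_returns = torch.stack([r.sum() for r in episode_rewards])
episode_reward_mean = episode_returns.mean()  # (3.5 + 1.0 + 2.5) / 3
```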
Acknowledgements
This work was supported by ARL DCIST CRA W911NF-17-2-0181 and European Research Council (ERC) Project 949940 (gAIa). R. Kortvelesy was supported by Nokia Bell Labs through their donation for the Centre of Mobile, Wearable Systems and Augmented Intelligence to the University of Cambridge. J. Blumenkamp acknowledges the support of the ‘Studienstiftung des deutschen Volkes’ and an EPSRC tuition fee grant.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Bettini, M., Kortvelesy, R., Blumenkamp, J., Prorok, A. (2024). VMAS: A Vectorized Multi-agent Simulator for Collective Robot Learning. In: Bourgeois, J., et al. Distributed Autonomous Robotic Systems. DARS 2022. Springer Proceedings in Advanced Robotics, vol 28. Springer, Cham. https://doi.org/10.1007/978-3-031-51497-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51496-8
Online ISBN: 978-3-031-51497-5
eBook Packages: Intelligent Technologies and Robotics (R0)