VMAS: A Vectorized Multi-agent Simulator for Collective Robot Learning

  • Conference paper
Distributed Autonomous Robotic Systems (DARS 2022)

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 28)

Abstract

While many multi-robot coordination problems can be solved optimally by exact algorithms, solutions often do not scale with the number of robots. Multi-Agent Reinforcement Learning (MARL) is gaining increasing attention in the robotics community as a promising solution to such problems. Nevertheless, we still lack tools that allow us to quickly and efficiently find solutions to large-scale collective learning tasks. In this work, we introduce the Vectorized Multi-Agent Simulator (VMAS). VMAS is an open-source framework designed for efficient MARL benchmarking. It comprises a vectorized 2D physics engine written in PyTorch and a set of twelve challenging multi-robot scenarios. Additional scenarios can be implemented through a simple and modular interface. We demonstrate how vectorization enables parallel simulation on accelerated hardware without added complexity. When comparing VMAS to OpenAI MPE, we show that MPE's execution time increases linearly in the number of simulations, while VMAS can execute 30,000 parallel simulations in under 10 s, more than 100× faster. Using VMAS's RLlib interface, we benchmark our multi-robot scenarios with various Proximal Policy Optimization (PPO)-based MARL algorithms. VMAS's scenarios prove challenging in orthogonal ways for state-of-the-art MARL algorithms. The VMAS framework is available at https://github.com/proroklab/VectorizedMultiAgentSimulator. A video of VMAS scenarios and experiments is available at https://youtu.be/aaDRYfiesAY
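
As a quick illustration of the vectorized interface, the sketch below creates a batch of parallel simulations and steps them all with one batched call. It assumes the make_env entry point from the VMAS repository and 2-D continuous actions per agent (the library default); exact names and arguments may differ across versions.

    # Minimal sketch of batched simulation with VMAS (assumed API; see the
    # repository for the current interface).
    import torch
    import vmas

    device = "cuda" if torch.cuda.is_available() else "cpu"
    num_envs = 32  # number of simulations run in parallel

    env = vmas.make_env(
        scenario="transport",    # one of the bundled multi-robot scenarios
        num_envs=num_envs,
        device=device,           # physics executes as batched PyTorch ops
        continuous_actions=True,
    )

    obs = env.reset()  # per agent: an observation tensor of shape (num_envs, obs_dim)
    for _ in range(100):
        # Per agent: an action tensor of shape (num_envs, 2), uniform in [-1, 1].
        actions = [2 * torch.rand(num_envs, 2, device=device) - 1 for _ in env.agents]
        obs, rews, dones, info = env.step(actions)  # all simulations advance at once

Stepping 32 or 30,000 environments is the same call; only the batch dimension grows, which is what makes execution on accelerated hardware pay off.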

Notes

  1. Here we illustrate an on-policy training iteration, but simulation is a key component of any type of MARL algorithm; a sketch of such an iteration follows these notes.

  2. The episode reward mean is the mean of the total rewards of the episodes contained in the training iteration.
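
To make the first note concrete, the following sketch outlines one on-policy training iteration over vectorized environments and computes the episode reward mean defined in the second note. The names env, policy, and update and their signatures are hypothetical placeholders, not the VMAS or RLlib API; rewards and done flags are assumed to be batched over the parallel environments.

    # Hypothetical sketch: one on-policy training iteration (note 1) that also
    # computes the episode reward mean (note 2). `env`, `policy`, and `update`
    # are illustrative placeholders, not the VMAS or RLlib API.
    import torch

    def training_iteration(env, policy, update, num_envs, rollout_len=200):
        obs = env.reset()
        running = torch.zeros(num_envs)  # reward accumulated by each live episode
        finished = []                    # total rewards of episodes that ended
        rollout = []                     # on-policy transitions for the update
        for _ in range(rollout_len):
            actions = policy(obs)                         # one batched forward pass
            next_obs, rews, dones, _ = env.step(actions)  # all envs step together
            rollout.append((obs, actions, rews, dones))
            running += rews
            finished.extend(running[dones].tolist())      # episodes ending this step
            running[dones] = 0.0                          # reset their accumulators
            obs = next_obs
        update(rollout)  # a single policy update on the freshly collected data
        # Episode reward mean: mean of the total rewards of completed episodes.
        return sum(finished) / max(len(finished), 1)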

Acknowledgements

This work was supported by ARL DCIST CRA W911NF-17-2-0181 and European Research Council (ERC) Project 949940 (gAIa). R. Kortvelesy was supported by Nokia Bell Labs through their donation for the Centre of Mobile, Wearable Systems and Augmented Intelligence to the University of Cambridge. J. Blumenkamp acknowledges the support of the ‘Studienstiftung des deutschen Volkes’ and an EPSRC tuition fee grant.

Author information

Corresponding author

Correspondence to Matteo Bettini.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Bettini, M., Kortvelesy, R., Blumenkamp, J., Prorok, A. (2024). VMAS: A Vectorized Multi-agent Simulator for Collective Robot Learning. In: Bourgeois, J., et al. Distributed Autonomous Robotic Systems. DARS 2022. Springer Proceedings in Advanced Robotics, vol 28. Springer, Cham. https://doi.org/10.1007/978-3-031-51497-5_4
