VMAS: A Vectorized Multi-agent Simulator for Collective Robot Learning

  • Conference paper
Distributed Autonomous Robotic Systems (DARS 2022)

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 28)

Abstract

While many multi-robot coordination problems can be solved optimally by exact algorithms, solutions often do not scale with the number of robots. Multi-Agent Reinforcement Learning (MARL) is gaining increasing attention in the robotics community as a promising solution to such problems. Nevertheless, we still lack tools that allow us to quickly and efficiently find solutions to large-scale collective learning tasks. In this work, we introduce the Vectorized Multi-Agent Simulator (VMAS). VMAS is an open-source framework designed for efficient MARL benchmarking. It comprises a vectorized 2D physics engine written in PyTorch and a set of twelve challenging multi-robot scenarios. Additional scenarios can be implemented through a simple and modular interface. We demonstrate how vectorization enables parallel simulation on accelerated hardware without added complexity. When comparing VMAS to OpenAI MPE, we show that MPE's execution time increases linearly in the number of simulations, while VMAS can execute 30,000 parallel simulations in under 10 s, more than 100× faster. Using VMAS's RLlib interface, we benchmark our multi-robot scenarios with various Proximal Policy Optimization (PPO)-based MARL algorithms. VMAS's scenarios prove challenging in orthogonal ways for state-of-the-art MARL algorithms. The VMAS framework is available at https://github.com/proroklab/VectorizedMultiAgentSimulator. A video of VMAS scenarios and experiments is available at https://youtu.be/aaDRYfiesAY
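
As a quick illustration of the vectorized interface, the sketch below creates a batch of parallel simulations and steps them all with one batched call. It assumes the make_env entry point from the VMAS repository and 2-D continuous actions per agent (the library default); exact names and arguments may differ across versions.

    # Minimal sketch of batched simulation with VMAS (assumed API; see the
    # repository for the current interface).
    import torch
    import vmas

    device = "cuda" if torch.cuda.is_available() else "cpu"
    num_envs = 32  # number of simulations run in parallel

    env = vmas.make_env(
        scenario="transport",    # one of the bundled multi-robot scenarios
        num_envs=num_envs,
        device=device,           # physics executes as batched PyTorch ops
        continuous_actions=True,
    )

    obs = env.reset()  # per agent: an observation tensor of shape (num_envs, obs_dim)
    for _ in range(100):
        # Per agent: an action tensor of shape (num_envs, 2), uniform in [-1, 1].
        actions = [2 * torch.rand(num_envs, 2, device=device) - 1 for _ in env.agents]
        obs, rews, dones, info = env.step(actions)  # all simulations advance at once

Stepping 32 or 30,000 environments is the same call; only the batch dimension grows, which is what makes execution on accelerated hardware pay off.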

Notes

  1. Here we illustrate an on-policy training iteration, but simulation is a key component of any type of MARL algorithm; a sketch of such an iteration follows these notes.

  2. The episode reward mean is the mean of the total rewards of the episodes contained in the training iteration.
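
To make the first note concrete, the following sketch outlines one on-policy training iteration over vectorized environments and computes the episode reward mean defined in the second note. The names env, policy, and update and their signatures are hypothetical placeholders, not the VMAS or RLlib API; rewards and done flags are assumed to be batched over the parallel environments.

    # Hypothetical sketch: one on-policy training iteration (note 1) that also
    # computes the episode reward mean (note 2). `env`, `policy`, and `update`
    # are illustrative placeholders, not the VMAS or RLlib API.
    import torch

    def training_iteration(env, policy, update, num_envs, rollout_len=200):
        obs = env.reset()
        running = torch.zeros(num_envs)  # reward accumulated by each live episode
        finished = []                    # total rewards of episodes that ended
        rollout = []                     # on-policy transitions for the update
        for _ in range(rollout_len):
            actions = policy(obs)                         # one batched forward pass
            next_obs, rews, dones, _ = env.step(actions)  # all envs step together
            rollout.append((obs, actions, rews, dones))
            running += rews
            finished.extend(running[dones].tolist())      # episodes ending this step
            running[dones] = 0.0                          # reset their accumulators
            obs = next_obs
        update(rollout)  # a single policy update on the freshly collected data
        # Episode reward mean: mean of the total rewards of completed episodes.
        return sum(finished) / max(len(finished), 1)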

Acknowledgements

This work was supported by ARL DCIST CRA W911NF-17-2-0181 and European Research Council (ERC) Project 949940 (gAIa). R. Kortvelesy was supported by Nokia Bell Labs through their donation for the Centre of Mobile, Wearable Systems and Augmented Intelligence to the University of Cambridge. J. Blumenkamp acknowledges the support of the ‘Studienstiftung des deutschen Volkes’ and an EPSRC tuition fee grant.

Author information

Corresponding author

Correspondence to Matteo Bettini.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Bettini, M., Kortvelesy, R., Blumenkamp, J., Prorok, A. (2024). VMAS: A Vectorized Multi-agent Simulator for Collective Robot Learning. In: Bourgeois, J., et al. Distributed Autonomous Robotic Systems. DARS 2022. Springer Proceedings in Advanced Robotics, vol 28. Springer, Cham. https://doi.org/10.1007/978-3-031-51497-5_4
