Safe Multi-agent Reinforcement Learning for Drone Routing Problems

Kaji, Masahiro; Lin, Donghui; Uwano, Fumito

doi:10.1007/978-3-031-77367-9_25

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 15395)

Included in the following conference series:

International Conference on Principles and Practice of Multi-Agent Systems

Abstract

Drone Routing Problems (DRP) focus on finding optimal paths for autonomous drones in a graph-based environment, minimizing movement costs and avoiding collisions. DRP is modeled as a cooperative multi-agent problem, for which Multi-Agent Reinforcement Learning (MARL) offers a promising solution. However, MARL struggles with collision avoidance through trial and error and cannot guarantee collision-free operations. This paper proposes a safety control method for MARL, modifying unsafe actions by stopping those with high collision risks and allowing agents to yield routes. We implement Safe QMIX by integrating a safety control mechanism into QMIX and demonstrate its effectiveness through experimental evaluation, achieving lower collision rates and improved pathfinding efficiency.
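
To make the idea of modifying unsafe actions concrete, the following is a minimal sketch of a safety filter placed between the learned policy and the environment. It is not the authors' Safe QMIX implementation: the function name `filter_unsafe_actions`, the discrete node-to-node movement model, the fixed name-order priority, and the rule that a yielding agent simply waits at its current node are all assumptions made for illustration.

```python
from typing import Dict, Hashable

Node = Hashable


def filter_unsafe_actions(
    positions: Dict[str, Node], proposed: Dict[str, Node]
) -> Dict[str, Node]:
    """Replace conflicting moves with 'wait' until the joint move is safe.

    Assumptions (hypothetical, for illustration only):
    - each agent occupies one node and proposes one target node per step;
    - agents start on distinct nodes;
    - priority is the sorted order of agent names (earlier name wins);
    - an agent that yields stays at its current node.
    """
    safe = dict(proposed)
    agents = sorted(safe)
    changed = True
    while changed:  # repeat because stopping one agent can block another
        changed = False
        for i, a in enumerate(agents):
            for b in agents[i + 1:]:
                # Two agents end the step on the same node.
                same_target = safe[a] == safe[b]
                # Two agents exchange nodes along the same edge.
                swap = safe[a] == positions[b] and safe[b] == positions[a]
                if not (same_target or swap):
                    continue
                # The lower-priority agent yields; if it is already
                # waiting, the higher-priority agent must stop instead.
                loser = b if safe[b] != positions[b] else a
                if safe[loser] != positions[loser]:
                    safe[loser] = positions[loser]
                    changed = True
    return safe


# Both agents want node 1; the lower-priority agent "b" yields and waits.
result = filter_unsafe_actions({"a": 0, "b": 2}, {"a": 1, "b": 1})
print(result)
```

In this sketch the loop terminates because an agent can only change from moving to waiting. A filter of this kind would run on the actions selected by the learned policy at every step, so collisions are prevented by construction rather than learned through penalties alone.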

Notes

1. Real-world drones also move in altitude. Because we focus on safety during the learning process, we use a 2D graph for simplicity.
2. The node locations and edge lengths are available at https://github.com/DrpChallenge/main/tree/main/drp_env/map.

Acknowledgements

This research was partially supported by a Grant-in-Aid for Scientific Research (B) (24K03001, 2024–2027) from the Japan Society for the Promotion of Science.

Author information

Authors and Affiliations

Graduate School of Environmental, Life, Natural Science and Technology, Okayama University, Okayama, Japan
Masahiro Kaji, Donghui Lin & Fumito Uwano

Corresponding author

Correspondence to Donghui Lin.

Editor information

Editors and Affiliations

Kyoto University, Kyoto, Japan
Ryuta Arisaka
Universitat Politècnica de València, Valencia, Sevilla, Spain
Victor Sanchez-Anguix
University of Southampton, Southampton, Hampshire, UK
Sebastian Stein
Özyeğin University, Istanbul, Türkiye
Reyhan Aydoğan
University of Luxembourg, Esch-sur-Alzette, Luxembourg
Leon van der Torre
Kyoto University, Kyoto, Kyoto, Japan
Takayuki Ito

About this paper

Cite this paper

Kaji, M., Lin, D., Uwano, F. (2025). Safe Multi-agent Reinforcement Learning for Drone Routing Problems. In: Arisaka, R., Sanchez-Anguix, V., Stein, S., Aydoğan, R., van der Torre, L., Ito, T. (eds) PRIMA 2024: Principles and Practice of Multi-Agent Systems. PRIMA 2024. Lecture Notes in Computer Science, vol 15395. Springer, Cham. https://doi.org/10.1007/978-3-031-77367-9_25

DOI: https://doi.org/10.1007/978-3-031-77367-9_25
Published: 15 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-77366-2
Online ISBN: 978-3-031-77367-9
eBook Packages: Computer Science, Computer Science (R0)
