Q-Mixing Network for Multi-agent Pathfinding in Partially Observable Grid Environments | SpringerLink
Q-Mixing Network for Multi-agent Pathfinding in Partially Observable Grid Environments

  • Conference paper
  • First Online:
Artificial Intelligence (RCAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12948)


Abstract

In this paper, we consider the problem of multi-agent navigation in partially observable grid environments. This problem is challenging for centralized planning approaches, which typically rely on full knowledge of the environment. We therefore suggest a reinforcement learning approach in which the agents first learn policies that map observations to actions and then follow these policies to reach their goals. To tackle the challenge of learning cooperative behavior (in many cases, agents must yield to each other to accomplish a mission), we use a mixing Q-network that complements the learning of individual policies. In the experimental evaluation, we show that this approach leads to plausible results and scales well to a large number of agents.
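The mixing Q-network mentioned in the abstract follows the QMIX idea of Rashid et al.: a hypernetwork maps the global state to non-negative weights of a small mixing network, which combines the per-agent Q-values into a joint value Q_tot that is monotonic in each individual Q-value. The paper's implementation is not reproduced on this page; the snippet below is only a minimal NumPy sketch of that monotonic mixing step, with hypothetical dimensions and randomly initialised (untrained) hypernetwork parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, STATE_DIM, EMBED = 3, 8, 16  # hypothetical sizes

# Hypernetwork parameters: linear maps from the global state to the
# weights and biases of the mixing network (randomly initialised here).
W1_hyper = rng.normal(size=(STATE_DIM, N_AGENTS * EMBED))
b1_hyper = rng.normal(size=(STATE_DIM, EMBED))
W2_hyper = rng.normal(size=(STATE_DIM, EMBED))
b2_hyper = rng.normal(size=STATE_DIM)

def mix(q_values, state):
    """Combine per-agent Q-values into Q_tot, monotonic in each q_i."""
    # Taking absolute values enforces non-negative mixing weights,
    # which guarantees dQ_tot/dq_i >= 0 -- the QMIX monotonicity
    # constraint that lets each agent act greedily on its own Q.
    w1 = np.abs(state @ W1_hyper).reshape(N_AGENTS, EMBED)
    b1 = state @ b1_hyper
    w2 = np.abs(state @ W2_hyper)
    b2 = state @ b2_hyper
    hidden = np.maximum(q_values @ w1 + b1, 0.0)  # ReLU for simplicity
    return float(hidden @ w2 + b2)

q = np.array([1.0, 0.5, -0.2])     # individual agents' Q-values
s = rng.normal(size=STATE_DIM)     # global environment state
q_tot = mix(q, s)
```

Because the state-conditioned weights are non-negative, raising any single agent's Q-value can never lower Q_tot, so the argmax of the joint value decomposes into per-agent argmaxes at execution time.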




Author information

Correspondence to Aleksandr Panov.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Davydov, V., Skrynnik, A., Yakovlev, K., Panov, A. (2021). Q-Mixing Network for Multi-agent Pathfinding in Partially Observable Grid Environments. In: Kovalev, S.M., Kuznetsov, S.O., Panov, A.I. (eds) Artificial Intelligence. RCAI 2021. Lecture Notes in Computer Science, vol 12948. Springer, Cham. https://doi.org/10.1007/978-3-030-86855-0_12

  • DOI: https://doi.org/10.1007/978-3-030-86855-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86854-3

  • Online ISBN: 978-3-030-86855-0

  • eBook Packages: Computer Science, Computer Science (R0)
