Abstract
Collaborative privacy preserving planning (CPPP) has gained much attention in the past decade. CPPP aims to create solutions for multi-agent planning problems in which cooperation is required to achieve an efficient solution, without exposing information that an agent considers private in the process. To date, CPPP has focused on domains with deterministic action effects. However, in real-world problems action effects are often non-deterministic, and actions can have multiple possible effects with varying probabilities. In this paper, we introduce stochastic CPPP (SCPPP), an extension of CPPP to domains with stochastic action effects. We show how SCPPP can be modeled as a Markov decision process (MDP) and how the value-iteration algorithm can be adapted to solve it. This adaptation requires extending value-iteration to support multiple agents and privacy. We then present two adaptations of the real-time dynamic programming (RTDP) algorithm, a popular algorithm for solving MDPs, designed to solve SCPPP problems. The first RTDP adaptation, called distributed RTDP (DRTDP), yields behavior identical to applying RTDP in a centralized manner to the joint problem. To preserve privacy, DRTDP uses a message-passing mechanism adopted from the MAFS algorithm. The second RTDP adaptation is an approximation of DRTDP called public synchronization RTDP (PS-RTDP). PS-RTDP differs from DRTDP mainly in its message-passing mechanism, sending significantly fewer messages than DRTDP. We experimented on domains adapted from the deterministic CPPP literature by adding different stochastic effects to different actions. The results show that PS-RTDP can reduce the number of messages compared to DRTDP by orders of magnitude, thus improving run-time, while producing policies with similar expected costs.
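To ground the algorithms named above, the following sketch shows standard single-agent RTDP on a small goal-oriented MDP. This is the centralized procedure whose behavior DRTDP reproduces in a distributed, privacy-preserving manner via MAFS-style message passing, and which PS-RTDP approximates with fewer messages. The MDP class, the rtdp function, and the toy two-state domain are assumptions made for this illustration only; they are not the paper's implementation.

```python
# A minimal sketch of standard single-agent RTDP (Barto et al., 1995) on a
# goal-oriented MDP -- the centralized baseline that DRTDP and PS-RTDP extend.
# The MDP class, rtdp function, and toy domain below are illustrative
# assumptions for this sketch, not the paper's implementation.
import random

class MDP:
    """A tiny goal-oriented MDP with unit action costs."""
    def __init__(self, transitions, goal):
        # transitions[state][action] = list of (probability, next_state)
        self.transitions = transitions
        self.goal = goal

def rtdp(mdp, start, trials=1000, horizon=100):
    """Run RTDP trials from `start`; return the learned value function V."""
    V = {}  # implicit zero heuristic (admissible for nonnegative costs)
    for _ in range(trials):
        s = start
        for _ in range(horizon):
            if s == mdp.goal:
                break
            # Greedy Bellman backup: Q(s,a) = cost(a) + sum_s' P(s'|s,a) * V(s')
            q = {a: 1.0 + sum(p * V.get(s2, 0.0) for p, s2 in outcomes)
                 for a, outcomes in mdp.transitions[s].items()}
            best = min(q, key=q.get)
            V[s] = q[best]
            # Sample the next state from the chosen action's outcome distribution
            r, acc = random.random(), 0.0
            for p, s2 in mdp.transitions[s][best]:
                acc += p
                if r <= acc:
                    s = s2
                    break
    return V

# Two-step chain where action 'go' succeeds with probability 0.8:
mdp = MDP({'s0': {'go': [(0.8, 's1'), (0.2, 's0')]},
           's1': {'go': [(0.8, 'g'), (0.2, 's1')]}}, goal='g')
print(rtdp(mdp, 's0')['s0'])  # converges to ~2.5 expected steps to the goal
```

Each RTDP trial simulates greedy execution from the start state and performs Bellman backups only on visited states. In the paper's setting, each agent holds only its own projection of such a model; DRTDP coordinates the trials across agents with MAFS-style messages, while PS-RTDP, as its name suggests, synchronizes on public information to cut the message count.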
References
Hefner, T., Stern, R., & Shani, G. (2020). Privacy preserving planning in stochastic environments. In J. C. Beck, O. Buffet, J. Hoffmann, E. Karpas, & S. Sohrabi, (Eds.). Proceedings of the Thirtieth International Conference on Automated Planning and Scheduling, Nancy, France, October 26–30 (pp. 258–262). AAAI Press. https://aaai.org/ojs/index.php/ICAPS/article/view/6669
Brafman, R. I., & Domshlak, C. (2008). From one to many: Planning for loosely coupled multi-agent systems. In ICAPS (pp. 28–35).
Bellman, R. (1966). Dynamic programming. Science, 153(3731), 34–37.
Mausam, & Kolobov, A. (2012). Planning with Markov decision processes: An AI perspective. Synthesis Lectures on Artificial Intelligence and Machine Learning, 6(1), 1–210.
Barto, A. G., Bradtke, S. J., & Singh, S. P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1–2), 81–138.
Nissim, R., & Brafman, R. I. (2014). Distributed heuristic forward search for multi-agent planning. Journal of Artificial Intelligence Research, 51, 293–332.
Štolba, M., Komenda, A., & Kovacs, D. L. (2015). Competition of distributed and multiagent planners (CoDMAP). In The International Planning Competition (WIPC-15) (p. 24).
Brafman, R. I., & Domshlak, C. (2013). On the complexity of planning for agent teams and its implications for single agent planning. Artificial Intelligence, 198, 52–71.
Brafman, R. I. (2015). A privacy preserving algorithm for multi-agent planning and search. In IJCAI (pp. 1530–1536).
Tožička, J., Štolba, M., & Komenda, A. (2017). The limits of strong privacy preserving multi-agent planning. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 27).
Maliah, S., Shani, G., & Stern, R. (2018). Action dependencies in privacy-preserving multi-agent planning. Autonomous Agents and Multi-Agent Systems, 32(6), 779–821.
Maliah, S., Shani, G., & Brafman, R. I. (2016a). Online macro generation for privacy preserving planning. In Twenty-Sixth International Conference on Automated Planning and Scheduling.
Wu, F., Zilberstein, S., & Chen, X. (2018). Privacy-preserving policy iteration for decentralized POMDPs. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), New Orleans, USA (pp. 4759–4766).
Štolba, M., Tožička, J., & Komenda, A. (2018). Quantifying privacy leakage in multi-agent planning. ACM Transactions on Internet Technology, 18(3), 28:1–28:21.
Štolba, M., Fišer, D., & Komenda, A. (2019). Privacy leakage of search-based multi-agent planning algorithms. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 29, pp. 482–490).
Guestrin, C., Koller, D., & Parr, R. (2001). Max-norm projections for factored MDPs. In IJCAI (Vol. 17, pp. 673–680).
Bellman, R. (1957). A Markovian decision process. Journal of Mathematics and Mechanics, 6, 679–684.
Puterman, M. L. (2014). Markov decision processes: Discrete stochastic dynamic programming. Wiley.
Trevizan, F. W., Teichteil-Königsbuch, F., & Thiébaux, S. (2017). Efficient solutions for stochastic shortest path problems with dead ends. In UAI.
Du, W., & Zhan, Z. (2002). A practical approach to solve secure multi-party computation problems. In Proceedings of the 2002 Workshop on New Security Paradigms (pp. 127–135).
Bertsekas, D. (1982). Distributed dynamic programming. IEEE Transactions on Automatic Control, 27(3), 610–616.
Witzig, J., Beckenbach, I., Eifler, L., Fackeldey, K., Gleixner, A., Grever, A., & Weber, M. (2018). Mixed-integer programming for cycle detection in nonreversible Markov processes. Multiscale Modeling & Simulation, 16(1), 248–265.
Ameloot, T. J., & Van den Bussche, J. (2015). On the convergence of cycle detection for navigational reinforcement learning. arXiv:1511.08724
Papadimitriou, C. H., & Tsitsiklis, J. N. (1987). The complexity of Markov decision processes. Mathematics of Operations Research, 12(3), 441–450.
Bertsekas, D. P., & Tsitsiklis, J. N. (1991). An analysis of stochastic shortest path problems. Mathematics of Operations Research, 16(3), 580–595.
Littman, M. L., Dean, T. L., & Kaelbling, L. P. (2013). On the complexity of solving Markov decision problems. arXiv:1302.4971
Štolba, M., & Komenda, A. (2017). The MADLA planner: Multi-agent planning by combination of distributed and local heuristic search. Artificial Intelligence, 252, 175–210.
Maliah, S., Shani, G., & Stern, R. (2016b). Stronger privacy preserving projections for multi-agent planning. In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS) (Vol. 26, pp. 221–229).
Maliah, S., Shani, G., & Stern, R. (2014). Privacy preserving landmark detection. In the European Conference on Artificial Intelligence (ECAI) (pp. 597–602).
Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T. L., & Boutilier, C. (2013). Hierarchical solution of Markov decision processes using macro-actions. arXiv:1301.7381
Gerevini, A. E., Lipovetzky, N., Percassi, F., Saetti, A., & Serina, I. (2019). Best-first width search for multi agent privacy-preserving planning. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 29, pp. 163–171).
Gohari, P., Hale, M., & Topcu, U. (2020). Privacy-preserving policy synthesis in Markov decision processes. In 2020 59th IEEE Conference on Decision and Control (CDC) (pp. 6266–6271). IEEE.
Venkitasubramaniam, P. (2013). Privacy in stochastic control: A Markov decision process perspective. In 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton) (pp. 381–388). IEEE.
Boutilier, C. (1999). Sequential optimality and coordination in multiagent systems. In IJCAI (Vol. 99, pp. 478–485).
Amato, C., Chowdhary, G., Geramifard, A., Kemal Üre, N., & Kochenderfer, M. J. (2013). Decentralized control of partially observable Markov decision processes. In 52nd IEEE Conference on Decision and Control (pp. 2398–2405). IEEE.
Kocsis, L., & Szepesvári, C. (2006). Bandit based Monte-Carlo planning. In European Conference on Machine Learning (pp. 282–293). Springer.
McMahan, H. B., Likhachev, M., & Gordon, G. J. (2005). Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees. In Proceedings of the 22nd International Conference on Machine Learning (pp. 569–576). ACM.
Smith, T., & Simmons, R. (2006). Focused real-time dynamic programming for MDPs: Squeezing more out of a heuristic. In AAAI (pp. 1227–1232).
Bonet, B., & Geffner, H. (2001). Planning as heuristic search. Artificial Intelligence, 129(1), 5–33.
Keller, T., & Eyerich, P. (2012). PROST: Probabilistic planning based on UCT. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 22).
Bonet, B., & Geffner, H. (2006). Learning depth-first search: A unified approach to heuristic search in deterministic and non-deterministic settings, and its application to MDPs. In ICAPS (Vol. 6, pp. 142–151).
Acknowledgements
This research was supported by ISF Grant #210/17 to Roni Stern and by ISF Grant #1210/18.
Parts of this paper appeared in [1].
Cite this article
Hefner, T., Shani, G. & Stern, R. Privacy preserving planning in multi-agent stochastic environments. Auton Agent Multi-Agent Syst 36, 22 (2022). https://doi.org/10.1007/s10458-022-09554-w