Solving efficiently Decentralized MDPs with temporal and resource constraints | Autonomous Agents and Multi-Agent Systems


Abstract

Optimizing the operation of cooperative multi-agent systems that can deal with large and realistic problems has become an important focal area of research in the multi-agent community. In this paper, we first present a new model, the OC-DEC-MDP (Opportunity Cost Decentralized Markov Decision Process), that allows us to represent large multi-agent decision problems with temporal and precedence constraints. Then, we propose polynomial algorithms to efficiently solve problems formalized by OC-DEC-MDPs. The problems we deal with consist of a set of agents that have to execute a set of tasks in a cooperative way. The agents cannot communicate during task execution and they must respect resource and temporal constraints. Our approach is based on Decentralized Markov Decision Processes (DEC-MDPs) and uses the concept of opportunity cost borrowed from economics to obtain approximate control policies. Experimental results show that our approach produces good quality solutions for complex problems which are out of reach of existing approaches.
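The opportunity-cost idea sketched above — trading an agent's local expected reward against the cost its timing imposes on dependent agents — can be illustrated with a toy start-time decision. The sketch below is a minimal illustration under assumed names and numbers (`best_start_time`, `oc_per_step`, and the specific duration distribution are all hypothetical); it is not the paper's OC-DEC-MDP algorithm, only an instance of the general trade-off the abstract describes.

```python
def best_start_time(duration_probs, deadline, reward, oc_per_step):
    """Pick the start time that maximizes local expected reward minus an
    opportunity-cost penalty for finishing late.

    duration_probs: dict mapping task duration -> probability.
    deadline: latest time by which the task must finish to earn `reward`.
    oc_per_step: assumed per-time-step cost a late finish imposes on
                 successor agents (an illustrative stand-in for the
                 opportunity cost borrowed from economics).
    """
    best_start, best_value = None, float("-inf")
    for start in range(deadline):
        # Probability the task completes before its temporal deadline.
        p_success = sum(p for d, p in duration_probs.items()
                        if start + d <= deadline)
        # Expected finish time over the duration distribution.
        exp_finish = sum((start + d) * p for d, p in duration_probs.items())
        # Local expected reward, penalized by the cost delaying the
        # finish imposes on agents with precedence constraints on us.
        value = reward * p_success - oc_per_step * exp_finish
        if value > best_value:
            best_start, best_value = start, value
    return best_start, best_value

# Example: a task that takes 2 or 4 time units with equal probability,
# a deadline of 6, reward 10, and unit opportunity cost per step.
start, value = best_start_time({2: 0.5, 4: 0.5},
                               deadline=6, reward=10.0, oc_per_step=1.0)
```

Starting as early as possible wins here because every unit of expected delay both risks missing the deadline and, through the opportunity-cost term, erodes the value available to successor tasks.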



Author information

Corresponding author

Correspondence to Aurélie Beynier.


About this article

Cite this article

Beynier, A., & Mouaddib, A.-I. (2011). Solving efficiently Decentralized MDPs with temporal and resource constraints. Autonomous Agents and Multi-Agent Systems, 23, 486–539. https://doi.org/10.1007/s10458-010-9145-2
