Algorithms may not learn to play a unique Nash equilibrium

Fujiwara-Greve, Takako; Nielsen, Carsten Krabbe

doi:10.1007/s42001-021-00109-9

Algorithms may not learn to play a unique Nash equilibrium

Research Article
Published: 14 March 2021

Volume 4, pages 839–850, (2021)
Cite this article

Journal of Computational Social Science Aims and scope Submit manuscript

281 Accesses
Explore all metrics

Abstract

There is a widespread hope that, in the near future, algorithms become so sophisticated that “solutions” to most problems are found by machines. In this note, we throw some doubts on this expectation by showing the following impossibility result: given a set of finite-memory, finite-iteration algorithms, a continuum of games exist, whose unique and strict Nash equilibrium cannot be reached from a large set of initial states. A Nash equilibrium is a social solution to conflicts of interest, and hence finite algorithms should not be always relied upon for social problems. Our result also shows how to construct games to deceive a given set of algorithms to be trapped in a cycle without a Nash equilibrium.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

¥17,985 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Institutional subscriptions

A unified stochastic approximation framework for learning in games

Article 04 August 2023

Subgame perfection in recursive perfect information games

Article Open access 01 April 2020

Dynamic Noncooperative Games

Notes

It is well-known that Lemke–Howson algorithm [11] can find a Nash equilibrium of any finite game, although the time it takes can be exponential (Savani and Stengel [16]). Our motivation is completely different from this line of research.
In order to always choose a pure-action best response, some tie-breaking rule must be added. This caveat applies to all behavior rules in the following, but our impossibility result is independent of the tie-breaking rules.
The standard fictitious play rule uses the entire history to compute the “observed frequency” and thus requires unlimited memory. However, we can modify the definition of the “observed frequency” to allow bounded memory.
The standard model of the level-k theory is for a single population model with a symmetric component game.
Since we allow learning of how to choose a behavior rule, it is not a simple “learning process”.
For an illustration of a rule-learning model, see Fig. 1 of Stahl [18].
We can allow an S1-player to choose a mixed-action best response \(\sigma _i \in BR_i(a_j)\) for some \(a_j \in A_j \mid _h\). This, however, complicates the analysis without affecting our impossibility result.

References

Basu, K., & Weibull, J. W. (1991). Strategy subsets closed under rational behavior. Economics Letters, 36(2), 141–146.
Article Google Scholar
Brown, G. W. (1951). Iterative solutions of games by fictitious play. In T. C, Koopmans (Ed.), Activity analysis of production and allocation (pp. 374–376). New York: Wiley.
Foster, D., & Young, P. (2003). Learning, hypothesis testing, and Nash equilibrium. Games and Economic Behavior, 45(1), 73–96.
Article Google Scholar
Fudenberg, D., & Kreps, D. (1993). Learning mixed equilibria. Games and Economic Behavior, 5(3), 320–367.
Article Google Scholar
Fudenberg, D., & Takahashi, S. (2011). Heterogeneous beliefs and local information in stochastic fictitious play. Games and Economic Behavior, 71(1), 100–120.
Article Google Scholar
Gilli, M. (2001). A general approach to rational learning in games. Bulletin of Economic Research, 53(4), 275–303.
Article Google Scholar
Hurkens, S. (1995). Learning by forgetful players. Games and Economic Behavior, 11(2), 304–329.
Article Google Scholar
Kandori, M., Mailath, G., & Rob, R. (1993). Learning, mutation, and long run equilibria in games. Econometrica, 61(1), 29–56.
Article Google Scholar
Kneeland, T. (2015). Identifying higher-order rationality. Econometrica, 83(5), 2065–2079.
Article Google Scholar
Koh, S., Zhou, B., Fang, H., Yang, P., Yang, A., Yang, Q., et al. (2020). Real-time deep reinforcement learning based vehicle nagivation. Applied Soft Computing, 96, 106694. https://doi.org/10.1016/j.asoc.2020.106694.
Article Google Scholar
Lemke, C., & Howson, J. (1964). Equilibrium points of bimatrix games. Journal of the Society for Industrial and Applied Mathematics, 12(2), 413–423.
Article Google Scholar
Matros, A. (2003). Clever agents in adaptive learning. Journal of Economic Theory, 111(1), 110–124.
Article Google Scholar
Milgrom, P., & Roberts, D. J. (1991). Adaptive and sophisticated learning in normal form games. Games and Economic Behavior, 3(1), 82–100.
Article Google Scholar
Mohlin, E. (2012). Evolution of theories of mind. Games and Economic Behavior, 75(1), 299–318.
Article Google Scholar
Nagel, R. (1995). Unraveling in guessing games: An experimental study. American Economic Review, 85(5), 1313–1326.
Google Scholar
Savani, R., & von Stengel, B. (2006). Hard-to-solve bimatrix games. Econometrica, 74(2), 397–429.
Article Google Scholar
Selten, R. (1991). Anticipatory learning in 2 person games. In R. Selten (Ed.), Game equilibrium models I (pp. 98–154). Berlin: Springer Verlag.
Chapter Google Scholar
Stahl, D. (1996). Boundedly rational rule learning in a guessing game. Games and Economic Behavior, 16(2), 303–330.
Article Google Scholar
Stahl, D. (2000). Rule learning in symmetric normal-form games: Theory and evidence. Games and Economic Behavior, 32(1), 105–138.
Article Google Scholar
Young, H. P. (1993). The evolution of conventions. Econometrica, 61(1), 57–84.
Article Google Scholar

Download references

Acknowledgements

The authors are grateful to an anonymous referee for useful comments.

Author information

Authors and Affiliations

Department of Economics, Keio University, 2-15-45 Mita, Minato-ku, Tokyo, 108-8345, Japan
Takako Fujiwara-Greve
Dipartimento di Economia e Finanza, Catholic University of Milan, Via Necchi 5, 20123, Milan, Italy
Carsten Krabbe Nielsen

Authors

Takako Fujiwara-Greve
View author publications
You can also search for this author in PubMed Google Scholar
Carsten Krabbe Nielsen
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Takako Fujiwara-Greve.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Fujiwara-Greve, T., Nielsen, C.K. Algorithms may not learn to play a unique Nash equilibrium. J Comput Soc Sc 4, 839–850 (2021). https://doi.org/10.1007/s42001-021-00109-9

Download citation

Received: 25 March 2020
Accepted: 01 March 2021
Published: 14 March 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s42001-021-00109-9

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

¥17,985 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Institutional subscriptions

Algorithms may not learn to play a unique Nash equilibrium

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

A unified stochastic approximation framework for learning in games

Subgame perfection in recursive perfect information games

Dynamic Noncooperative Games

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Algorithms may not learn to play a unique Nash equilibrium

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

A unified stochastic approximation framework for learning in games

Subgame perfection in recursive perfect information games

Dynamic Noncooperative Games

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation