Algorithms may not learn to play a unique Nash equilibrium | Journal of Computational Social Science Skip to main content
Log in

Algorithms may not learn to play a unique Nash equilibrium

  • Research Article
  • Published:
Journal of Computational Social Science Aims and scope Submit manuscript

Abstract

There is a widespread hope that, in the near future, algorithms become so sophisticated that “solutions” to most problems are found by machines. In this note, we throw some doubts on this expectation by showing the following impossibility result: given a set of finite-memory, finite-iteration algorithms, a continuum of games exist, whose unique and strict Nash equilibrium cannot be reached from a large set of initial states. A Nash equilibrium is a social solution to conflicts of interest, and hence finite algorithms should not be always relied upon for social problems. Our result also shows how to construct games to deceive a given set of algorithms to be trapped in a cycle without a Nash equilibrium.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Fig. 1

Similar content being viewed by others

Notes

  1. It is well-known that Lemke–Howson algorithm [11] can find a Nash equilibrium of any finite game, although the time it takes can be exponential (Savani and Stengel [16]). Our motivation is completely different from this line of research.

  2. In order to always choose a pure-action best response, some tie-breaking rule must be added. This caveat applies to all behavior rules in the following, but our impossibility result is independent of the tie-breaking rules.

  3. The standard fictitious play rule uses the entire history to compute the “observed frequency” and thus requires unlimited memory. However, we can modify the definition of the “observed frequency” to allow bounded memory.

  4. The standard model of the level-k theory is for a single population model with a symmetric component game.

  5. Since we allow learning of how to choose a behavior rule, it is not a simple “learning process”.

  6. For an illustration of a rule-learning model, see Fig. 1 of Stahl [18].

  7. We can allow an S1-player to choose a mixed-action best response \(\sigma _i \in BR_i(a_j)\) for some \(a_j \in A_j \mid _h\). This, however, complicates the analysis without affecting our impossibility result.

References

  1. Basu, K., & Weibull, J. W. (1991). Strategy subsets closed under rational behavior. Economics Letters, 36(2), 141–146.

    Article  Google Scholar 

  2. Brown, G. W. (1951). Iterative solutions of games by fictitious play. In T. C, Koopmans (Ed.), Activity analysis of production and allocation (pp. 374–376). New York: Wiley.

  3. Foster, D., & Young, P. (2003). Learning, hypothesis testing, and Nash equilibrium. Games and Economic Behavior, 45(1), 73–96.

    Article  Google Scholar 

  4. Fudenberg, D., & Kreps, D. (1993). Learning mixed equilibria. Games and Economic Behavior, 5(3), 320–367.

    Article  Google Scholar 

  5. Fudenberg, D., & Takahashi, S. (2011). Heterogeneous beliefs and local information in stochastic fictitious play. Games and Economic Behavior, 71(1), 100–120.

    Article  Google Scholar 

  6. Gilli, M. (2001). A general approach to rational learning in games. Bulletin of Economic Research, 53(4), 275–303.

    Article  Google Scholar 

  7. Hurkens, S. (1995). Learning by forgetful players. Games and Economic Behavior, 11(2), 304–329.

    Article  Google Scholar 

  8. Kandori, M., Mailath, G., & Rob, R. (1993). Learning, mutation, and long run equilibria in games. Econometrica, 61(1), 29–56.

    Article  Google Scholar 

  9. Kneeland, T. (2015). Identifying higher-order rationality. Econometrica, 83(5), 2065–2079.

    Article  Google Scholar 

  10. Koh, S., Zhou, B., Fang, H., Yang, P., Yang, A., Yang, Q., et al. (2020). Real-time deep reinforcement learning based vehicle nagivation. Applied Soft Computing, 96, 106694. https://doi.org/10.1016/j.asoc.2020.106694.

    Article  Google Scholar 

  11. Lemke, C., & Howson, J. (1964). Equilibrium points of bimatrix games. Journal of the Society for Industrial and Applied Mathematics, 12(2), 413–423.

    Article  Google Scholar 

  12. Matros, A. (2003). Clever agents in adaptive learning. Journal of Economic Theory, 111(1), 110–124.

    Article  Google Scholar 

  13. Milgrom, P., & Roberts, D. J. (1991). Adaptive and sophisticated learning in normal form games. Games and Economic Behavior, 3(1), 82–100.

    Article  Google Scholar 

  14. Mohlin, E. (2012). Evolution of theories of mind. Games and Economic Behavior, 75(1), 299–318.

    Article  Google Scholar 

  15. Nagel, R. (1995). Unraveling in guessing games: An experimental study. American Economic Review, 85(5), 1313–1326.

    Google Scholar 

  16. Savani, R., & von Stengel, B. (2006). Hard-to-solve bimatrix games. Econometrica, 74(2), 397–429.

    Article  Google Scholar 

  17. Selten, R. (1991). Anticipatory learning in 2 person games. In R. Selten (Ed.), Game equilibrium models I (pp. 98–154). Berlin: Springer Verlag.

    Chapter  Google Scholar 

  18. Stahl, D. (1996). Boundedly rational rule learning in a guessing game. Games and Economic Behavior, 16(2), 303–330.

    Article  Google Scholar 

  19. Stahl, D. (2000). Rule learning in symmetric normal-form games: Theory and evidence. Games and Economic Behavior, 32(1), 105–138.

    Article  Google Scholar 

  20. Young, H. P. (1993). The evolution of conventions. Econometrica, 61(1), 57–84.

    Article  Google Scholar 

Download references

Acknowledgements

The authors are grateful to an anonymous referee for useful comments.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Takako Fujiwara-Greve.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Fujiwara-Greve, T., Nielsen, C.K. Algorithms may not learn to play a unique Nash equilibrium. J Comput Soc Sc 4, 839–850 (2021). https://doi.org/10.1007/s42001-021-00109-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s42001-021-00109-9

Keywords

Navigation