Abstract
As part of the DARPA SocialSim challenge, we address the problem of predicting behavioral phenomena, including information spread, involving hundreds of thousands of users across three major linked social networks: Twitter, Reddit and GitHub. Our approach develops a framework for data-driven agent simulation that begins with a discrete-event simulation of the environment populated with generic, flexible agents, then optimizes the decision model of the agents by combining a number of machine learning classification problems. The ML problems predict when an agent will take a certain action in its world and are designed to combine aspects of the agents, gathered from historical data, with dynamic aspects of the environment, including the resources, such as tweets, that agents interact with at a given point in time. In this way, each agent makes individualized decisions based on its environment, neighbors and history during the simulation, although global simulation data is used to learn accurate generalizations. This approach showed the best performance of all participants in the DARPA challenge across a broad range of metrics. We describe the performance of models both with and without machine learning on measures of cross-platform information spread defined both at the level of the whole population and at the community level. The best performing model overall combines learned agent behaviors with explicit modeling of bursts in global activity. Because of the general nature of our approach, it is applicable to a range of prediction problems that require modeling individualized, situational agent behavior from trace data that combines many agents.
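The general shape of such a simulation can be illustrated with a minimal sketch. It is not the authors' implementation: the `Agent` class, the `past_activity` feature, the fixed logistic weights standing in for a trained classifier, and the exponential wake-up times are all assumptions made for illustration. It only shows how a discrete-event loop can let each agent decide whether to act from one agent-level feature and one dynamic environment feature.

```python
import heapq
import math
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    agent_id: int
    past_activity: float  # hypothetical feature derived from historical data, in [0, 1]
    actions: list = field(default_factory=list)


def act_probability(agent, resource_share, weights=(-2.0, 2.5, 1.5)):
    """Stand-in for a learned classifier: a logistic model over one agent
    feature and one environment feature. The weights are made up."""
    bias, w_agent, w_env = weights
    z = bias + w_agent * agent.past_activity + w_env * resource_share
    return 1.0 / (1.0 + math.exp(-z))


def simulate(agents, resources, horizon, seed=0):
    """Run a discrete-event simulation and return a log of
    (time, agent_id, resource) tuples for every action taken."""
    rng = random.Random(seed)
    popularity = {r: 0 for r in resources}  # dynamic state of the environment
    by_id = {a.agent_id: a for a in agents}
    # The event queue holds (wake_time, agent_id), ordered by time.
    queue = [(rng.uniform(0.0, 1.0), a.agent_id) for a in agents]
    heapq.heapify(queue)
    log = []
    while queue:
        time, agent_id = heapq.heappop(queue)
        if time > horizon:
            break  # all remaining events are later than the horizon
        agent = by_id[agent_id]
        resource = rng.choice(resources)
        total = sum(popularity.values()) or 1
        share = popularity[resource] / total
        # The agent decides individually, from its own history and the
        # current state of the resource it is looking at.
        if rng.random() < act_probability(agent, share):
            popularity[resource] += 1
            agent.actions.append((time, resource))
            log.append((time, agent_id, resource))
        # Schedule the agent's next wake-up (assumed exponential gap).
        heapq.heappush(queue, (time + rng.expovariate(1.0), agent_id))
    return log
```

Calling `simulate([Agent(i, i / 4) for i in range(5)], ["tweet_a", "repo_b"], horizon=10.0)` returns a time-ordered action log. In a data-driven setting, `act_probability` would be replaced by classifiers trained on historical traces.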
Notes
Tweet object: https://developer.twitter.com/en/docs/twitter-api/v1/data-dictionary/object-model/tweet.
Reddit API: https://www.reddit.com/dev/api/.
GitHub commit object: https://docs.github.com/en/rest/reference/commits.
GitHub user object: https://docs.github.com/en/rest/reference/users.
GitHub repository object: https://docs.github.com/en/rest/reference/repos.
Acknowledgements
The authors are grateful to the Defense Advanced Research Projects Agency (DARPA), contract W911NF-17-C-0094, for their support.
About this article
Cite this article
Murić, G., Tregubov, A., Blythe, J. et al. Large-scale agent-based simulations of online social networks. Auton Agent Multi-Agent Syst 36, 38 (2022). https://doi.org/10.1007/s10458-022-09565-7