Abstract
Controversy encompasses content that draws diverse perspectives, along with positive and negative feedback on a specific event, resulting in the formation of distinct user communities. We explore the explainability of controversy through the lens of the SHAP (SHapley Additive exPlanations) method, aiming to provide a fair assessment of the individual contributions of different text features of tweets to controversy detection. We analyze topic discussions on Twitter from a community perspective, investigating the role of text in accurately classifying tweets into their respective communities. To achieve this, we introduce a SHAP-based pipeline designed to quantify the influence of the most impactful text features on the predictions of three tweet classifiers. Our analysis indicates that text content alone yields promising controversy detection accuracy and contains features that are predictive of controversy. For instance, negative connotations, pejorative tendencies and positive qualifying adjectives tend to influence the predictions of the controversy detection model.
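To make the notion of a "fair assessment of individual contributions" concrete, the following minimal sketch computes exact Shapley values for a toy tweet scorer by enumerating every feature coalition. It is an illustration only, not the pipeline used in the article: the scorer `toy_score`, its weights, and the three feature names are hypothetical, and practical SHAP implementations approximate these values rather than enumerating all coalitions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance`, relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    The cost is exponential in the number of features, so this only
    suits toy examples.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Coalition members take the instance's value; the rest stay at baseline.
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi


def toy_score(f):
    # Hypothetical controversy score: made-up weights plus one interaction term.
    score = 0.1
    score += 0.5 * f["negative_connotation"]
    score += 0.3 * f["pejorative_term"]
    score -= 0.2 * f["positive_adjective"]
    score += 0.1 * f["negative_connotation"] * f["pejorative_term"]
    return score


# Hypothetical tweet with binary text features, compared to an "empty" baseline.
tweet = {"negative_connotation": 1, "pejorative_term": 1, "positive_adjective": 0}
baseline = {"negative_connotation": 0, "pejorative_term": 0, "positive_adjective": 0}

phi = shapley_values(toy_score, tweet, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

In this toy setting the contributions sum to the difference between the score of the tweet and the score of the baseline, and the interaction term is split equally between the two features involved. This additivity is the property that lets Shapley-based attributions be read as a fair division of a prediction among features.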
Data Availability
No datasets were generated or analysed during the current study.
Notes
A “perfect” feature would appear as two well-separated clusters of colors, far away from the decision boundary.
Funding
This work was supported by grants from the Janssen Horizon endowment fund.
Author information
Contributions
Thomas Papastergiou worked on the statistical part and wrote Sections 3.4.1 and 5.4.1. Samy Benslimane carried out the remaining work and wrote the rest of the manuscript text. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical Approval
This declaration is “not applicable”.
Additional information
This article belongs to the Topical Collection: MEDES-IDEAS 2023
Guest Editors: Joe Tekli, Djamal Benslimane, Richard Chbeir, Yannis Manolopoulos and Ngoc-Thanh Nguyen
About this article
Cite this article
Benslimane, S., Papastergiou, T., Azé, J. et al. A SHAP-based controversy analysis through communities on Twitter. World Wide Web 27, 65 (2024). https://doi.org/10.1007/s11280-024-01278-z
DOI: https://doi.org/10.1007/s11280-024-01278-z