A SHAP-based controversy analysis through communities on Twitter

World Wide Web

Abstract

Controversy encompasses content that draws diverse perspectives, along with positive and negative feedback on a specific event, resulting in the formation of distinct user communities. We explore the explainability of controversy through the lens of the SHAP (SHapley Additive exPlanations) method, aiming to provide a fair assessment of the individual contributions of different text features of tweets to controversy detection. We conduct an analysis of topic discussions on Twitter from a community perspective, investigating the role of text in accurately classifying tweets into their respective communities. To achieve this, we introduce a SHAP-based pipeline designed to quantify the influence of impactful text features on the predictions of three tweet classifiers. Text content alone achieves noteworthy controversy detection accuracy and can contain predictive features for the task. For instance, negative connotations, pejorative tendencies and positive qualifying adjectives tend to influence the controversy detection model.
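To make the notion of a "fair assessment of individual contributions" concrete, the following minimal sketch computes exact Shapley values for a single tweet by enumerating feature coalitions. It is an illustration under stated assumptions, not the pipeline used in the article: the feature names, the weights, the linear `toy_score` function and the all-zero baseline are hypothetical, and the article's actual classifiers and SHAP configuration are not reproduced here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance, by enumerating feature coalitions."""
    names = list(instance)
    n = len(names)
    values = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                # Features outside the coalition take their baseline value.
                without = {
                    f: (instance[f] if f in coalition else baseline[f])
                    for f in names
                }
                with_target = dict(without)
                with_target[target] = instance[target]
                total += weight * (predict(with_target) - predict(without))
        values[target] = total
    return values


# Hypothetical text features and weights, chosen only for illustration.
WEIGHTS = {"negative_words": 0.6, "pejoratives": 0.3, "positive_adjectives": -0.2}


def toy_score(features):
    """Hypothetical linear score standing in for a tweet classifier's output."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


tweet = {"negative_words": 2.0, "pejoratives": 1.0, "positive_adjectives": 1.0}
baseline = {name: 0.0 for name in tweet}  # assumed reference: feature absent

phi = shapley_values(toy_score, tweet, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")
```

The contributions sum to the difference between the score of the tweet and the score of the baseline, which is the additivity property that SHAP relies on. Exact enumeration grows exponentially with the number of features, so practical SHAP implementations rely on approximations.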

Data Availability

No datasets were generated or analysed during the current study.

Notes

  1. https://www.liwc.app/

  2. https://pypi.org/project/deep-translator/

  3. A “perfect” feature would show two well-separated clusters of colors, far away from the decision boundary.

Funding

This work was supported by grants from the Janssen Horizon endowment fund.

Author information

Contributions

Thomas Papastergiou worked on the statistical part and wrote sections 3.4.1 and 5.4.1. Samy Benslimane worked on and wrote the remainder of the manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Samy Benslimane.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical Approval

This declaration is “not applicable”.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: MEDES-IDEAS 2023

Guest Editors: Joe Tekli, Djamal Benslimane, Richard Chbeir, Yannis Manolopoulos and Ngoc-Thanh Nguyen

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Benslimane, S., Papastergiou, T., Azé, J. et al. A SHAP-based controversy analysis through communities on Twitter. World Wide Web 27, 65 (2024). https://doi.org/10.1007/s11280-024-01278-z

  • DOI: https://doi.org/10.1007/s11280-024-01278-z
