A SHAP-based controversy analysis through communities on Twitter

Benslimane, Samy; Papastergiou, Thomas; Azé, Jérôme; Bringay, Sandra; Servajean, Maximilien; Mollevi, Caroline

doi:10.1007/s11280-024-01278-z

Published: 14 September 2024

World Wide Web, Volume 27, article number 65 (2024)

Samy Benslimane¹, Thomas Papastergiou², Jérôme Azé¹, Sandra Bringay¹,³, Maximilien Servajean¹,³ & Caroline Mollevi⁴,⁵

Abstract

Controversy encompasses content that draws diverse perspectives, along with positive and negative feedback on a specific event, resulting in the formation of distinct user communities. We explore the explainability of controversy through the lens of the SHAP (SHapley Additive exPlanations) method, aiming to provide a fair assessment of the individual contributions of different text features of tweets to controversy detection. We conduct an analysis of topic discussions on Twitter from a community perspective, investigating the role of text in accurately classifying tweets into their respective communities. To achieve this, we introduce a SHAP-based pipeline designed to quantify the influence of impactful text features on the predictions of three tweet classifiers. Text content alone offers promising controversy detection accuracy and can contain features that are predictive of controversy. For instance, negative connotations, pejorative tendencies and positive qualifying adjectives tend to influence the predictions of the controversy detection model.
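
To make the idea of a "fair assessment of individual contributions" concrete, the following is a minimal sketch of exact Shapley value computation on a toy scoring function. It is not the pipeline described in the article: the feature names, the weights and the `toy_score` function are hypothetical and chosen only for illustration, and practical SHAP implementations approximate these values rather than enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value_fn):
    """Exact Shapley values for a set function value_fn(frozenset) -> float.

    Enumerates every subset of the other features, so it is only
    practical for a handful of features.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # size of the coalition that f joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


# Hypothetical text features and weights (assumptions, not from the article).
WEIGHTS = {"pejorative": 0.4, "negative": 0.3, "positive_adj": 0.1}


def toy_score(present):
    """Toy 'controversy score' for the set of features present in a tweet."""
    score = sum(WEIGHTS[f] for f in present)
    if {"pejorative", "negative"} <= present:
        score += 0.2  # interaction term shared by the two features
    return score


phi = shapley_values(list(WEIGHTS), toy_score)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this sketch the interaction term is split equally between the two features that produce it, and the attributions sum to the difference between the score with all features present and the score with none, which is the additivity property that motivates using Shapley values for attribution.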

Data Availability

No datasets were generated or analysed during the current study.

Notes

https://www.liwc.app/
https://pypi.org/project/deep-translator/
A “perfect” feature would show two well-separated clusters of colors, far away from the decision boundary.

Funding

This work was supported by grants from the Janssen Horizon endowment fund.

Author information

Authors and Affiliations

LIRMM, Univ Montpellier, CNRS, Montpellier, France
Samy Benslimane, Jérôme Azé, Sandra Bringay & Maximilien Servajean
LIPN UMR 7030, University of Sorbonne Paris Nord, 93430, Villetaneuse, France
Thomas Papastergiou
AMIS, Paul-Valéry University, Montpellier, France
Sandra Bringay & Maximilien Servajean
Institut du Cancer Montpellier (ICM), Montpellier, France
Caroline Mollevi
IDESP, UMR Inserm - Univ Montpellier, Montpellier, France
Caroline Mollevi

Contributions

Thomas Papastergiou worked on the statistical part and wrote sections 3.4.1 and 5.4.1. Samy Benslimane carried out the remaining work and wrote the rest of the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Samy Benslimane.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical Approval

This declaration is “not applicable”.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: MEDES-IDEAS 2023

Guest Editors: Joe Tekli, Djamal Benslimane, Richard Chbeir, Yannis Manolopoulos and Ngoc-Thanh Nguyen

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Benslimane, S., Papastergiou, T., Azé, J. et al. A SHAP-based controversy analysis through communities on Twitter. World Wide Web 27, 65 (2024). https://doi.org/10.1007/s11280-024-01278-z

Received: 29 January 2024
Revised: 01 May 2024
Accepted: 27 May 2024
Published: 14 September 2024
DOI: https://doi.org/10.1007/s11280-024-01278-z
