Abstract
This paper explores the use of crowdsourcing to classify statement types in film reviews as a means of assessing their information quality. Employing the Argument Type Identification Procedure (ATIP), which uses the Periodic Table of Arguments to categorize arguments, the study connects statement types to overall argument strength and information reliability. Focusing on non-expert annotators in a crowdsourcing environment, the research assesses their reliability based on factors including language proficiency and annotation experience. The results underline the importance of careful annotator selection and training for achieving high inter-annotator agreement, and highlight the challenges of crowdsourcing statement classification for information quality assessment.
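The agreement analysis mentioned above can be made concrete with a short sketch. Below is a minimal, illustrative Python implementation of Fleiss' kappa, a standard chance-corrected measure of agreement among multiple annotators. The fleiss_kappa function, the category set (statements of fact, value, and policy, following the Periodic Table of Arguments), and the example counts are assumptions made for illustration and do not reproduce the paper's actual data or results.

from typing import Sequence

def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Compute Fleiss' kappa for multiple annotators.

    counts[i][j] is the number of annotators who assigned category j
    to statement i; every statement must be rated by the same number
    of annotators.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])

    # Per-item agreement: share of annotator pairs that agree on the item.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the marginal category proportions.
    totals = [sum(row[j] for row in counts) for j in range(n_cats)]
    p_cats = [t / (n_items * n_raters) for t in totals]
    p_e = sum(p * p for p in p_cats)

    return (p_bar - p_e) / (1 - p_e)

# Hypothetical example: 4 statements, 5 annotators each, and three
# statement types (fact, value, policy). Counts are invented.
ratings = [
    [5, 0, 0],
    [3, 2, 0],
    [0, 4, 1],
    [1, 1, 3],
]
print(f"Fleiss' kappa = {fleiss_kappa(ratings):.3f}")

Values near 1 indicate near-perfect agreement among annotators, while values near 0 indicate agreement no better than chance; this is the kind of statistic used to judge whether a crowdsourced annotator pool is reliable enough for statement classification.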
Acknowledgements
This research is supported by the Netherlands eScience Center project “The Eye of the Beholder” (project no. 027.020.G15), and it is part of the AI, Media & Democracy Lab (Dutch Research Council project number: NWA.1332.20.009). For more information about the lab and its activities, visit https://www.aim4dem.nl/.
This research is also supported by the European Union’s NextGenerationEU PNRR M4.C2.1.1 – PRIN 2022 project “20227F2ZN3 MoT–The Measure of Truth: An Evaluation-Centered Machine-Human Hybrid Framework for Assessing Information Truthfulness” (20227F2ZN3_001, CUP G53D23002800006), and by the Strategic Plan of the University of Udine–Interdepartmental Project on Artificial Intelligence (2020–2025).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Singh, J., Soprano, M., Roitero, K., Ceolin, D. (2024). Crowdsourcing Statement Classification to Enhance Information Quality Prediction. In: Preuss, M., Leszkiewicz, A., Boucher, JC., Fridman, O., Stampe, L. (eds) Disinformation in Open Online Media. MISDOOM 2024. Lecture Notes in Computer Science, vol 15175. Springer, Cham. https://doi.org/10.1007/978-3-031-71210-4_5
DOI: https://doi.org/10.1007/978-3-031-71210-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71209-8
Online ISBN: 978-3-031-71210-4
eBook Packages: Computer Science (R0)