We present an approach to extracting arguments from social media, exemplified by a case study on a large corpus of Twitter messages collected under the #Brexit hashtag during the run-up to the referendum in 2016. Our method is based on constructing dedicated corpus queries that capture predefined argumentation patterns following standard Walton-style argumentation schemes. Query matches are transformed directly into logical patterns, i.e., formulae with placeholders in a general form of modal logic. We prioritize precision over recall, exploiting the fact that the sheer size of the corpus still delivers substantial numbers of matches for all patterns; our eventual goal is to gain an overview of widely used arguments and argumentation schemes. We evaluate our approach in terms of recall on a manually annotated gold standard of 1000 randomly selected tweets for three selected high-frequency patterns. We also estimate precision by manual inspection of query matches in the entire corpus. Both evaluations are accompanied by an analysis of inter-annotator agreement between three independent judges.
About the authors
Natalie Dykes works in Stefan Evert’s Computational Corpus Linguistics group. She holds a B.A. in computational linguistics and Scandinavian studies and an M.A. in linguistics. Her research interests include corpus-based discourse analysis, argumentation, and computer-mediated communication.
Stefan Evert holds the Chair of Computational Corpus Linguistics at FAU Erlangen-Nürnberg. After studying mathematics, physics and English linguistics, he received a PhD degree in computational linguistics from the University of Stuttgart. His research interests include the statistical methodology of corpus linguistics, co-occurrence phenomena and software tools for processing large text corpora.
After receiving his B.Sc. in Computer Science and Media from the TH-Nürnberg in 2015, Merlin Göttlinger continued with an M.Sc. in Computer Science at FAU Erlangen-Nürnberg, which he completed in 2018. He then started working as a PhD student at the Chair of Theoretical Computer Science (INF8) at FAU Erlangen-Nürnberg, researching logical formalisms for argumentation.
Philipp Heinrich works in Stefan Evert’s Computational Corpus Linguistics group. He studied mathematics, linguistics, and philosophy; his research interests include corpus-based discourse analysis and argumentation mining, with a focus on the comparison of social and mass media.
Lutz Schröder holds the Chair of Theoretical Computer Science at FAU Erlangen-Nürnberg. He received a PhD in mathematics and subsequently a habilitation in computer science from the University of Bremen, and has held a senior researcher position at the German Research Center for Artificial Intelligence (DFKI). His main research area is logic in computer science.
