EmoffMeme: identifying offensive memes by leveraging underlying emotions

Multimedia Tools and Applications

Abstract

Facebook, Twitter, Instagram, and other social media sites allow anonymity and independence, so people exercise their right to free expression without fear of repercussions. However, in the absence of thorough surveillance, users have fallen prey to offensive content, trolls, and social media predators. Memes, a type of multimodal media, are becoming increasingly popular online. While most memes are meant to be humorous, some use dark humor to disseminate offensive content. Our research focuses on learning the dependency and correlation among three tasks: detecting offensive memes, classifying offensive memes into fine-grained categories, and detecting emotions in a meme. For this, we created EmoffMeme, a large-scale multimodal dataset for Hindi. We aim to gain insight into the hidden emotions of social media users by studying a meme’s text and image. We present an end-to-end multitask deep neural network based on CLIP (Contrastive Language-Image Pre-training) to solve these correlated tasks simultaneously. We also employ Multimodal Factorized Bilinear (MFB) pooling to build a single joint representation of a meme’s textual and visual parts. We demonstrate the effectiveness of our work through extensive experiments. The evaluation shows that the proposed multitask framework yields better performance on the primary task, offensiveness identification, with the help of the secondary task, emotion analysis.
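To make the fusion step concrete, the following is a minimal sketch of MFB pooling as it is commonly described: project both modalities into a shared space, take the element-wise product, sum-pool over windows of size k, then apply power and L2 normalization. It is not the authors' implementation. The function names, the toy dimensions, and the random weights (standing in for learned projection matrices and for CLIP text and image features) are all assumptions made for illustration.

```python
import math
import random


def project(vec, weights):
    """Multiply a length-d vector by a d x p weight matrix given as a list of rows."""
    cols = len(weights[0])
    return [sum(vec[r] * weights[r][c] for r in range(len(vec))) for c in range(cols)]


def mfb_pool(text_vec, image_vec, U, V, k):
    """Fuse two feature vectors with a sketch of MFB pooling.

    U has shape (len(text_vec), k * o) and V has shape (len(image_vec), k * o);
    the fused output has o entries.
    """
    # 1. Project both modalities into the same (k * o)-dimensional space.
    t = project(text_vec, U)
    v = project(image_vec, V)
    # 2. Element-wise product models interactions between the two modalities.
    joint = [a * b for a, b in zip(t, v)]
    # 3. Sum-pool over non-overlapping windows of size k.
    pooled = [sum(joint[i:i + k]) for i in range(0, len(joint), k)]
    # 4. Power normalization: signed square root of each entry.
    pooled = [math.copysign(math.sqrt(abs(z)), z) for z in pooled]
    # 5. L2 normalization (an all-zero vector is left unchanged).
    norm = math.sqrt(sum(z * z for z in pooled)) or 1.0
    return [z / norm for z in pooled]


# Hypothetical toy sizes; real text and image features would be much larger.
rng = random.Random(0)
text_dim, image_dim, k, o = 6, 8, 3, 4
U = [[rng.gauss(0, 1) for _ in range(k * o)] for _ in range(text_dim)]
V = [[rng.gauss(0, 1) for _ in range(k * o)] for _ in range(image_dim)]
text_vec = [rng.gauss(0, 1) for _ in range(text_dim)]
image_vec = [rng.gauss(0, 1) for _ in range(image_dim)]

fused = mfb_pool(text_vec, image_vec, U, V, k)
print(len(fused))
```

In a multitask setup of the kind the abstract describes, a fused vector like this would typically feed separate classification heads for each task; in practice the projections would be trained layers in a tensor library rather than plain Python lists.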

Data Availability

The dataset generated and analysed during the current study is available in the journal1_memes-A48B repository at the link: https://github.com/Gitanjali1801/EmoffMeme.git.

Code Availability

The code of the current study is available at the link: https://github.com/Gitanjali1801/EmoffMeme.git.

Notes

  1. To maintain the anonymity of any individual, we replaced actual names with Person-XYZ throughout the paper.

  2. https://download-all-images.mobilefirst.me/

  3. https://github.com/tesseract-ocr/tesseract

  4. https://github.com/FreddeFrallan/Multilingual-CLIP

  5. https://pytorch.org/

  6. https://github.com/google-research/bert/blob/master/multilingual.md

  7. The textual part of our corpus is in Hindi, but VisualBERT and LXMERT are pre-trained on English corpora. For these models only, we translated the Hindi text of our dataset into English with Google Translator and used the translated text for training.

Funding

The authors gratefully acknowledge the project “HELIOS - Hate, Hyperpartisan, and Hyperpluralism Elicitation and Observer System”, sponsored by Wipro AI.

Author information

Contributions

Gitanjali Kumari: Corpus creation, Algorithm design, Implementation, Experiments, Analysis, Writing - original draft. Dibyanayan Bandyopadhyay: Implementation, Experiments, Analysis, Writing - original draft. Asif Ekbal: Supervision, Algorithm conceptualization,

Corresponding author

Correspondence to Gitanjali Kumari.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interests about the work reported in this paper.

Conflict of Interests

1. Individual Privacy: To maintain the anonymity of any individual, we replaced actual names with Person-XYZ throughout the paper. In addition, we anonymized known faces in the visual part of the memes by masking them. The faces are masked only to preserve anonymity in the paper; during implementation, we used the original images.

2. Biases: Detecting and removing political and religious biases is an extensive research area. Previous annotation studies show that bias and subjectivity cannot be fully removed from the annotation process, even with an annotation scheme in place. Any biases detected in our dataset are unintentional, and we have no intention of harming any individual or group. We ensured that our data was collected in an equal and comparable manner in order to address concerns about political and religious bias. Furthermore, by using a keyword-based data-gathering technique, we ensured that the topics cover various issues relevant in the Indian context over the last seven years. We also made sure that the keywords covered all conceivable politicians, political organizations, young politicians, extreme groups, and religions, and were not prejudiced against any one group. Following previous work on removing biases from datasets during annotation, annotators were strictly instructed to base their decisions not on what they believe but on what the social media user wants to convey through the meme.

3. Misuse Potential: Researchers should be aware that our dataset might be abused to filter memes based on prejudices that may or may not be connected to demographics or other textual information. To prevent this, human moderation would be essential.

4. Intended Use: Our dataset is presented to encourage research on humorous memes on the internet. We believe that it represents a valuable resource when used appropriately.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumari, G., Bandyopadhyay, D. & Ekbal, A. EmoffMeme: identifying offensive memes by leveraging underlying emotions. Multimed Tools Appl 82, 45061–45096 (2023). https://doi.org/10.1007/s11042-023-14807-1

  • DOI: https://doi.org/10.1007/s11042-023-14807-1
