Abstract
This paper presents the results of Team Grenzlinie's experiments in the CONSTRAINT 2021 shared task, which consists of two subtasks. Subtask 1 is COVID-19 Fake News Detection in English, a binary classification task. For this subtask we use RoBERTa as the pre-trained model and attempt to build a graph from the news dataset. Our system achieves an accuracy of 98.64% and an F1-score of 98.64% on the test dataset. Subtask 2 is Hostile Post Detection in Hindi, a multi-label classification task. For this subtask we use XLM-RoBERTa as the pre-trained model. An adapted threshold is used to address the class imbalance in the data, and Bi-LSTM, LEAM, and LaSO approaches are applied to obtain richer semantic information. The final approach achieves an accuracy of 74.11% and a weighted F1-score of 81.77% on the test dataset.
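The abstract does not spell out how the adapted threshold is chosen. As a rough illustration only, the sketch below assumes one common variant: for each label, pick the decision threshold that maximises F1 on held-out validation predictions, so that rare labels are not tied to a fixed cut-off of 0.5. The function names (`f1_binary`, `fit_label_thresholds`, `predict_labels`), the threshold grid, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def f1_binary(y_true, y_pred):
    """F1-score for one binary label; returns 0.0 when there are no positives at all."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fit_label_thresholds(probs, y_true, grid=None):
    """Pick one threshold per label by maximising validation F1 (assumed criterion).

    probs:  (n_samples, n_labels) predicted probabilities, e.g. sigmoid outputs.
    y_true: (n_samples, n_labels) 0/1 ground-truth matrix.
    """
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)  # hypothetical search grid
    n_labels = probs.shape[1]
    thresholds = np.full(n_labels, 0.5)  # fall back to the usual 0.5 cut-off
    for j in range(n_labels):
        best = -1.0
        for t in grid:
            score = f1_binary(y_true[:, j], (probs[:, j] >= t).astype(int))
            if score > best:  # strict '>' keeps the smallest best threshold
                best, thresholds[j] = score, t
    return thresholds


def predict_labels(probs, thresholds):
    """Apply the per-label thresholds (broadcast across samples)."""
    return (probs >= thresholds).astype(int)


# Toy validation data: the second label is never scored above 0.5 by the model.
val_probs = np.array([[0.90, 0.32], [0.80, 0.22], [0.22, 0.12], [0.12, 0.03]])
val_true = np.array([[1, 1], [1, 1], [0, 0], [0, 0]])
thresholds = fit_label_thresholds(val_probs, val_true)
val_pred = predict_labels(val_probs, thresholds)
```

In this toy case a fixed threshold of 0.5 would never predict the second label, while the fitted threshold for that label falls below 0.5 and recovers its positives. Thresholds tuned this way should be fitted on validation data only, not on the test set.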
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, R., Zhou, X. (2021). Extracting Latent Information from Datasets in CONSTRAINT 2021 Shared Task. In: Chakraborty, T., Shu, K., Bernard, H.R., Liu, H., Akhtar, M.S. (eds) Combating Online Hostile Posts in Regional Languages during Emergency Situation. CONSTRAINT 2021. Communications in Computer and Information Science, vol 1402. Springer, Cham. https://doi.org/10.1007/978-3-030-73696-5_7
DOI: https://doi.org/10.1007/978-3-030-73696-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73695-8
Online ISBN: 978-3-030-73696-5
eBook Packages: Computer Science, Computer Science (R0)