
Advancements in Text Subjectivity Analysis: From Simple Approaches to BERT-Based Models and Generalization Assessments

  • Conference paper
  • First Online:
Advances in Computational Collective Intelligence (ICCCI 2024)

Abstract

Text subjectivity is an important research topic due to its applications in domains such as sentiment analysis, opinion mining, social media monitoring, clinical research, and patient feedback analysis. While rule-based approaches dominated the field at the beginning of the 21st century, contemporary work relies on transformers, a neural network architecture designed for language modeling. This paper explores the performance of various BERT (Bidirectional Encoder Representations from Transformers) based models, including our own fine-tuned BERT model, and compares them with pre-built models. To assess their generalization abilities, we evaluated the models on benchmark datasets as well as on two synthetic datasets created using large language models. To ensure reproducibility, our implementation is publicly available at https://github.com/margitantal68/TextSubjectivity.
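As a hedged illustration (not code from the paper), one of the pre-built classifiers referenced in the notes below can be queried through the Hugging Face `pipeline` API, and its predictions scored against gold labels. The model name comes from the paper's notes; its exact output labels and the evaluation details are assumptions here, sketched only to make the setup concrete.

```python
# Hedged sketch: labeling sentences with a pre-built subjectivity
# classifier and scoring the predictions. The model name is taken from
# the paper's notes; output labels (e.g. SUBJECTIVE vs. NEUTRAL) are
# an assumption, not confirmed by the paper.

def predict_labels(sentences):
    """Label each sentence with the pre-built Hugging Face model."""
    from transformers import pipeline  # third-party, loaded lazily
    classifier = pipeline(
        "text-classification",
        model="cffl/bert-base-styleclassification-subjective-neutral",
    )
    return [prediction["label"] for prediction in classifier(sentences)]

def accuracy(predicted, expected):
    """Fraction of predicted labels that match the ground-truth labels."""
    if len(predicted) != len(expected):
        raise ValueError("prediction/label length mismatch")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)
```

Evaluating such a model on a dataset it was not trained on is one simple way to probe the generalization abilities the abstract refers to.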


Notes

  1. https://bard.google.com/chat
  2. https://www.ai21.com/
  3. https://pytorch.org
  4. https://huggingface.co/cffl/bert-base-styleclassification-subjective-neutral
  5. https://bard.google.com/chat
  6. https://docs.ai21.com/docs/jurassic-2-models


Author information


Corresponding author

Correspondence to Margit Antal.


7 Appendix

We provide the input prompt utilized for the text classification task in the GPT-3.5-turbo model.

prompt = f"""You are an expert linguist. You should decide whether a sentence is subjective or objective. A sentence is subjective if its content is based on or influenced by personal feelings, tastes, or opinions. Otherwise, the sentence is objective. More precisely, a sentence is subjective if one or more of the following conditions apply: 1. expresses an explicit personal opinion from the author (e.g., speculations to draw conclusions); 2. includes sarcastic or ironic expressions; 3. gives exhortations of personal auspices; 4. contains discriminating or downgrading expressions; 5. contains rhetorical figures that convey the author's opinion. Please classify the following text into one of the two classes: (SUBJECTIVE, OBJECTIVE). Please answer with a single word: OBJECTIVE or SUBJECTIVE"""

The instruction utilized for generating sentences is as follows:

prompt = f"""You are an expert linguist. Please generate 10 objective sentences. A sentence is considered objective when it presents information in a factual and unbiased manner, without expressing personal opinions, emotions, or interpretations. Otherwise, the sentence is subjective."""
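As a rough illustration (not taken from the paper), the classification prompt above could be sent to GPT-3.5-turbo through OpenAI's chat completions REST endpoint. The endpoint URL, payload fields, and the label-parsing helper below are assumptions sketched with the Python standard library only; the paper does not specify its client code.

```python
# Hypothetical sketch: querying GPT-3.5-turbo with the classification
# prompt shown above. Endpoint, payload shape, and parsing are
# assumptions; the OPENAI_API_KEY environment variable must be set.
import json
import os
import urllib.request

def normalize_label(reply: str) -> str:
    """Map the model's free-form one-word answer onto the two classes."""
    reply = reply.strip().upper()
    if "SUBJECTIVE" in reply:
        return "SUBJECTIVE"
    if "OBJECTIVE" in reply:
        return "OBJECTIVE"
    raise ValueError(f"unexpected model reply: {reply!r}")

def classify(sentence: str, system_prompt: str,
             model: str = "gpt-3.5-turbo") -> str:
    """Send one sentence to the chat completions endpoint and parse it."""
    payload = {
        "model": model,
        "temperature": 0,  # deterministic single-word answers
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": sentence},
        ],
    }
    request = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return normalize_label(body["choices"][0]["message"]["content"])
```

Keeping temperature at 0 and normalizing the reply guards against the model returning punctuation or a full sentence instead of the requested single word.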

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Antal, M., Buza, K., Nemes, S. (2024). Advancements in Text Subjectivity Analysis: From Simple Approaches to BERT-Based Models and Generalization Assessments. In: Nguyen, NT., et al. Advances in Computational Collective Intelligence. ICCCI 2024. Communications in Computer and Information Science, vol 2165. Springer, Cham. https://doi.org/10.1007/978-3-031-70248-8_19


  • DOI: https://doi.org/10.1007/978-3-031-70248-8_19


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70247-1

  • Online ISBN: 978-3-031-70248-8

  • eBook Packages: Computer Science, Computer Science (R0)
