
Advancements in Text Subjectivity Analysis: From Simple Approaches to BERT-Based Models and Generalization Assessments

  • Conference paper
  • First Online:
Advances in Computational Collective Intelligence (ICCCI 2024)

Abstract

Text subjectivity is an important research topic due to its applications in domains such as sentiment analysis, opinion mining, social media monitoring, clinical research, and patient feedback analysis. While rule-based approaches dominated the field at the beginning of the 21st century, contemporary work relies on transformers, a neural network architecture designed for language modeling. This paper explores the performance of various BERT (Bidirectional Encoder Representations from Transformers) based models, including our own fine-tuned BERT model, and compares them with pre-built models. To assess their generalization abilities, we evaluated the models on benchmark datasets as well as on two synthetic datasets created using large language models. To ensure reproducibility, our implementation is publicly available at https://github.com/margitantal68/TextSubjectivity.
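As a hedged illustration (not code from the paper), one of the pre-built classifiers referenced in the notes below can be queried through the Hugging Face `pipeline` API, and its predictions scored against gold labels. The model name comes from the paper's notes; its exact output labels and the evaluation details are assumptions here, sketched only to make the setup concrete.

```python
# Hedged sketch: labeling sentences with a pre-built subjectivity
# classifier and scoring the predictions. The model name is taken from
# the paper's notes; output labels (e.g. SUBJECTIVE vs. NEUTRAL) are
# an assumption, not confirmed by the paper.

def predict_labels(sentences):
    """Label each sentence with the pre-built Hugging Face model."""
    from transformers import pipeline  # third-party, loaded lazily
    classifier = pipeline(
        "text-classification",
        model="cffl/bert-base-styleclassification-subjective-neutral",
    )
    return [prediction["label"] for prediction in classifier(sentences)]

def accuracy(predicted, expected):
    """Fraction of predicted labels that match the ground-truth labels."""
    if len(predicted) != len(expected):
        raise ValueError("prediction/label length mismatch")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)
```

Evaluating such a model on a dataset it was not trained on is one simple way to probe the generalization abilities the abstract refers to.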


Notes

  1. https://bard.google.com/chat
  2. https://www.ai21.com/
  3. https://pytorch.org
  4. https://huggingface.co/cffl/bert-base-styleclassification-subjective-neutral
  5. https://bard.google.com/chat
  6. https://docs.ai21.com/docs/jurassic-2-models


Author information


Corresponding author

Correspondence to Margit Antal.


7 Appendix

We provide the input prompt utilized for the text classification task in the GPT-3.5-turbo model.

prompt = f"""You are an expert linguist. You should decide whether a sentence is subjective or objective. A sentence is subjective if its content is based on or influenced by personal feelings, tastes, or opinions. Otherwise, the sentence is objective. More precisely, a sentence is subjective if one or more of the following conditions apply: 1. expresses an explicit personal opinion from the author (e.g., speculations to draw conclusions); 2. includes sarcastic or ironic expressions; 3. gives exhortations of personal auspices; 4. contains discriminating or downgrading expressions; 5. contains rhetorical figures that convey the author's opinion. Please classify the following text into one of the two classes: (SUBJECTIVE, OBJECTIVE). Please answer with a single word: OBJECTIVE or SUBJECTIVE"""

The instruction utilized for generating sentences is as follows:

prompt = f"""You are an expert linguist. Please generate 10 objective sentences. A sentence is considered objective when it presents information in a factual and unbiased manner, without expressing personal opinions, emotions, or interpretations. Otherwise, the sentence is subjective."""
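As a rough illustration (not taken from the paper), the classification prompt above could be sent to GPT-3.5-turbo through OpenAI's chat completions REST endpoint. The endpoint URL, payload fields, and the label-parsing helper below are assumptions sketched with the Python standard library only; the paper does not specify its client code.

```python
# Hypothetical sketch: querying GPT-3.5-turbo with the classification
# prompt shown above. Endpoint, payload shape, and parsing are
# assumptions; the OPENAI_API_KEY environment variable must be set.
import json
import os
import urllib.request

def normalize_label(reply: str) -> str:
    """Map the model's free-form one-word answer onto the two classes."""
    reply = reply.strip().upper()
    if "SUBJECTIVE" in reply:
        return "SUBJECTIVE"
    if "OBJECTIVE" in reply:
        return "OBJECTIVE"
    raise ValueError(f"unexpected model reply: {reply!r}")

def classify(sentence: str, system_prompt: str,
             model: str = "gpt-3.5-turbo") -> str:
    """Send one sentence to the chat completions endpoint and parse it."""
    payload = {
        "model": model,
        "temperature": 0,  # deterministic single-word answers
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": sentence},
        ],
    }
    request = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return normalize_label(body["choices"][0]["message"]["content"])
```

Keeping temperature at 0 and normalizing the reply guards against the model returning punctuation or a full sentence instead of the requested single word.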

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Antal, M., Buza, K., Nemes, S. (2024). Advancements in Text Subjectivity Analysis: From Simple Approaches to BERT-Based Models and Generalization Assessments. In: Nguyen, NT., et al. Advances in Computational Collective Intelligence. ICCCI 2024. Communications in Computer and Information Science, vol 2165. Springer, Cham. https://doi.org/10.1007/978-3-031-70248-8_19


  • DOI: https://doi.org/10.1007/978-3-031-70248-8_19


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70247-1

  • Online ISBN: 978-3-031-70248-8

  • eBook Packages: Computer Science, Computer Science (R0)
