Linguistic perspectives in deciphering citation function classification

Scientometrics

Abstract

Understanding citations in context is a complex task in information science and is critical for bibliometric analysis. The study of citation contexts and their types has been a central issue in recent work on citations. In this paper, we present an experiment on the semantic annotation of citation contexts using a rule-based approach. We processed articles from seven PLOS journals and performed semantic annotation of citation contexts based on linguistic resources we constructed. We built on previous work on verb form analysis, n-grams, and semantic category modeling in the form of a linguistic ontology. Based on our observations, we propose directions for building semantically annotated corpora. The intermediate results lead us to formulate hypotheses on the relation between the IMRaD structure and certain semantic categories. Furthermore, our results demonstrate the semantic richness of citation contexts and underscore the importance of access to full-text articles for ontology population in open science. The findings suggest that semantic categories vary across disciplines and rhetorical structures, which calls for further exploration with larger and more diverse datasets.

References

  • Abu-Jbara, A., Ezra, J., & Radev, D. (2013). Purpose and polarity of citation: Towards NLP-based bibliometrics. In Proceedings of the 2013 conference of the North American chapter of the association for computational linguistics: Human language technologies (pp. 596–606). https://aclanthology.org/N13-1067

  • Agarwal, S., Choubey, L., & Yu, H. (2010). Automatically classifying the role of citations in biomedical articles. In AMIA annual symposium proceedings (Vol. 2010, p. 11).

  • Bertin, M., & Atanassova, I. (2015, March). Factorial correspondence analysis applied to citation contexts. In 2nd international workshop on bibliometric enhanced information retrieval (BIR2015) at the 37th European conference on information retrieval (ECIR-2015) (Vol. 1344, pp. 22–29), Vienna, Austria. Retrieved from https://hal.archives-ouvertes.fr/hal-01940804

  • Bertin, M., & Atanassova, I. (2016). Multiple in-text reference aggregation phenomenon. In Proceedings of the 3rd workshop on bibliometric-enhanced information retrieval co-located with 38th European conference on information retrieval (ECIR 2016) (pp. 14–22). Padua, Italy.

  • Bertin, M., & Atanassova, I. (2023a, July). Contextual analysis of citations in context using rule-based approaches. Best soups are made in old pots. In 19th international conference of the international society for scientometrics and informetrics (ISSI 2023), Bloomington, IN, USA.

  • Bertin, M., & Atanassova, I. (2023, November). Semantic annotation of PLoS journal citation contexts: Zenodo. Retrieved from https://doi.org/10.5281/zenodo.10140552

  • Bertin, M., Atanassova, I., & Desclés, J.-P. (2009, May). Automatic analysis of author judgment in scientific articles based on semantic annotation. In 22nd international Florida artificial intelligence research society conference (FLAIRS22). Sanibel Island, Florida, USA, AAAI Press. Retrieved from https://hal.archives-ouvertes.fr/hal-01885113

  • Bertin, M., Atanassova, I., Gingras, Y., & Larivière, V. (2016). The invariant distribution of references in scientific articles. Journal of the Association for Information Science and Technology, 67(1), 164–177. https://doi.org/10.1002/asi.23367

  • Bertin, M., Jonin, P., Armetta, F., & Atanassova, I. (2019, September). Determining citation blocks using end-to-end neural coreference resolution model. In 17th international conference of the international society for scientometrics and informetrics. Rome, Italy. Retrieved from https://hal.archives-ouvertes.fr/hal-01953961

  • Bertin, M., Larivière, V., Gingras, Y., & Atanassova, I. (2014). The linguistic context of citations. 10th iteration (2014): The future of science mapping, places & spaces: Mapping science. Indiana, United States. Retrieved from https://hal.science/hal-01954672

  • Bordignon, F. (2020). Self-correction of science: A comparative study of negative citations and post-publication peer review. Scientometrics, 124(2), 1225–1239.

  • Bornmann, L., & Daniel, H.-D. (2008). Functional use of frequently and infrequently cited articles in citing publications: A content analysis of citations to articles with low and high citation counts. European Science Editing, 34(2), 35–38.

  • Boyack, K. W., van Eck, N. J., Colavizza, G., & Waltman, L. (2018). Characterizing in-text citations in scientific articles: A large-scale analysis. Journal of Informetrics, 12(1), 59–73.

  • Catalini, C., Lacetera, N., & Oettl, A. (2015). The incidence and role of negative citations in science. Proceedings of the National Academy of Sciences, 112(45), 13823–13826.

  • Cohan, A., Ammar, W., Van Zuylen, M., & Cady, F. (2019). Structural scaffolds for citation intent classification in scientific publications. arXiv preprint http://arxiv.org/abs/1904.01608

  • Cronin, B. (1981). The need for a theory of citing. Journal of Documentation, 37(1), 16–24. https://doi.org/10.1108/eb026703

  • Desclés, J.-P. (1997). Systèmes d’exploration contextuelle. Co-texte et calcul du sens, 1997, 215–232.

  • Desclés, J.-P., Jouis, C., Oh, H.-G., & Reppert, D. (1991). Exploration contextuelle et sémantique: un système expert qui trouve les valeurs sémantiques des temps de l’indicatif dans un texte. Knowledge Modeling and Expertise Transfer, 1, 371–400.

  • Dong, C., & Schäfer, U. (2011). Ensemble-style self-training on citation classification. In Proceedings of 5th international joint conference on natural language processing (pp. 623–631).

  • Ferrod, R., Di Caro, L., & Schifanella, C. (2021). Structured semantic modeling of scientific citation intents. In The semantic web: 18th international conference, ESWC 2021, virtual event, June 6–10, 2021, proceedings 18 (pp. 461–476).

  • Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378.

  • Hernandez-Alvarez, M., Soriano, J. M. G., & Martínez-Barco, P. (2017). Citation function, polarity and influence classification. Natural Language Engineering, 23(4), 561–588.

  • Jiang, X., & Chen, J. (2023). Contextualised segment-wise citation function classification. Scientometrics, 128(9), 5117–5158.

  • Jochim, C., & Schütze, H. (2012). Towards a generic and flexible citation classifier based on a faceted classification scheme. In Proceedings of COLING 2012 (pp. 1343–1358).

  • Jurgens, D., Kumar, S., Hoover, R., McFarland, D., & Jurafsky, D. (2018). Measuring the evolution of a scientific field through citation frames. Transactions of the Association for Computational Linguistics, 6, 391–406.

  • Kunnath, S. N., Herrmannova, D., Pride, D., & Knoth, P. (2021). A meta-analysis of semantic classification of citations. Quantitative Science Studies, 2(4), 1170–1215.

  • Kunnath, S. N., Pride, D., Gyawali, B., & Knoth, P. (2020). Overview of the 2020 WOSP 3C citation context classification task. In Proceedings of the 8th international workshop on mining scientific publications (pp. 75–83).

  • Landis, J. R., & Koch, G. G. (1977). An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics, 33(2), 363–374.

  • Lauscher, A., Ko, B., Kuehl, B., Johnson, S., Jurgens, D., Cohan, A., & Lo, K. (2021). Multicite: Modeling realistic citations requires moving beyond the single-sentence single-label setting. arXiv preprint http://arxiv.org/abs/2107.00414

  • Li, X., He, Y., Meyers, A., & Grishman, R. (2013). Towards fine-grained citation function classification. In Proceedings of the international conference recent advances in natural language processing RANLP 2013 (pp. 402–407).

  • Liu, X., Zhang, J., & Guo, C. (2013). Full-text citation analysis: A new method to enhance scholarly networks. Journal of the American Society for Information Science and Technology, 64(9), 1852–1863.

  • Maricić, S., Spaventi, J., Pavicić, L., & Pifat-Mrzljak, G. (1998). Citation context versus the frequency counts of citation histories. Journal of the American Society for Information Science, 49(6), 530–540.

  • Meyers, A. (2013). Contrasting and corroborating citations in journal articles. In Proceedings of the international conference recent advances in natural language processing RANLP 2013 (pp. 460–466).

  • Peroni, S., & Shotton, D. (2012). FaBiO and CiTO: Ontologies for describing bibliographic resources and citations. Journal of Web Semantics, 17, 33–43.

  • Pride, D., & Knoth, P. (2020). An authoritative approach to citation classification. In Proceedings of the ACM/IEEE joint conference on digital libraries in 2020 (pp. 337–340).

  • Saier, T., & Färber, M. (2020). unarXive: A large scholarly data set with publications’ full-text, annotated in-text citations, and links to metadata. Scientometrics, 125(3), 3085–3108.

  • Shahid, A., Afzal, M. T., Saleem, M. Q., Idrees, M., & Omer, M. K. (2021). Extension of direct citation model using in-text citations. Computers, Materials & Continua, 66(3).

  • Su, X., Prasad, A., Kan, M.-Y., & Sugiyama, K. (2019). Neural multi-task learning for citation function and provenance. In 2019 ACM/IEEE joint conference on digital libraries (JCDL) (pp. 394–395).

  • Tahamtan, I., & Bornmann, L. (2022). The social systems citation theory (SSCT): A proposal to use the social systems theory for conceptualizing publications and their citations links. Profesional de la información, 31(4).

  • Taylor, R., Kardas, M., Cucurull, G., Scialom, T., Hartshorn, A., Saravia, E., & Stojnic, R. (2022). Galactica: A large language model for science. arXiv preprint http://arxiv.org/abs/2211.09085

  • Teufel, S., Siddharthan, A., & Tidhar, D. (2006). Automatic classification of citation function. In Proceedings of the 2006 conference on empirical methods in natural language processing (pp. 103–110).

  • Tuarob, S., Kang, S. W., Wettayakorn, P., Pornprasit, C., Sachati, T., Hassan, S.-U., & Haddawy, P. (2019). Automatic classification of algorithm citation functions in scientific literature. IEEE Transactions on Knowledge and Data Engineering, 32(10), 1881–1896.

  • Zhang, Y., Wang, Y., Sheng, Q. Z., Mahmood, A., Zhang, W. E., & Zhao, R. (2021). TDM-CFC: Towards document-level multi-label citation function classification. In Web information systems engineering – WISE 2021: 22nd international conference on web information systems engineering, WISE 2021, Melbourne, VIC, Australia, October 26–29, 2021, proceedings, part II (pp. 363–376).

  • Zhu, X., Turney, P., Lemire, D., & Vellino, A. (2015). Measuring academic influence: Not all citations are equal. Journal of the Association for Information Science and Technology, 66(2), 408–427.

Author information

Corresponding author

Correspondence to Marc Bertin.

Ethics declarations

Conflict of interest

This work was supported by ANR-20-CE38-0003-01. The authors declare that they have no conflict of interest. This paper is an extended version of Bertin and Atanassova (2023a), presented at the 19th International Conference of the International Society for Scientometrics and Informetrics, held from July 2–5, 2023, in Bloomington, IN, USA.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bertin, M., Atanassova, I. Linguistic perspectives in deciphering citation function classification. Scientometrics 129, 6301–6313 (2024). https://doi.org/10.1007/s11192-024-05082-4

  • DOI: https://doi.org/10.1007/s11192-024-05082-4