Ontology Forecasting in Scientific Literature: Semantic Concepts Prediction Based on Innovation-Adoption Priors

  • Conference paper
Knowledge Engineering and Knowledge Management (EKAW 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10024)

Abstract

The ontology engineering research community has focused for many years on supporting the creation, development and evolution of ontologies. Ontology forecasting, which aims at predicting semantic changes in an ontology, represents instead a new challenge. In this paper, we contribute to this novel endeavour by focusing on the task of forecasting semantic concepts in the research domain. Ontologies representing scientific disciplines contain only research topics that are already popular enough to be selected by human experts or automatic algorithms. They are thus unfit to support tasks that require the ability to describe and explore the forefront of research, such as trend detection and horizon scanning. We address this issue by introducing the Semantic Innovation Forecast (SIF) model, which predicts new concepts of an ontology at time \(t+1\) using only data available at time \(t\). Our approach relies on lexical innovation and adoption information extracted from historical data. We evaluated the SIF model on a very large dataset consisting of over one million scientific papers belonging to the Computer Science domain. The outcomes show that the proposed approach offers a competitive boost in mean average precision-at-ten compared to the baselines when forecasting over 5 years.
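
The abstract reports results as mean average precision-at-ten but does not spell out the formula. The sketch below shows one common way such a metric is computed, assuming each forecast is a ranked list of candidate concepts without duplicates and the ground truth is the set of concepts that actually appeared. The function names and the example topic labels are hypothetical and are not taken from the paper, whose exact definition may differ.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Average precision over the top-k items of one ranked forecast.

    Assumes `ranked` contains no duplicate items.
    """
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            # Precision measured at the rank of each correct prediction.
            score += hits / rank
    # Normalise by the best achievable number of hits within the top k.
    return score / min(len(relevant), k)


def mean_average_precision_at_k(
    runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 10
) -> float:
    """Mean of average precision-at-k over several forecasts (e.g. one per year)."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


# Hypothetical example: two forecasts with made-up topic labels.
forecasts = [
    (["topic a", "topic b", "topic c", "topic d"], {"topic a", "topic c"}),
    (["topic x", "topic y"], {"topic y"}),
]
map_at_10 = mean_average_precision_at_k(forecasts, k=10)
```

Normalising by `min(len(relevant), k)` is one of several conventions in use; dividing by the number of hits instead would give different values for the same rankings.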

Notes

  1. http://academic.research.microsoft.com/.

  2. Scopus, https://www.elsevier.com/solutions/scopus.

  3. Notice that we are following a one-step memory approach; further historical data could be used in future research.

  4. The data generated in the evaluation are available on request at http://technologies.kmi.open.ac.uk/rexplore/ekaw2016/OF/.

Acknowledgements

We would like to thank Elsevier BV and Springer DE for providing us with access to their large repositories of scholarly data.

Corresponding author

Correspondence to Francesco Osborne.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Cano-Basave, A.E., Osborne, F., Salatino, A.A. (2016). Ontology Forecasting in Scientific Literature: Semantic Concepts Prediction Based on Innovation-Adoption Priors. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_4

  • DOI: https://doi.org/10.1007/978-3-319-49004-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49003-8

  • Online ISBN: 978-3-319-49004-5

  • eBook Packages: Computer Science, Computer Science (R0)
