Abstract
This paper presents four mechanisms for ontology learning from Twitter data. The learning process involves identifying entities and relations in a specified Twitter data set, which are then used to produce an ontology. The first two methods considered, the Stanford-based and GATE-based ontology learning frameworks, are both semi-automated methods for identifying the relations in the desired ontology. Although the two frameworks effectively create an ontology-supported knowledge resource, they share a particular disadvantage: the time-consuming and cumbersome task of manually annotating relation extraction training data sets. As a result, two other ontology learning frameworks are proposed: one using regular expressions, which reduces the resource required, and one that combines Shortest Path Dependency parsing and Word Mover’s Distance to fully automate the process of creating relation extraction training data. All four are analysed and discussed in this paper.
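To make the shortest-path idea mentioned in the abstract concrete, the following is a minimal sketch of finding the shortest dependency path between two entity mentions. It is an illustration only, not the authors' implementation: the example tweet, the hand-written dependency edges, and the function name `shortest_dependency_path` are all assumptions, and a real pipeline would obtain the edges from a dependency parser.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the tokens on the shortest path between two tokens.

    edges: iterable of (head, dependent) pairs from a dependency parse,
    treated as undirected for the purpose of path finding.
    """
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)

    # Breadth-first search: the first path that reaches the target
    # has the fewest edges.
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # the two tokens are not connected


# Hypothetical, hand-written parse of the made-up tweet
# "Ebola outbreak reported in Liberia" as (head, dependent) pairs.
EDGES = [
    ("reported", "outbreak"),
    ("outbreak", "Ebola"),
    ("reported", "Liberia"),
    ("Liberia", "in"),
]

print(shortest_dependency_path(EDGES, "Ebola", "Liberia"))
# prints ['Ebola', 'outbreak', 'reported', 'Liberia']
```

The words on such a path are a compact candidate description of the relation between the two entities. Under the same assumptions, they could then be compared against known relation descriptions with a document distance such as Word Mover’s Distance.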
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Alajlan, S., Coenen, F., Mandya, A. (2020). From Semi-automated to Automated Methods of Ontology Learning from Twitter Data. In: Fred, A., Salgado, A., Aveiro, D., Dietz, J., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2019. Communications in Computer and Information Science, vol 1297. Springer, Cham. https://doi.org/10.1007/978-3-030-66196-0_10
DOI: https://doi.org/10.1007/978-3-030-66196-0_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66195-3
Online ISBN: 978-3-030-66196-0
eBook Packages: Computer Science, Computer Science (R0)