Abstract
Automatically identifying the intended meaning of tags is a challenge in large image collections, where human authors assign tags driven by emotional or professional motivations. Algorithms for automatic tag disambiguation need “golden” collections of manually created tags to establish baselines for accuracy assessment. Here we show how to use the MIRFLICKR-25000 collection to evaluate the performance of our algorithm for tag sense disambiguation, which identifies the meanings of image tags based on WordNet or Wikipedia. We present three types of observations on the disambiguated tags: (i) accuracy evaluation, (ii) evaluation of the semantic similarity of individual tags to the image category, and (iii) evaluation of the semantic similarity of an image tagset to the image category, using different word embedding models for the latter two. We show how word embeddings create a specific baseline so that the results can be compared. The accuracy we achieve is 78.6%.
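The three kinds of observation listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy three-dimensional vectors, the function names, the choice of cosine similarity, and the averaging of tag vectors to represent a tagset are all assumptions made for illustration. A real evaluation would load a pretrained word embedding model instead of `TOY_EMBEDDINGS`.

```python
import math

# Hypothetical toy embeddings (assumption); a real run would use a pretrained model.
TOY_EMBEDDINGS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
    "animals": [0.85, 0.15, 0.05],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def accuracy(predicted, gold):
    """(i) Share of gold-annotated tags whose predicted sense matches the gold sense."""
    correct = sum(1 for tag, sense in gold.items() if predicted.get(tag) == sense)
    return correct / len(gold)


def tag_to_category(tag, category, emb):
    """(ii) Similarity of one tag to the image category."""
    return cosine(emb[tag], emb[category])


def tagset_to_category(tags, category, emb):
    """(iii) Similarity of a whole tagset to the category.

    Assumption: the tagset is represented by the mean of its known tag vectors.
    """
    known = [emb[t] for t in tags if t in emb]
    if not known:
        return 0.0
    return cosine(mean_vector(known), emb[category])


if __name__ == "__main__":
    print(tag_to_category("dog", "animals", TOY_EMBEDDINGS))
    print(tagset_to_category(["dog", "puppy", "car"], "animals", TOY_EMBEDDINGS))
    print(accuracy({"bank": "bank.n.01"}, {"bank": "bank.n.01"}))
```

Averaging is only one way to build a tagset vector; it lets an off-topic tag such as "car" pull the tagset away from the category, which is the kind of effect a tagset-level score is meant to expose.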
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kanishcheva, O., Nikolova, I., Angelova, G. (2018). Evaluation of Automatic Tag Sense Disambiguation Using the MIRFLICKR Image Collection. In: Agre, G., van Genabith, J., Declerck, T. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2018. Lecture Notes in Computer Science, vol 11089. Springer, Cham. https://doi.org/10.1007/978-3-319-99344-7_6
DOI: https://doi.org/10.1007/978-3-319-99344-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99343-0
Online ISBN: 978-3-319-99344-7
eBook Packages: Computer Science, Computer Science (R0)