Towards Deep Semantic Analysis of Hashtags

Bansal, Piyush; Bansal, Romil; Varma, Vasudeva

doi:10.1007/978-3-319-16354-3_50

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9022)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Hashtags are semantico-syntactic constructs used across various social networking and microblogging platforms to enable users to start a topic-specific discussion or classify a post into a desired category. Segmenting and linking the entities present within hashtags could therefore help in better understanding and extracting the information shared across social media. However, because hashtags lack space delimiters (e.g., #nsavssnowden), segmenting a hashtag into its constituent entities (“NSA” and “Edward Snowden” in this case) is not a trivial task. Most current state-of-the-art social media analytics systems, such as those for sentiment analysis and entity linking, tend either to ignore hashtags or to treat them as single words. In this paper, we present a context-aware approach that segments the entities in a hashtag and links each of them to a knowledge base (KB) entry, based on the context within the tweet. Our approach segments and links the entities in hashtags such that the coherence between the hashtag semantics and the tweet is maximized. To the best of our knowledge, no existing study addresses the issue of linking entities in hashtags for extracting semantic information. We evaluate our method on two different datasets and demonstrate its effectiveness in improving overall entity linking in tweets via the additional semantic information obtained by segmenting and linking the entities in a hashtag.
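
The idea in the abstract, choosing the segmentation and entity links that best agree with the rest of the tweet, can be illustrated with a small sketch. This is not the authors' method: the vocabulary `TOY_KB`, the context sets in `TOY_CONTEXT`, and the word-overlap `coherence` score are all hypothetical stand-ins chosen for illustration, and a real system would rely on an actual knowledge base and a learned relatedness measure.

```python
from functools import lru_cache

# Hypothetical toy vocabulary: token -> knowledge-base entry title.
# None marks a connective that is a valid token but not an entity.
TOY_KB = {
    "nsa": "National Security Agency",
    "vs": None,
    "snowden": "Edward Snowden",
    "snow": "Snow",
    "den": "Den (room)",
}

# Hypothetical context words associated with each KB entry.
TOY_CONTEXT = {
    "National Security Agency": {"surveillance", "leak", "agency", "privacy"},
    "Edward Snowden": {"leak", "whistleblower", "surveillance", "asylum"},
    "Snow": {"winter", "cold", "weather"},
    "Den (room)": {"house", "room"},
}


def segmentations(tag):
    """Return every way to split `tag` into tokens found in TOY_KB."""

    @lru_cache(maxsize=None)
    def split(start):
        if start == len(tag):
            return [[]]  # one way to segment the empty suffix
        results = []
        for end in range(start + 1, len(tag) + 1):
            token = tag[start:end]
            if token in TOY_KB:
                for rest in split(end):
                    results.append([token] + rest)
        return results

    return split(0)


def coherence(tokens, tweet_words):
    """Count context words shared between the tweet and the linked entries."""
    score = 0
    for token in tokens:
        entity = TOY_KB[token]
        if entity is not None:
            score += len(TOY_CONTEXT[entity] & tweet_words)
    return score


def segment_and_link(hashtag, tweet):
    """Pick the segmentation whose linked entities best match the tweet."""
    tag = hashtag.lstrip("#").lower()
    tweet_words = set(tweet.lower().split())
    candidates = segmentations(tag)
    if not candidates:
        return []
    best = max(candidates, key=lambda toks: coherence(toks, tweet_words))
    return [(tok, TOY_KB[tok]) for tok in best if TOY_KB[tok] is not None]


tweet = "the surveillance leak made him a whistleblower"
print(segment_and_link("#nsavssnowden", tweet))
# [('nsa', 'National Security Agency'), ('snowden', 'Edward Snowden')]
```

With this tweet, the reading "nsa / vs / snowden" scores higher than "nsa / vs / snow / den" because the tweet shares more context words with the entry for Edward Snowden than with the entries for snow or den. A tweet about winter weather would tip the same toy scorer toward the other split, which is the sense in which the segmentation depends on context.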

Author information

Authors and Affiliations

International Institute of Information Technology Hyderabad, Telangana, India
Piyush Bansal, Romil Bansal & Vasudeva Varma

Editor information

Editors and Affiliations

Vienna University of Technology, Institute of Software Technology and Interactive Systems, Favoritenstraße 9-11/188, 1040, Vienna, Austria
Allan Hanbury
Lumi, Semion Ltd., 111 Charterhouse Street, EC1M 6AW, London, UK
Gabriella Kazai
Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstraße 9-11/188, 1040, Vienna, Austria
Andreas Rauber
Universität Duisburg-Essen, Lotharstraße 65, 47057, Duisburg, Germany
Norbert Fuhr

Cite this paper

Bansal, P., Bansal, R., Varma, V. (2015). Towards Deep Semantic Analysis of Hashtags. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds) Advances in Information Retrieval. ECIR 2015. Lecture Notes in Computer Science, vol 9022. Springer, Cham. https://doi.org/10.1007/978-3-319-16354-3_50

DOI: https://doi.org/10.1007/978-3-319-16354-3_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16353-6
Online ISBN: 978-3-319-16354-3
eBook Packages: Computer Science, Computer Science (R0)
