Abstract
The increasing popularity of social media services has led to more and more people using Twitter. Millions of tweets, many of them noisy, propagate across the Internet every day. Twitter acts as a source of information for events and breaking news. However, it is very challenging for any person to manually extract useful information about important events from the endless stream of tweets. Hence, it is desirable to automate the whole process of event detection, so that important events can be identified in real time from a stream of tweets, as soon as possible after they actually happen. Most existing approaches focus on “What happened”. To define an event, answers to “When” and “Where” are also required. For handling emergency events, the location and time parameters play a very important role. This article proposes a faster location-based event detection approach that does not compromise accuracy and automatically extracts separate clusters concerning local or global events from real-time streaming data. The proposed approach consists of four major steps. In the first step, a new dynamic weighting scheme based on TF-IDF, named Conditional Term Frequency-Average Inverse Window Frequency (CTF-AIWF), is proposed to capture emerging keywords from the temporal dynamics of the data. Next, a new clustering algorithm named Edge Significance based Louvain Algorithm (ESBLA) is proposed to group keywords belonging to the same event. This clustering improves run-time performance by up to 50% while keeping quality (F1-score) comparable to the baseline models. In the third step, a new content-based location detection technique is proposed to detect the location of the event. This technique is able to handle common issues of microblogging data, such as informal text, abbreviated text, and misspelled keywords. Finally, Google Maps is used to visualize the events at the locations where they occur. This step speeds up decision-making about the detected events. For the experiments, tweets are collected in real time and stored in a MongoDB NoSQL database for processing.
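To make the idea behind the first step concrete, the following is a minimal sketch of a window-based analogue of TF-IDF for scoring emerging keywords. It is not the CTF-AIWF formula, which the abstract does not define. The function name `emerging_keyword_scores`, the use of tweet-level term frequency, and the smoothed logarithmic inverse window frequency are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def emerging_keyword_scores(windows):
    """Score the terms of the newest time window against earlier windows.

    `windows` is a list of time windows, oldest first. Each window is a
    list of tokenized tweets (lists of strings). The weighting below is a
    generic TF-IDF-style heuristic, NOT the paper's CTF-AIWF scheme.
    """
    *history, current = windows

    # Term frequency: number of tweets in the current window containing the term.
    tf = Counter()
    for tweet in current:
        tf.update(set(tweet))
    n_current = max(len(current), 1)

    # Window frequency: number of earlier windows in which the term appears.
    wf = Counter()
    for window in history:
        seen = set()
        for tweet in window:
            seen.update(tweet)
        wf.update(seen)
    n_history = len(history)

    scores = {}
    for term, count in tf.items():
        # Smoothed inverse window frequency: terms that are rare in past
        # windows receive a larger weight (assumed form).
        iwf = math.log((1 + n_history) / (1 + wf[term])) + 1.0
        scores[term] = (count / n_current) * iwf
    return scores


if __name__ == "__main__":
    sample = [
        [["traffic", "jam"], ["good", "morning"]],
        [["good", "morning"], ["traffic"]],
        [["earthquake", "delhi"], ["earthquake", "tremor"], ["good", "morning"]],
    ]
    ranked = sorted(emerging_keyword_scores(sample).items(), key=lambda kv: -kv[1])
    for term, score in ranked:
        print(f"{term}\t{score:.3f}")
```

Under this heuristic, a term that is frequent in the newest window but absent from earlier windows scores higher than a term that appears in every window, which is the behaviour an emerging-keyword detector needs.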
Data availability
Data will be available on reasonable request.
Ethics declarations
Conflict of interests
The authors declare that they have no known conflict of interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Singh, J., Pandey, D. & Singh, A.K. Event detection from real-time twitter streaming data using community detection algorithm. Multimed Tools Appl 83, 23437–23464 (2024). https://doi.org/10.1007/s11042-023-16263-3
DOI: https://doi.org/10.1007/s11042-023-16263-3