Location Extraction in Disaster Tweets with a Model Trained on Past Data: Diverse Analysis

Rokuse, Toshihiro; Utsu, Keisuke; Uchida, Osamu

doi:10.1007/978-3-031-64037-7_9

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 706)

Included in the following conference series:

International Conference on Information Technology in Disaster Risk Reduction

Abstract

Rapid and accurate collection and dissemination of information is essential to minimizing damage in a large-scale disaster. Governments and local authorities responsible for these tasks actively use real-time platforms such as Twitter (now X) to gather and share information. However, the volume of information on social media increases rapidly during a large-scale disaster, so crucial, urgent information must be selected swiftly from among many tweets. Identifying the locations mentioned in these tweets is also essential to support decision-making by disaster responders. Sorting through this massive volume of posts quickly by hand is difficult, and attempts are being made to employ machine learning models for the task. However, disaster response demands a rapid reaction, while machine learning models need high-quality training data to perform effectively. This study considers using posts circulated during past disasters to reconcile these conflicting requirements. The research question addressed is whether the type of disaster affects the accuracy of extracting location mentions when a model is trained on data from past disasters. This paper reports verification results for cases where the training and target disasters are of the same type and of different types.

Acknowledgments

This research was supported by JSPS KAKENHI Grant Number 22K12277.

Author information

Authors and Affiliations

Department of Information Media Technology, Tokai University, Hiratsuka, Kanagawa, Japan
Toshihiro Rokuse, Keisuke Utsu & Osamu Uchida

Authors

Toshihiro Rokuse
Keisuke Utsu
Osamu Uchida

Corresponding author

Correspondence to Toshihiro Rokuse.

Editor information

Editors and Affiliations

University Grenoble Alps, Saint-Martin-d’Hères, France
Julie Dugdale
University of Agder, Kristiansand, Norway
Terje Gjøsæter
Tokai University, Hiratsuka, Japan
Osamu Uchida

About this paper

Cite this paper

Rokuse, T., Utsu, K., Uchida, O. (2024). Location Extraction in Disaster Tweets with a Model Trained on Past Data: Diverse Analysis. In: Dugdale, J., Gjøsæter, T., Uchida, O. (eds) Information Technology in Disaster Risk Reduction. ITDRR 2023. IFIP Advances in Information and Communication Technology, vol 706. Springer, Cham. https://doi.org/10.1007/978-3-031-64037-7_9

DOI: https://doi.org/10.1007/978-3-031-64037-7_9
Published: 30 June 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-64036-0
Online ISBN: 978-3-031-64037-7
eBook Packages: Computer Science, Computer Science (R0)
