Abstract
Event extraction is a common task in applications such as text summarization and information retrieval. In this work, we propose a TF-IDF based approach for extracting keywords from the titles of Arabic news articles. These keywords are then used to extract the main events of each month with a Part-of-Speech (POS) co-occurrence based approach. Precision is computed by matching the extracted events against another news website. Results show that performance depends on the news category, and that the approach performs well for domain-specific categories such as economy.
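As a rough illustration of the two quantities named in the abstract, the following minimal sketch ranks the terms of each title by TF-IDF and computes precision as the share of extracted items found in a reference set. It is not the authors' implementation: the function names `tfidf_keywords` and `precision`, the whitespace tokenisation, the `top_k` cut-off, and the English toy titles are all assumptions made for the example. Real Arabic titles would need normalisation and stemming first, and the POS co-occurrence step is not shown.

```python
import math
from collections import Counter


def tfidf_keywords(titles, top_k=3):
    """Return the top_k highest TF-IDF terms for each title.

    Assumption: titles are already normalised and whitespace-separated.
    """
    docs = [title.split() for title in titles]
    n_docs = len(docs)

    # Document frequency: number of titles that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    ranked = []
    for tokens in docs:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        top = sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
        ranked.append(top)
    return ranked


def precision(extracted, reference):
    """Fraction of extracted items that also appear in the reference set."""
    if not extracted:
        return 0.0
    hits = sum(1 for item in extracted if item in reference)
    return hits / len(extracted)


# Hypothetical toy data, for illustration only.
toy_titles = ["oil prices rise", "oil exports fall", "football cup final"]
toy_keywords = tfidf_keywords(toy_titles, top_k=2)
```

In this toy data, "oil" appears in two of the three titles, so its inverse document frequency is lower and it ranks below the terms that are unique to a single title.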
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Chouigui, A., Khiroun, O.B., Elayeb, B. (2018). A TF-IDF and Co-occurrence Based Approach for Events Extraction from Arabic News Corpus. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2018. Lecture Notes in Computer Science, vol 10859. Springer, Cham. https://doi.org/10.1007/978-3-319-91947-8_27
DOI: https://doi.org/10.1007/978-3-319-91947-8_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91946-1
Online ISBN: 978-3-319-91947-8
eBook Packages: Computer Science (R0)