Abstract
Word stemming is a natural language morphological process that reduces derived words to their respective root words. Because of its importance, many Malay word stemming algorithms have been developed in past years. However, previous researchers focused only on improving affixation word stemming, using various stemming approaches. No reduplication word stemming algorithm has been developed for the Malay language thus far. In Malay, affixation and reduplication are both forms of word derivation, each with its own morphological rules. Therefore, using affixation word stemming to stem reduplication words is considered inappropriate. This paper presents a proposed reduplication word stemming algorithm that stems full, rhythmic and partial reduplication words to their respective root words. The proposed algorithm uses Rules Application Order with a Stemming Errors Reducer to stem these reduplication words. Malay online newspaper articles were used to evaluate the proposed algorithm. The experimental results showed that the proposed stemming algorithm is able to stem full, rhythmic, affixed and partial reduplication words with better stemming accuracy. Hence, future improvements to Malay word stemming algorithms should include both affixation and reduplication word stemming.
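The idea of applying reduplication rules in a fixed order and checking candidates against known root words can be illustrated with a minimal sketch. This is not the authors' algorithm: the rule order, the function names, the tiny ROOT_WORDS set and the dictionary check standing in for an error reducer are all assumptions made for illustration, and affixed reduplication is not handled.

```python
# Illustrative sketch only; NOT the algorithm from the paper.
# ROOT_WORDS is a hypothetical mini dictionary used to reject bad stems.
ROOT_WORDS = {"buku", "sayur", "kuih", "balik", "laki", "tamu", "jari"}


def stem_full(word):
    """Full reduplication: both halves are identical, e.g. buku-buku."""
    left, sep, right = word.partition("-")
    if sep and left == right:
        return left
    return None


def stem_rhythmic(word):
    """Rhythmic reduplication: halves differ, e.g. sayur-mayur.

    Assumption: the root is whichever half appears in the dictionary,
    with the left half checked first.
    """
    left, sep, right = word.partition("-")
    if sep and left != right:
        for candidate in (left, right):
            if candidate in ROOT_WORDS:
                return candidate
    return None


def stem_partial(word):
    """Partial reduplication: consonant + 'e' + same consonant, e.g. lelaki.

    The dictionary check keeps ordinary words such as 'tetap' intact.
    """
    if len(word) > 3 and word[1] == "e" and word[0] == word[2]:
        candidate = word[2:]
        if candidate in ROOT_WORDS:
            return candidate
    return None


# Assumed rule order: full, then rhythmic, then partial.
RULES = (stem_full, stem_rhythmic, stem_partial)


def stem(word):
    """Return the first stem produced by the ordered rules, else the word."""
    word = word.lower()
    for rule in RULES:
        root = rule(word)
        if root is not None:
            return root
    return word


if __name__ == "__main__":
    for w in ["buku-buku", "sayur-mayur", "bolak-balik", "lelaki", "tetap"]:
        print(w, "->", stem(w))
```

In this sketch the dictionary lookup is what prevents over-stemming: "tetap" matches the partial reduplication pattern, but its candidate stem is not a known root, so the word is returned unchanged.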
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kassim, M.N., Maarof, M.A., Zainal, A. (2014). Enhanced Rules Application Order Approach to Stem Reduplication Words in Malay Texts. In: Herawan, T., Ghazali, R., Deris, M. (eds) Recent Advances on Soft Computing and Data Mining. Advances in Intelligent Systems and Computing, vol 287. Springer, Cham. https://doi.org/10.1007/978-3-319-07692-8_62
DOI: https://doi.org/10.1007/978-3-319-07692-8_62
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07691-1
Online ISBN: 978-3-319-07692-8
eBook Packages: Engineering, Engineering (R0)