Agricultural Domain-Specific Jargon Words Identification in Amharic Text

  • Conference paper
Advances of Science and Technology (ICAST 2021)

Abstract

Domain-specific jargon words are words used in the formal communication of a particular domain, both among domain experts and with non-experts; however, they are difficult for non-experts and the wider society to understand. Experts in an organization use jargon in scientific and science communication to maintain the communication protocol of their domain. Domain-specific Amharic jargon words make it harder for people outside the domain to grasp the main theme of content disseminated through science communication. We followed a design science research approach in this study. We prepared a knowledge base containing domain-specific Amharic jargon words and their meanings. For model development, we employed three machine learning classifiers, Support Vector Machine, Artificial Neural Network, and Naïve Bayes, with TF-IDF feature selection; they achieved classification accuracies of 96.2%, 95.2%, and 94.7%, respectively. The knowledge-based system performs best when fewer test sentences are entered: for inputs of 20, 40, 60, and 80 test sentences, we observed accuracies of 88.2%, 86.7%, 85.4%, and 83.1%, respectively. Identification of domain-specific Amharic jargon words is thus performed with a hybrid of machine learning and a knowledge base, and we observed promising results with this hybrid for identifying jargon words in jargon-heavy text.
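The two ingredients named in the abstract, a knowledge base mapping jargon words to meanings and TF-IDF weighting of terms, can be illustrated with a minimal sketch. This is not the authors' implementation: the knowledge-base entries are English stand-ins for the Amharic entries, the function names `tokenize`, `find_jargon`, and `tfidf` are hypothetical, the tokenizer is a naive whitespace split, and the classifier stage (SVM, ANN, or Naïve Bayes) is omitted.

```python
import math
from collections import Counter

# Hypothetical knowledge base: jargon word -> plain-language meaning.
# English stand-ins; the paper's knowledge base holds Amharic entries.
KNOWLEDGE_BASE = {
    "tillering": "growth of side shoots from the base of a cereal plant",
    "vernalization": "exposure to cold that triggers flowering",
}

PUNCT = ".,;:!?"


def tokenize(sentence):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    tokens = (t.strip(PUNCT).lower() for t in sentence.split())
    return [t for t in tokens if t]


def find_jargon(sentence, kb=KNOWLEDGE_BASE):
    """Return (word, meaning) pairs for tokens present in the knowledge base."""
    return [(t, kb[t]) for t in tokenize(sentence) if t in kb]


def tfidf(docs):
    """Plain TF-IDF per document: tf = count / doc length, idf = log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    weights = []
    for doc in tokenized:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights
```

In a hybrid setup along the lines the abstract describes, vectors such as those returned by `tfidf` would feed a trained classifier, while `find_jargon` supplies the meaning of each jargon word found in the knowledge base. How the paper actually combines the two stages is not stated on this page, so that wiring is left out here.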

Acknowledgment

The work reported in this paper was made possible by the great contributions of agricultural domain experts, scholars, and the agrarian community in Ethiopia.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Lake, M., Tegegne, T. (2022). Agricultural Domain-Specific Jargon Words Identification in Amharic Text. In: Berihun, M.L. (eds) Advances of Science and Technology. ICAST 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 411. Springer, Cham. https://doi.org/10.1007/978-3-030-93709-6_27

  • DOI: https://doi.org/10.1007/978-3-030-93709-6_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93708-9

  • Online ISBN: 978-3-030-93709-6

  • eBook Packages: Computer Science, Computer Science (R0)
