
A Methodology for Evaluating Arabic Machine Translation Systems

Abstract.

This paper presents a methodology for evaluating Arabic Machine Translation (MT) systems. We are specifically interested in evaluating lexical coverage, grammatical coverage, semantic correctness, and the correctness of pronoun resolution. The methodology is statistical and builds on earlier work on evaluating MT lexicons, in which each word sense is weighted by its importance to a given application domain, so that the presence or absence of a sense in the lexicon affects the MT system's lexical quality and, in turn, the overall quality of its output. The same idea is generalized in this paper so as to apply to grammatical coverage, semantic correctness, and the correctness of pronoun resolution. The approach has been implemented and applied to four commercial English-Arabic MT systems, and the results of evaluating these systems are presented for the domain of the Internet and Arabization.
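As a rough illustration of the importance-weighting idea behind the methodology, the following Python sketch (not the authors' implementation; the item names, weights, and score are hypothetical) computes a coverage score in which each test item, whether a word sense, a grammatical structure, a sentence judged for semantic correctness, or a pronoun-resolution case, contributes in proportion to its importance in the application domain:

# Minimal sketch of an importance-weighted coverage score.
# The weights and items below are hypothetical illustrations, not data from the paper.
def weighted_coverage(items):
    """items: iterable of (importance_weight, handled_correctly) pairs."""
    total = sum(weight for weight, _ in items)
    covered = sum(weight for weight, ok in items if ok)
    return covered / total if total else 0.0

# Hypothetical word senses from an Internet/Arabization test set,
# weighted by how much each sense matters in that domain.
lexical_items = [
    (0.40, True),   # frequent domain sense, translated correctly
    (0.35, False),  # frequent domain sense, missing from the lexicon
    (0.25, True),   # rarer sense, translated correctly
]
print(f"Lexical coverage: {weighted_coverage(lexical_items):.2f}")  # 0.65

The same score can be computed over grammatical structures, semantic test sentences, or pronoun-resolution cases simply by changing what the weighted items represent.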

References

  • Akiba, Y., E. Sumita, H. Nakaiwa, S. Yamamoto, H. Okuno (2002), 'Experimental Comparison of MT Evaluation Methods: RED versus BLEU', in Proceedings of the MT Summit IX, New Orleans, pp. 1-8.

  • Al-Jundi, F. (1997), ['Al-Mutarjim Al-Arabey: An Attempt to Understand English'], PC Magazine (Middle East), October, 40-44.

  • Andrewsky, A. (1978), 'Le problème de l'évaluation d'une traduction automatique' ['The problem of evaluating a machine translation'], CEC Memorandum, February 1978.

  • Anon. (1996), ['The Machine Translator: Al-Wafi'], Arabuter 8.71, 27-28.

  • Arab.Net Technology Ltd. (1996), Arabtrans User's Guide, Arab Press House, Simi Valley, California.

  • Arnold, D.J., R.L. Humphreys, L. Sadler (eds) (1993), 'Special Issue on Evaluation of MT Systems', Machine Translation, issues 1-2.

  • ATA (1997), Al-Mutarjim Al-Arabey User Manual, http://www.almisbar.com/salam_trans_a.html [accessed 11 February 2005].

  • Carroll, J., T. Briscoe (1998), 'A Survey of Parser Evaluation Methods', in Proceedings of the Workshop on the Evaluation of Parsing Systems, University of Sussex.

  • Chaumier, J., M.C. Mallen, G. van Slype (1977), 'Evaluation du système de traduction automatique SYSTRAN; Evaluation de la qualité de la traduction' [Evaluation of the SYSTRAN machine translation system; translation quality evaluation], CEC Report No. 4, Luxembourg.

  • Culy, C., S.Z. Riehemann (2003), 'The Limits of N-Gram Translation Evaluation Metrics', in Proceedings of the MT Summit IX, New Orleans, pp. 133-138.

  • Dyson, M.C., J. Hannah (1987), 'Towards a Methodology for the Evaluation of Machine-Assisted Translation Systems', Computers and Translation 2, 163-176.

  • Guessoum, A., R. Zantout (2000), 'Arabic Machine Translation: A Strategic Choice for the Arab World', KSU Computer and Information Sciences Journal 12, 117-144.

  • Guessoum, A., R. Zantout (2001a), 'A Methodology for a Semi-Automatic Evaluation of the Language Coverage of Machine Translation System Lexicons', Machine Translation 16, 127-149. https://doi.org/10.1023/A:1014504808954

  • Guessoum, A., R. Zantout (2001b), 'Semi-Automatic Evaluation of the Grammatical Coverage of Machine Translation Systems', in Proceedings of the MT Summit VIII, Santiago de Compostela, Spain, pp. 133-138.

  • Halliday, T.C., E.A. Briss (eds) (1977), The Evaluation and Systems Analysis of the Systran Machine Translation System, Rome Air Development Center, Griffiss Air Force Base, New York.

  • Hedberg, S. (1994), 'Machine Translation Comes of Age', AI Expert 9(10), 37.

  • Hovy, E., M. King, A. Popescu-Belis (2002), 'An Introduction to Machine Translation Evaluation', in Proceedings of the Workshop at the LREC 2002 Conference, Las Palmas, Spain, pp. 1-7.

  • Hutchins, W.J., H.L. Somers (1992), An Introduction to Machine Translation, Academic Press, London.

  • Jihad, A. (1996), ['Has the Arabic Machine Translation Era Started?'], Byte Middle East, November, 36-48.

  • Jurafsky, D., J.H. Martin (2000), Speech and Language Processing, Prentice Hall, Upper Saddle River, NJ.

  • Klein, J., S. Lehmann, K. Netter, T. Wegst (1998), 'DiET in the Context of MT Evaluation', in B. Schröder, W. Lenders, W. Hess, T. Portele (eds), Computer Linguistik und Phonetik zwischen Sprache und Sprechen [Computers, Linguistics, Phonetics between Language and Speech], Peter Lang, Bern, pp. 107-126.

  • King, M., K. Falkedal (1990), 'Using Test Suites in Evaluation of Machine Translation Systems', in COLING 1990, Proceedings of the 13th International Conference on Computational Linguistics, Helsinki, Vol. 2, pp. 211-216.

  • King, M., B. Maegaard, J. Schultz, L. des Tombe, A. Bech, A. Neville, A. Arppe, L. Balkan, C. Brace, H. Bunt, L. Carlson, S. Douglas, M. Höge, S. Krauwer, S. Manzi, C. Mazzi, A.J. Sielemann, R. Steenbakkers (1996), EAGLES - Evaluation of Natural Language Processing Systems, Final Report, EAG-EWG-PR.2, October 1996.

  • Lehrberger, J., L. Bourbeau (1988), Machine Translation: Linguistic Characteristics of MT Systems and General Methodology of Evaluation, John Benjamins, Amsterdam.

  • Mason, J., A. Rinsche (1995), Ovum Evaluates: Translation Technology Products, OVUM Ltd, London.

  • Melby, A.K. (1988), 'Lexical Transfer: Between a Source Rock and a Hard Target', in Proceedings of the 12th International Conference on Computational Linguistics (COLING), Budapest, pp. 411-419.

  • Mellish, C., R. Dale (1998), 'Evaluation in the Context of Natural Language Generation', Journal of Computer Speech and Language 12, 349-373. https://doi.org/10.1006/csla.1998.0106

  • Nagao, M. (1985), 'Evaluation of the Quality of Machine-Translated Sentences and the Control of Language', Journal of the Information Processing Society of Japan 26, 1197-1202.

  • Nyberg, E.H., T. Mitamura, J.G. Carbonell (1992), 'The KANT System: Fast, Accurate, High-Quality Translation in Practical Domains', in Proceedings of the fifteenth [sic] International Conference on Computational Linguistics, COLING '92, Nantes, France, pp. 1069-1073.

  • Nyberg, E.H., T. Mitamura, J.G. Carbonell (1994), 'Evaluation Metrics for Knowledge-Based Machine Translation', in COLING 1994, Proceedings of the 15th International Conference on Computational Linguistics, Kyoto, pp. 95-99.

  • Papineni, K., S. Roukos, T. Ward, W-J. Zhu (2002), 'BLEU: A Method for Automatic Evaluation of Machine Translation', in 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, pp. 311-318.

  • Qendelft, G. (1997), [The Translation Program Al-Wafi Is Useful for Getting a General Understanding of a Letter Written in English], Al-Hayat, 25 October 1997.

  • Sinaiko, H.W., G.R. Klare (1972), 'Further Experiments in Language Translation: Readability of Computer Translations', ITL 15, 1-29.

  • Sinaiko, H.W., G.R. Klare (1973), 'Further Experiments in Language Translation: A Second Evaluation of the Readability of Computer Translations', ITL 19, 29-52.

  • van Slype, G. (1979a), 'Systran: Evaluation of the 1978 Version of the Systran English-French Automatic System of the Commission of the European Communities', The Incorporated Linguist 18, 86-89.

  • van Slype, G. (1979b), Critical Study of Methods for Evaluating the Quality of Machine Translation (Final Report), prepared for the Commission of the European Communities, Bureau Marcel van Dyke, Brussels.

  • Vasconcellos, M. (ed.) (1988), Technology as Translation Strategy, State University of New York at Binghamton (SUNY), Binghamton, NY.

  • White, J., T. O'Connell, F. O'Mara (1994), 'The ARPA MT Evaluation Methodologies: Evolution, Lessons, and Future Approaches', in Technology Partnerships for Crossing the Language Barrier: Proceedings of the First Conference of the Association for Machine Translation in the Americas, Columbia, Maryland, pp. 193-205.

  • Wilks, Y. (1991), 'Systran: It Obviously Works, but How Much Can It Be Improved?', Computer Research Laboratory, New Mexico State University, Las Cruces.

Author information

Corresponding author

Correspondence to Ahmed Guessoum.

About this article

Cite this article

Guessoum, A., Zantout, R. A Methodology for Evaluating Arabic Machine Translation Systems. Mach Translat 18, 299–335 (2004). https://doi.org/10.1007/s10590-005-2412-3
