Low-Resource Machine Translation with Different Granularity Image Features

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15035)


Abstract

Visual content improves alignment in the language latent spaces, since physical visual perception is similar for people who speak different languages. Recent work has therefore proposed unsupervised multimodal machine translation (UMMT) methods for low-resource settings, which leverage images as pseudo-pivots to facilitate latent space alignment. However, these methods consider only region or grid image features on high-resource close language pairs (CLP), e.g., English-German (En-De) and English-French (En-Fr), and ignore the effect of applying more informative features to UMMT on low-resource distant language pairs (DLP), e.g., Chinese-Uyghur (Zh-Uy) and English-Uyghur (En-Uy). In this paper, we exploit a pre-trained language model and a UMMT model with image features of different granularity, and study the influence of these features on DLP and CLP translation. Experimental results on the CLP dataset Multi30K and the DLP dataset Multi30K-Zh-Uy show that the proposed approach significantly improves over state-of-the-art methods. The code is available at https://github.com/Turghuns/UMMT-DGIF.
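
The "granularity" in the title refers to how the image is encoded before it is fused with the text: grid features cover the whole image as a uniform map of CNN activations, while region features describe only the detected objects. As a rough illustration of how such features are typically obtained (this is not the authors' released code; the model choices and shapes here are assumptions based on common practice), the following Python sketch extracts grid-level features from a ResNet-50 feature map and object regions from a Faster R-CNN detector using torchvision:

    # Illustrative sketch only: grid vs. region image features for multimodal MT.
    # Assumptions: torchvision backbones, a 224x224 input, Multi30K-style images.
    import torch
    from torchvision.models import resnet50, ResNet50_Weights
    from torchvision.models.detection import (
        fasterrcnn_resnet50_fpn,
        FasterRCNN_ResNet50_FPN_Weights,
    )

    image = torch.rand(3, 224, 224)  # placeholder for a preprocessed image

    # Grid features: the final convolutional map of ResNet-50, flattened into a
    # 7x7 = 49 grid of 2048-dimensional vectors.
    cnn = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2).eval()
    backbone = torch.nn.Sequential(*list(cnn.children())[:-2])  # drop avgpool + fc
    with torch.no_grad():
        fmap = backbone(image.unsqueeze(0))       # (1, 2048, 7, 7)
    grid_feats = fmap.flatten(2).transpose(1, 2)  # (1, 49, 2048)

    # Region features: object proposals from Faster R-CNN. Only boxes and scores
    # are kept here; pooled per-region vectors would require hooking into
    # roi_heads, which is omitted for brevity.
    detector = fasterrcnn_resnet50_fpn(
        weights=FasterRCNN_ResNet50_FPN_Weights.DEFAULT
    ).eval()
    with torch.no_grad():
        detections = detector([image])[0]  # dict with 'boxes', 'labels', 'scores'
    region_boxes = detections["boxes"]     # (num_regions, 4)

    print(grid_feats.shape, region_boxes.shape)

Either representation can then be projected into the translation model's embedding space and attended to alongside the source tokens; which granularity helps more on close versus distant language pairs is the question the paper studies.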

Notes

  1. http://www.statmt.org/wmt18/multimodal-task.html
  2. https://nlp.stanford.edu/software/stanford-segmenter-2018-10-16.zip
  3. https://github.com/moses-smt/mosesdecoder
  4. https://github.com/glample/fastBPE
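
The tools in these notes are standard preprocessing components for Multi30K-style data: the Stanford segmenter for Chinese word segmentation, the Moses scripts for tokenization, and fastBPE for subword segmentation. A minimal, hypothetical sketch of the English-side pipeline is shown below; sacremoses stands in for the Moses perl scripts, and the BPE code/vocabulary file names are placeholders rather than files shipped with the paper:

    # Hypothetical preprocessing sketch; not the authors' exact pipeline.
    from sacremoses import MosesTokenizer  # Python port of the Moses tokenizer
    import fastBPE                          # Python bindings of glample/fastBPE

    tokenizer = MosesTokenizer(lang="en")
    bpe = fastBPE.fastBPE("codes.en", "vocab.en")  # placeholder BPE codes/vocab

    sentence = "A man is playing a guitar on the street."
    tokens = tokenizer.tokenize(sentence, return_str=True)  # Moses-style tokenization
    subwords = bpe.apply([tokens])                           # subword segmentation
    print(subwords[0])

Chinese sentences would first be segmented with the Stanford segmenter (note 2) before the same BPE step is applied.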

Acknowledgement

This work is partially supported by NSFC, China (No. 62276196).

Author information

Corresponding author

Correspondence to Lin Li.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Tayir, T., Li, L., Maimaiti, M., Muhtar, Y. (2025). Low-Resource Machine Translation with Different Granularity Image Features. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15035. Springer, Singapore. https://doi.org/10.1007/978-981-97-8620-6_18

  • DOI: https://doi.org/10.1007/978-981-97-8620-6_18

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-8619-0

  • Online ISBN: 978-981-97-8620-6

  • eBook Packages: Computer Science, Computer Science (R0)
