Neural Machine Translation Based on Attention Convolution

Computer Science, 2018, Vol. 45, Issue (11): 226-230. doi: 10.11896/j.issn.1002-137X.2018.11.035

• Artificial Intelligence •

WANG Qi, DUAN Xiang-yu   

  1. (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
  • Received: 2018-04-18 Published: 2019-02-25

Abstract: The attention mechanism commonly used in existing neural machine translation operates at the word level. By building a multi-layer convolutional structure on top of the attention mechanism, this paper extends attention from the word level to the phrase level. After the convolution operation, the attention information reflects phrase structure more clearly and is used to generate new context vectors, which are then integrated into the neural machine translation framework. Experimental results on large-scale Chinese-to-English tasks show that neural machine translation based on attention convolution can effectively capture the phrasal information in sentences, strengthen the context dependencies of translated words, optimize the context vectors, and improve translation quality.
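
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which quantity is convolved, so the sketch assumes that the word-level attention weights themselves are smoothed by stacked 1D convolutions and then renormalized. The function names, the kernel values, and the toy inputs are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d_same(values, kernel):
    """1D convolution with zero padding; output has the same length as input.

    Assumes an odd-length kernel so the window is centered on each position.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(values) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(values))
    ]


def phrase_level_context(scores, annotations, kernels):
    """Turn word-level attention into a phrase-level context vector.

    scores:      one alignment score per source word
    annotations: one encoder vector (list of floats) per source word
    kernels:     one non-negative kernel per convolution layer (assumed values)
    """
    alpha = softmax(scores)              # word-level attention weights
    for kernel in kernels:               # multi-layer convolution (assumption)
        alpha = conv1d_same(alpha, kernel)
    total = sum(alpha)                   # renormalize so weights sum to 1
    alpha = [a / total for a in alpha]
    dim = len(annotations[0])
    context = [
        sum(alpha[i] * annotations[i][d] for i in range(len(annotations)))
        for d in range(dim)
    ]
    return alpha, context


# Toy example: four source words, two-dimensional encoder annotations.
scores = [0.1, 2.0, 0.3, 0.2]
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
kernels = [[0.25, 0.5, 0.25], [0.25, 0.5, 0.25]]
alpha, context = phrase_level_context(scores, annotations, kernels)
print(alpha, context)
```

In this sketch each convolution layer spreads part of a word's attention weight onto its neighbors, so the resulting context vector draws on a span of adjacent source words rather than a single one. In a trained model the kernels would be learned rather than fixed.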

Key words: Attention mechanism, Multi-layer convolutional structure, Neural machine translation, Phrase-based level

CLC Number: TP391