Abstract
Modeling the syntactic information of sentences is essential for neural response generation models to produce appropriate responses of high linguistic quality. However, no previous work on conversational response generation with sequence-to-sequence (Seq2Seq) neural network models has been reported to take sentence syntactic information into account. In this paper, we present two part-of-speech (POS) enhanced models that incorporate POS information into the Seq2Seq neural conversation model. When training these models, the corresponding POS tag is attached to each word in the post and the response so that the word sequences and the POS tag sequences are interrelated. When a word in the response is to be generated, it is constrained by the expected POS tag. The experimental results show that the POS-enhanced Seq2Seq models generate more grammatically correct and appropriate responses than the word-level Seq2Seq model in terms of both perplexity and BLEU.
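To make the idea concrete, the sketch below shows one plausible way a decoder step could couple word and POS prediction as the abstract describes. This is a minimal illustration, not the authors' released code: the module structure, the single-tag lexicon `word2tag`, and the hard masking scheme are all assumptions. The step embeds the previous word and its POS tag jointly, first predicts the expected POS tag of the next word, then restricts the word distribution to vocabulary entries compatible with that tag.

```python
import torch
import torch.nn as nn

class POSEnhancedDecoderStep(nn.Module):
    """One decoder step of a POS-enhanced Seq2Seq model (illustrative sketch).

    Assumptions, not taken from the paper: word and POS embeddings are
    concatenated as the GRU input, the next POS tag is predicted first,
    and word logits are masked so only words whose lexicon tag matches
    the predicted tag can be generated.
    """

    def __init__(self, vocab_size, num_tags, emb_dim, hidden_dim, word2tag):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.pos_emb = nn.Embedding(num_tags, emb_dim)
        self.gru = nn.GRUCell(2 * emb_dim, hidden_dim)
        self.tag_out = nn.Linear(hidden_dim, num_tags)
        self.word_out = nn.Linear(hidden_dim, vocab_size)
        # word2tag: LongTensor[vocab_size] mapping each word to one POS tag
        # (a hypothetical lexicon; real taggers allow several tags per word).
        self.register_buffer("word2tag", word2tag)

    def forward(self, prev_word, prev_tag, hidden):
        # Interrelate the word and POS sequences by feeding both embeddings.
        x = torch.cat([self.word_emb(prev_word), self.pos_emb(prev_tag)], dim=-1)
        hidden = self.gru(x, hidden)
        tag_logits = self.tag_out(hidden)       # expected POS of the next word
        next_tag = tag_logits.argmax(dim=-1)
        word_logits = self.word_out(hidden)
        # Constrain generation: forbid words whose lexicon tag differs
        # from the expected tag.
        mask = self.word2tag.unsqueeze(0) != next_tag.unsqueeze(1)
        word_logits = word_logits.masked_fill(mask, float("-inf"))
        return word_logits, tag_logits, next_tag, hidden

# Example with hypothetical sizes: batch of 2, vocabulary of 10, 4 POS tags.
if __name__ == "__main__":
    word2tag = torch.randint(0, 4, (10,))
    step = POSEnhancedDecoderStep(vocab_size=10, num_tags=4,
                                  emb_dim=8, hidden_dim=16, word2tag=word2tag)
    h = torch.zeros(2, 16)
    w = torch.tensor([1, 2])
    t = word2tag[w]
    word_logits, tag_logits, next_tag, h = step(w, t, h)
```

At training time, under these assumptions, both `tag_logits` and `word_logits` would be supervised with the POS tags and words attached to the response; at inference, the masking is what enforces the expected POS tag on each generated word.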
Acknowledgments
The work described in this paper was supported by the National Natural Science Foundation of China (61272291 and 61672445) and The Hong Kong Polytechnic University (G-YBP6, 4-BCB5, and B-Q46C).
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Luo, C., Li, W., Chen, Q., He, Y. (2017). A Part-of-Speech Enhanced Neural Conversation Model. In: Jose, J., et al. (eds.) Advances in Information Retrieval. ECIR 2017. Lecture Notes in Computer Science, vol. 10193. Springer, Cham. https://doi.org/10.1007/978-3-319-56608-5_14