Abstract
In Automatic Speech Recognition applications, Natural Language Processing (NLP) involves two sub-tasks: predicting the intent of the user's utterance and filling its slots. Prior work on these tasks has relied on Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and attention-based models. However, these approaches either train separate, independent models for intent and slot prediction or use sequence-to-sequence networks, and may therefore not fully exploit the relationship between the two learning tasks. We propose a unified parallel architecture in which a CNN is used for intent prediction and a bidirectional LSTM is used for slot prediction. A cross-fusion technique establishes the relationship between intent and slot learning, and slot masking is applied on top of cross fusion. Our models surpass existing state-of-the-art results for both intent and slot prediction on two open datasets.
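The following is a minimal, illustrative sketch of the parallel architecture described above: a CNN branch for intent classification, a bidirectional LSTM branch for slot tagging, a simple cross-fusion step mixing the two branches, and an optional slot mask. The exact fusion and masking mechanisms of the paper are not specified in the abstract, so the particular projections, pooling, and hyperparameters below are assumptions for illustration only.

```python
# Hypothetical sketch of a parallel intent/slot model with cross fusion
# and slot masking; not the authors' implementation.
import torch
import torch.nn as nn


class ParallelIntentSlotModel(nn.Module):
    def __init__(self, vocab_size, num_intents, num_slots,
                 emb_dim=100, conv_channels=128, lstm_hidden=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)

        # CNN branch for intent prediction (1-D convolution over tokens).
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel_size=3, padding=1)

        # BiLSTM branch for slot prediction.
        self.lstm = nn.LSTM(emb_dim, lstm_hidden, batch_first=True,
                            bidirectional=True)

        # Cross fusion: project each branch's features into the other branch
        # (one possible realisation of "cross fusion"; an assumption here).
        self.slot_to_intent = nn.Linear(2 * lstm_hidden, conv_channels)
        self.intent_to_slot = nn.Linear(conv_channels, 2 * lstm_hidden)

        self.intent_head = nn.Linear(conv_channels, num_intents)
        self.slot_head = nn.Linear(2 * lstm_hidden, num_slots)

    def forward(self, token_ids, slot_mask=None):
        emb = self.embedding(token_ids)                           # (B, T, E)

        conv_feats = torch.relu(self.conv(emb.transpose(1, 2)))   # (B, C, T)
        lstm_feats, _ = self.lstm(emb)                            # (B, T, 2H)

        # Cross fusion: pooled slot features inform the intent representation,
        # and the intent representation is broadcast back to every time step
        # of the slot representation.
        intent_repr = conv_feats.max(dim=2).values                # (B, C)
        pooled_slots = lstm_feats.mean(dim=1)                     # (B, 2H)
        fused_intent = intent_repr + self.slot_to_intent(pooled_slots)
        fused_slots = lstm_feats + self.intent_to_slot(intent_repr).unsqueeze(1)

        intent_logits = self.intent_head(fused_intent)            # (B, num_intents)
        slot_logits = self.slot_head(fused_slots)                 # (B, T, num_slots)

        # Slot masking: suppress slot predictions at masked (e.g. padded)
        # positions; the paper's masking scheme may differ.
        if slot_mask is not None:
            slot_logits = slot_logits.masked_fill(~slot_mask.unsqueeze(-1), -1e9)
        return intent_logits, slot_logits
```

Because the two branches run in parallel over the same embeddings, intent and slot losses can be summed and optimised jointly, which is the property the cross-fusion step is meant to exploit.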