Abstract
Task-oriented dialogue systems rely on dialogue state tracking (DST) to follow users' intentions over the course of a conversation. Recent DST studies have achieved good performance, but the great majority of them ignore slot correlations and predict the value of each slot separately. In this work, we propose an efficient slot correlation learning network that captures the correlations among slots. Specifically, a BERT-base-uncased encoder first encodes the dialogue context, the slot names and their corresponding values. Second, a cross multi-head attention module computes and fuses attention among the dialogue context embedding, the slot name embeddings and the corresponding value embeddings, extracting the slot-specific features of every slot and passing them to the downstream components. Finally, a transformer encoder module captures the correlations among slots. Experimental results on the MultiWOZ 2.0, MultiWOZ 2.1 and MultiWOZ 2.4 datasets demonstrate the effectiveness of our approach, which reaches 55.14%, 57.22% and 76.93% joint goal accuracy, respectively, achieving new state-of-the-art performance.
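The results above are reported as joint goal accuracy, which counts a dialogue turn as correct only when every slot value in the predicted state matches the reference state. The following is a minimal sketch of that metric, not the evaluation script used in this work. The function name, the slot names and the convention that unfilled slots are simply omitted from both states are illustrative assumptions.

```python
from typing import Dict, List

# A dialogue state maps slot names to values, e.g. {"hotel-area": "north"}.
# Assumption: unfilled slots are omitted from both predicted and reference states.
DialogueState = Dict[str, str]


def joint_goal_accuracy(
    predictions: List[DialogueState], references: List[DialogueState]
) -> float:
    """Fraction of turns whose predicted state equals the reference state exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must cover the same turns")
    if not references:
        return 0.0
    # A turn counts only if all slots match; one wrong slot fails the whole turn.
    correct = sum(1 for pred, ref in zip(predictions, references) if pred == ref)
    return correct / len(references)


# Hypothetical four-turn example: turns 2 and 4 each contain one wrong or missing slot.
references = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
    {"restaurant-food": "thai"},
    {"restaurant-food": "thai", "restaurant-area": "centre"},
]
predictions = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},
    {"restaurant-food": "thai"},
    {"restaurant-food": "thai"},
]
score = joint_goal_accuracy(predictions, references)
print(f"joint goal accuracy: {score:.2f}")
```

Because a single wrong slot invalidates the whole turn, the metric is strict, which is one reason modelling correlated slots jointly rather than independently can matter.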






Availability of data and materials
The MultiWOZ 2.0 dataset analyzed during the current study is available at https://www.repository.cam.ac.uk/bitstream/handle/1810/280608/MULTIWOZ2.zip?sequence=3&isAllowed=y, the MultiWOZ 2.1 dataset is available at https://www.repository.cam.ac.uk/bitstream/handle/1810/294507/MULTIWOZ2.1.zip?sequence=1&isAllowed=y, and the MultiWOZ 2.4 dataset is available at https://github.com/smartyfh/MultiWOZ2.4/blob/main/data/MULTIWOZ2.4.zip.
Acknowledgements
We would like to thank the anonymous reviewers for their useful feedback. This work is supported by three projects: the National Key Research and Development Program of China (No. 2018AAA0102100), the National Natural Science Foundation of China (No. 61976212) and the Hainan Provincial Natural Science Foundation of China (No. 621MS019).
Funding
Not applicable.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Qianyu Li. The first draft of the manuscript was written by Qianyu Li. Wensheng Zhang and Mengxing Huang contributed to the writing, editing, supervision, and funding acquisition. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Ethical Approval
Not applicable.
About this article
Cite this article
Li, Q., Zhang, W. & Huang, M. Efficient slot correlation learning network for multi-domain dialogue state tracking. J Supercomput 79, 18547–18568 (2023). https://doi.org/10.1007/s11227-023-05217-z
DOI: https://doi.org/10.1007/s11227-023-05217-z