Abstract
User satisfaction is often considered the objective that spoken dialogue systems should achieve. This is why the reward function of Spoken Dialogue Systems (SDS) trained by Reinforcement Learning (RL) is often designed to reflect user satisfaction. To this end, the state space representation should be based on features capturing characteristics of user satisfaction, such as the mean speech recognition confidence score. On the other hand, deployment in industrial systems requires state representations that are understandable by system engineers. In this article, we propose to represent the state space with a Genetic Sparse Distributed Memory: a state aggregation method that computes state prototypes selected so as to yield the best linear representation of the value function in RL. To do so, previous work on Genetic Sparse Distributed Memory for classification is adapted to the RL task and a new way of building the prototypes is proposed. The approach is tested on a corpus of dialogues collected with an appointment scheduling system, and the results are compared to a grid-based linear parametrisation. It is shown that learning is accelerated and made more memory-efficient. It is also shown that the framework is scalable: it is possible to include many dialogue features in the representation, interpret the resulting policy, and identify the most important dialogue features.
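The abstract's core mechanism can be illustrated with a minimal sketch (not the authors' implementation; all names, dimensions, and the radius-based activation rule are illustrative assumptions): a Sparse Distributed Memory maps a continuous dialogue state onto a sparse binary feature vector by activating the prototypes lying within a fixed radius, and the value function is then linear in that activation vector. A genetic algorithm would search over candidate prototype sets, scoring each by how well a linear model over its activations fits observed returns.

```python
import numpy as np

rng = np.random.default_rng(0)

class SparseDistributedMemory:
    """Radius-based SDM sketch: states activate nearby prototypes."""

    def __init__(self, prototypes, radius):
        self.prototypes = np.asarray(prototypes)  # (n_prototypes, state_dim)
        self.radius = radius

    def activate(self, state):
        # Binary activation: 1 for every prototype within `radius` of the state.
        dists = np.linalg.norm(self.prototypes - state, axis=1)
        return (dists <= self.radius).astype(float)

def fitness(prototypes, states, returns, radius):
    # GA objective sketch: how well a candidate prototype set lets a linear
    # model fit observed returns (higher, i.e. closer to 0, is fitter).
    sdm = SparseDistributedMemory(prototypes, radius)
    Phi = np.array([sdm.activate(s) for s in states])
    w, *_ = np.linalg.lstsq(Phi, returns, rcond=None)
    return -np.linalg.norm(Phi @ w - returns)

# Hypothetical 2-D dialogue features, e.g. (mean ASR confidence, dialogue length).
prototypes = rng.uniform(0.0, 1.0, size=(50, 2))
sdm = SparseDistributedMemory(prototypes, radius=0.3)

weights = rng.normal(0.0, 0.1, size=50)  # linear value-function parameters
state = np.array([0.8, 0.2])
phi = sdm.activate(state)                # sparse binary feature vector
value = float(weights @ phi)             # V(s) ~ w . phi(s)
```

Because each prototype corresponds to an interpretable region of the dialogue-feature space, the learned weights can be read off to identify which dialogue situations drive the policy, which is the interpretability property the abstract emphasises.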
© 2017 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Asri, L.E., Laroche, R., Pietquin, O. (2017). Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory. In: Jokinen, K., Wilcock, G. (eds) Dialogues with Social Robots. Lecture Notes in Electrical Engineering, vol 427. Springer, Singapore. https://doi.org/10.1007/978-981-10-2585-3_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2584-6
Online ISBN: 978-981-10-2585-3
eBook Packages: Engineering (R0)