On a Computational Model for Language Acquisition: Modeling Cross-Speaker Generalisation

ten Bosch, Louis; Driesen, Joris; Van hamme, Hugo; Boves, Lou

doi:10.1007/978-3-642-04208-9_44


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5729)

Included in the following conference series:

International Conference on Text, Speech and Dialogue


Abstract

The discovery of words by young infants involves two interrelated processes: (a) the detection of recurrent word-like acoustic patterns in the speech signal, and (b) cross-modal association between auditory and visual information. This paper describes experimental results obtained by a computational model that simulates these two processes. The model is able to build word-like representations on the basis of multimodal input data (stimuli) without the help of an a priori specified lexicon. Each input stimulus consists of a speech signal accompanied by an abstract visual representation of the concepts referred to in the speech signal. In this paper we investigate how internal representations generalize across speakers. In doing so, we also analyze the cognitive plausibility of the model.



Author information

Authors and Affiliations

Dept Language and Speech, Radboud University Nijmegen, NL, ESAT, Katholieke Universiteit Leuven, Belgium
Louis ten Bosch, Joris Driesen, Hugo Van hamme & Lou Boves


Editor information

Editors and Affiliations

University of West Bohemia at Pilsen, Czech Republic
Václav Matoušek
Department of Computer Science, University of West Bohemia in Pilsen, Univerzitni 8, 30614, Plzen, Czech Republic
Pavel Mautner


About this paper

Cite this paper

ten Bosch, L., Driesen, J., Van hamme, H., Boves, L. (2009). On a Computational Model for Language Acquisition: Modeling Cross-Speaker Generalisation. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2009. Lecture Notes in Computer Science, vol 5729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04208-9_44


DOI: https://doi.org/10.1007/978-3-642-04208-9_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04207-2
Online ISBN: 978-3-642-04208-9
eBook Packages: Computer Science (R0)
