Abstract
The discovery of words by young infants involves two interrelated processes: (a) the detection of recurrent word-like acoustic patterns in the speech signal, and (b) cross-modal association between auditory and visual information. This paper describes experimental results obtained by a computational model that simulates these two processes. The model is able to build word-like representations on the basis of multimodal input data (stimuli) without the help of an a priori specified lexicon. Each input stimulus consists of a speech signal accompanied by an abstract visual representation of the concepts referred to in the speech signal. In this paper we investigate how internal representations generalize across speakers. In doing so, we also analyze the cognitive plausibility of the model.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
ten Bosch, L., Driesen, J., Van hamme, H., Boves, L. (2009). On a Computational Model for Language Acquisition: Modeling Cross-Speaker Generalisation. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2009. Lecture Notes in Computer Science, vol 5729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04208-9_44
DOI: https://doi.org/10.1007/978-3-642-04208-9_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04207-2
Online ISBN: 978-3-642-04208-9
eBook Packages: Computer Science, Computer Science (R0)