Abstract
It has been shown that "visual numerosity emerges as a statistical property of images in 'deep networks' that learn a hierarchical generative model of the sensory input" through unsupervised deep learning [1]. The original deep generative model was based on stochastic neurons and, more importantly, on input (image) reconstruction. Statistical analysis highlighted a correlation between the numerosity present in the input and the population activity of some neurons in the second hidden layer of the network, whereas the population activity of neurons in the first hidden layer correlated with the total area (i.e., number of pixels) of the objects in the image. Here we further investigate whether numerosity information can be isolated as a disentangled factor of variation of the visual input. We train, in unsupervised and semi-supervised fashion, a latent-space generative model that has been shown capable of disentangling relevant semantic features in a variety of complex datasets, and we test its generative performance under different conditions. We then propose an approach to the problem based on the assumption that, in order to let numerosity emerge as a disentangled factor of variation, we need to cancel out the sources of variation at the graphical level.
Notes
- 1.
It is worth clarifying that, for all three models considered, we apply a weighting hyper-parameter to each component of the cost functions shown in the equations (thus, not only to the information-based regularizer of the InfoGAN model), and we empirically investigate the effect of changing these weights.
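The weighting scheme described in this note can be sketched as follows. This is a minimal illustration, not the authors' code: the component names (`adv_loss`, `info_loss`, `cls_loss`) and the weight names (`lambda_adv`, `lambda_info`, `lambda_cls`) are hypothetical, standing in for the adversarial term, the mutual-information regularizer, and a supervised classification term.

```python
def weighted_loss(adv_loss, info_loss, cls_loss,
                  lambda_adv=1.0, lambda_info=1.0, lambda_cls=0.0):
    """Weighted sum of the cost-function components.

    In the original InfoGAN formulation only the mutual-information
    regularizer carries a weight; here every component gets its own
    hyper-parameter, so their relative influence can be studied
    empirically by varying the lambdas.
    """
    return (lambda_adv * adv_loss
            + lambda_info * info_loss
            + lambda_cls * cls_loss)
```

With `lambda_cls=0.0` the setup reduces to a purely unsupervised objective, while a positive `lambda_cls` yields the semi-supervised variant.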
- 2.
In our first setup for investigating this model we used the labels themselves as the second dataset, feeding one line of the model with the labels and the other line with images. Note, however, that this approach can be extended to a setup that does not use labels at all; we leave this for future developments.
- 3.
Even when the Categorical dimensionality is somehow compatible with the numerosity being analyzed, for example with numerosity 5 and Categorical dimension 5, or Categorical dimension 10 to account for 8 quantities and 2 possible graphical expressions, w/b or b/w.
References
Stoianov, I., Zorzi, M.: Emergence of a 'visual number sense' in hierarchical generative models. Nat. Neurosci. 15(2), 194–196 (2012)
Feigenson, L., Dehaene, S., Spelke, E.: Core systems of number. Trends Cogn. Sci. 8(7), 307–314 (2004)
Zorzi, M., Testolin, A.: An emergentist perspective on the origin of number sense. Philos. Trans. Royal Soc. B Biol. Sci. 373(1740) (2018)
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (2016), arXiv:1606.03657
Wu, X., Zhang, X., Shu, X.: Cognitive Deficit of Deep Learning in Numerosity (2018), arXiv:1802.05160
Chen, S.Y., Zhou, Z., Fang, M., McClelland, J.L.: Can Generic Neural Networks Estimate Numerosity Like Humans? (2014)
Locatello, F., Bauer, S., Lucic, M., Gelly, S., Schölkopf, B., Bachem, O.: Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations (2018), arXiv:1811.12359
Zhao, S., Ren, H., Yuan, A., Song, J., Goodman, N., Ermon, S.: Bias and Generalization in Deep Generative Models: An Empirical Study (2018), arXiv:1811.03259
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
Goodfellow, I., et al.: Generative Adversarial Networks (2014), arXiv:1406.2661
Springenberg, J.: Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks (2015), arXiv:1511.06390
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved Techniques for Training GANs (2016), arXiv:1606.03498
Barratt, S., Sharma, R.: A Note on the Inception Score (2018), arXiv:1801.01973
Katrina E., Drozdov, A.: Understanding Mutual Information and its Use in InfoGAN (2016)
Hill, F., Santoro, A., Barrett, D., Morcos, A., Lillicrap, T.: Learning to make analogies by contrasting abstract relational structure (2019), arXiv:1902.00120
Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: ICCV (2015)
https://github.com/lukedeo/keras-acgan/blob/master/acgan-analysis.ipynb
© 2019 Springer Nature Switzerland AG
Cite this paper
Zanetti, A., Testolin, A., Zorzi, M., Wawrzynski, P. (2019). Numerosity Representation in InfoGAN: An Empirical Study. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2019. Lecture Notes in Computer Science(), vol 11507. Springer, Cham. https://doi.org/10.1007/978-3-030-20518-8_5
Print ISBN: 978-3-030-20517-1
Online ISBN: 978-3-030-20518-8