Abstract
Symmetry, a central concept in understanding the laws of nature, has been used for centuries in physics, mathematics, and chemistry to help make mathematical models tractable. Yet despite its power, symmetry was not used extensively in machine learning until rather recently. In this article we show a general way to incorporate symmetries into machine learning models. We demonstrate it with a detailed analysis of a rather simple real-world machine learning system: a neural network for classifying handwritten digits whose neurons lack bias terms. We show that ignoring symmetries can lead to severe over-fitting, and that incorporating symmetry into the model reduces over-fitting while also reducing model complexity, so that the model ultimately requires less training data and less time and fewer resources to train.
Notes
1. Bag-of-words methods represent text by the set of words or phrases used in it.
2. A feature map is a function that maps an input data vector into a vector space whose elements are consumed directly by a machine learning model. For instance, starting with inputs \(x_{1,2}\), a feature map could generate the product \(x_1 x_2\).
3. The original repository for this dataset [13] is http://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits.
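The feature map described in note 2 can be illustrated with a minimal sketch. The function name `product_feature_map` and the choice to keep the original inputs alongside the product are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np


def product_feature_map(x):
    """Hypothetical feature map: (x1, x2) -> (x1, x2, x1 * x2).

    Keeping x1 and x2 next to their product is an illustrative choice,
    not something the paper specifies.
    """
    x = np.asarray(x, dtype=float)
    x1, x2 = x[..., 0], x[..., 1]
    # Stack along the last axis so single vectors and batches both work.
    return np.stack([x1, x2, x1 * x2], axis=-1)


# A single input vector with x1 = 2 and x2 = 3.
features = product_feature_map([2.0, 3.0])

# A batch of two input vectors.
batch_features = product_feature_map([[1.0, 2.0], [3.0, 4.0]])
```

The output vector is what a downstream model would consume in place of the raw inputs.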
References
LeCun, Y., et al.: Connectionism in Perspective, pp. 143–155 (1989)
LeCun, Y., Haffner, P., Bottou, L., Bengio, Y.: Shape, Contour and Grouping in Computer Vision, pp. 319–345. Springer, Heidelberg (1999)
LeCun, Y., Boser, B.E., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W.E., Jackel, L.D.: Advances in Neural Information Processing Systems, pp. 396–404 (1990)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Proc. IEEE 86(11), 2278 (1998)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Lee, H., Grosse, R., Ranganath, R., Ng, A.Y.: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 609–616. ACM (2009)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
Gens, R., Domingos, P.M.: Advances in Neural Information Processing Systems, pp. 2537–2545 (2014)
Dieleman, S., De Fauw, J., Kavukcuoglu, K.: arXiv preprint arXiv:1602.02660 (2016)
Cohen, T., Welling, M.: International Conference on Machine Learning, pp. 2990–2999 (2016)
Henriques, J.F., Vedaldi, A.: arXiv preprint arXiv:1609.04382 (2016)
Gens, R., Domingos, P.M.: Deep symmetry networks. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 27, pp. 2537–2545. Curran Associates, Inc. (2014). http://papers.nips.cc/paper/5424-deep-symmetry-networks.pdf
Dheeru, D., Taniskidou, E.K.: UCI machine learning repository (2017). http://archive.ics.uci.edu/ml
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: J. Mach. Learn. Res. 12, 2825 (2011)
Hornik, K., Stinchcombe, M., White, H.: Neural Netw. 2(5), 359 (1989)
Mehta, P., Schwab, D.J.: An exact mapping between the variational renormalization group and deep learning. arXiv preprint arXiv:1410.3831 (2014)
Novikov, D.S., Veraart, J., Jelescu, I.O., Fieremans, E.: NeuroImage 174, 518 (2018)
Nambu, Y.: Phys. Rev. 117(3), 648 (1960)
Goldstone, J.: Il Nuovo Cimento (1955–1965) 19(1), 154 (1961)
Goldstone, J., Salam, A., Weinberg, S.: Phys. Rev. 127(3), 965 (1962)
Shwartz-Ziv, R., Tishby, N.: arXiv preprint arXiv:1703.00810 (2017)
Kazhdan, M., Funkhouser, T., Rusinkiewicz, S.: Symposium on Geometry Processing, vol. 6, pp. 156–164 (2003)
Acknowledgments
The author would like to thank Miles Stoudenmire, Daniel Malinow, David J. Bergman, and Dmitry S. Novikov for useful feedback on the ideas presented in this manuscript. In particular, discussions with Dmitry S. Novikov inspired the exploration of the fitting-parameter degeneracy that occurs when symmetry is not enforced on a fitting model. The author would also like to thank the UCI machine learning repository (http://archive.ics.uci.edu/ml/index.php) for making the dataset used in this work available.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Bergman, D.L. (2020). Symmetry Constrained Machine Learning. In: Bi, Y., Bhatia, R., Kapoor, S. (eds) Intelligent Systems and Applications. IntelliSys 2019. Advances in Intelligent Systems and Computing, vol 1038. Springer, Cham. https://doi.org/10.1007/978-3-030-29513-4_37
DOI: https://doi.org/10.1007/978-3-030-29513-4_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29512-7
Online ISBN: 978-3-030-29513-4
eBook Packages: Intelligent Technologies and Robotics (R0)