Abstract
Statistical asymptotic theory is often used in theoretical results in computational and statistical learning theory. It describes the limiting distribution of the maximum likelihood estimator (MLE) as a normal distribution. However, in layered models such as neural networks, the regularity conditions of the asymptotic theory are not necessarily satisfied: the true parameter is not identifiable if the target function can be realized by a network smaller than the model. Little is known about the behavior of the MLE in these cases for neural networks. In this paper, we analyze the expectation of the generalization error of three-layer linear neural networks and elucidate a strange behavior in unidentifiable cases. We show that the expectation of the generalization error in the unidentifiable cases is larger than what the usual asymptotic theory gives, and that it depends on the rank of the target function.
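The setting can be made concrete with a small simulation. The following Python sketch (not from the paper; the dimensions, the SVD-truncation shortcut for the rank-constrained fit, and the naive parameter-count baseline are all illustrative assumptions) draws data from a linear map of rank r, fits a three-layer linear network of hidden width h > r, and estimates the expected generalization error over repeated trials, so that it can be compared against a naive asymptotic baseline.

# Minimal simulation sketch (illustrative, not the paper's method): estimate the
# expected generalization error of a three-layer linear network y = B A x + noise
# fitted by maximum likelihood, when the true map has rank r < hidden width h
# (the unidentifiable case).
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, d_in=10, d_out=10, h=5, r=1, sigma=1.0, n_test=20000):
    # True map of rank r <= h; the model is over-parameterized when r < h.
    C_true = rng.normal(size=(d_out, r)) @ rng.normal(size=(r, d_in))
    X = rng.normal(size=(n, d_in))
    Y = X @ C_true.T + sigma * rng.normal(size=(n, d_out))

    # With (near-)isotropic inputs, truncating the SVD of the OLS estimate to
    # rank h approximates the rank-constrained MLE (reduced-rank regression).
    C_ols = np.linalg.lstsq(X, Y, rcond=None)[0].T          # shape (d_out, d_in)
    U, s, Vt = np.linalg.svd(C_ols, full_matrices=False)
    C_hat = U[:, :h] @ np.diag(s[:h]) @ Vt[:h, :]

    # Generalization error: mean squared error of the fitted map on fresh inputs,
    # in excess of the noise level.
    X_test = rng.normal(size=(n_test, d_in))
    return np.mean(np.sum((X_test @ (C_hat - C_true).T) ** 2, axis=1))

n = 2000
errors = [simulate(n, r=1) for _ in range(50)]      # unidentifiable: r < h
# Naive baseline from a regular-case parameter count, sigma^2 * h(d_in + d_out - h) / n;
# the paper's analysis concerns how the true expectation deviates from such a value.
baseline = 1.0**2 * 5 * (10 + 10 - 5) / n
print(f"simulated E[gen. error] = {np.mean(errors):.4f}, naive asymptotic value = {baseline:.4f}")

Varying r in the call to simulate while keeping h fixed gives a rough sense of how the averaged error changes with the rank of the target map.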
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fukumizu, K. (1999). Generalization Error of Linear Neural Networks in Unidentifiable Cases. In: Watanabe, O., Yokomori, T. (eds) Algorithmic Learning Theory. ALT 1999. Lecture Notes in Computer Science, vol 1720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46769-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66748-3
Online ISBN: 978-3-540-46769-4