Abstract
A preconditioned gradient scheme is presented for the regularized minimization problem that arises in the approximation of given data by a shallow neural network. The preconditioner is constructed from random normal projections and is tailored to the specific structure of the regularized problem.
The convergence of the preconditioned gradient method is investigated numerically for a synthetic problem with a known local minimizer. The method is also applied to real problems from the Proben1 benchmark set.
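The abstract describes the construction only at this level of detail. As a rough illustration of the general idea, the sketch below builds a right preconditioner from a random normal projection of the (regularized) Jacobian, in the spirit of LSRN (Meng, Saunders, and Mahoney, referenced below). All names, shapes, and the sketch size s are illustrative assumptions, not the authors' exact construction.

```python
import numpy as np

def lsrn_style_preconditioner(J, lam, s, rng=np.random.default_rng(0)):
    """Right preconditioner for the regularized least-squares model
        min_d ||J d - r||^2 + lam * ||d||^2,
    built from a Gaussian sketch of the stacked matrix [J; sqrt(lam) I]
    (LSRN-style; a hypothetical stand-in for the paper's construction).
    """
    m, n = J.shape
    A = np.vstack([J, np.sqrt(lam) * np.eye(n)])  # (m + n) x n regularized system
    G = rng.standard_normal((s, m + n))           # random normal projection, s >= n
    _, S, Vt = np.linalg.svd(G @ A, full_matrices=False)
    return Vt.T / S                               # N = V * Sigma^{-1}, an n x n matrix

def preconditioned_gradient_step(w, grad, N, lr):
    """One preconditioned gradient step: w <- w - lr * N N^T grad."""
    return w - lr * (N @ (N.T @ grad))
```

In this reading, J would be the Jacobian of the shallow network's output with respect to its weights at the current iterate and grad the gradient of the regularized objective; randomized-sketching theory (Halko, Martinsson, and Tropp, referenced below) suggests that s need only be a small multiple of the parameter count n for the sketch to capture the range of the regularized system with high probability.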
References
Broyden, C.G., Dennis, J.E., Jr., Moré, J.J.: On the local and superlinear convergence of quasi-Newton methods. IMA J. Appl. Math. 12(3), 223–245 (1973)
Crane, R., Roosta, F.: Invexifying regularization of non-linear least-squares problems. arXiv preprint arXiv:2111.11027 (2021)
Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12(7), 2121–2159 (2011)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
Gorbunov, E., Hanzely, F., Richtárik, P.: A unified theory of SGD: variance reduction, sampling, quantization and coordinate descent. In: International Conference on Artificial Intelligence and Statistics, pp. 680–690. PMLR (2020)
Halko, N., Martinsson, P.G., Tropp, J.A.: Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53(2), 217–288 (2011)
Hanke-Bourgeois, M.: Grundlagen der numerischen Mathematik und des wissenschaftlichen Rechnens, 3rd edn. Vieweg + Teubner, Wiesbaden (2009). https://doi.org/10.1007/978-3-8351-9020-7
Herman, G.T., Lent, A., Hurwitz, H.: A storage-efficient algorithm for finding the regularized solution of a large, inconsistent system of equations. IMA J. Appl. Math. 25(4), 361–366 (1980)
Lange, S., Helfrich, K., Ye, Q.: Batch normalization preconditioning for neural network training. arXiv preprint arXiv:2108.01110 (2021)
Meng, X., Saunders, M.A., Mahoney, M.W.: LSRN: a parallel iterative solver for strongly over- or underdetermined systems. SIAM J. Sci. Comput. 36(2), C95–C118 (2014)
Onose, A., Mossavat, S.I., Smilde, H.J.H.: A preconditioned accelerated stochastic gradient descent algorithm. In: 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (2020)
Paige, C.C., Saunders, M.A.: Algorithm 583: LSQR: sparse linear equations and least squares problems. ACM Trans. Math. Softw. 8(2), 195–209 (1982)
Prechelt, L.: Proben1: a set of neural network benchmark problems and benchmarking rules. Tech. Rep. 21/94, Fakultät für Informatik, Universität Karlsruhe (1994)
Qiao, Y., Lelieveldt, B.P., Staring, M.: An efficient preconditioner for stochastic gradient descent optimization of image registration. IEEE Trans. Med. Imaging 38(10), 2314–2325 (2019)
Vater, N., Borzì, A.: Training artificial neural networks with gradient and coarse-level correction schemes. In: Nicosia, G., et al. (eds.) International Conference on Machine Learning, Optimization, and Data Science, pp. 473–487. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-95467-3_34
Zhang, J., Fattahi, S., Zhang, R.: Preconditioned gradient descent for over-parameterized nonconvex matrix factorization. Adv. Neural Inf. Process. Syst. 34 (2021)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Vater, N., Borzì, A. (2023). Preconditioned Gradient Method for Data Approximation with Shallow Neural Networks. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2022. Lecture Notes in Computer Science, vol 13811. Springer, Cham. https://doi.org/10.1007/978-3-031-25891-6_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25890-9
Online ISBN: 978-3-031-25891-6