Preconditioned Gradient Method for Data Approximation with Shallow Neural Networks

  • Conference paper
  • Published in: Machine Learning, Optimization, and Data Science (LOD 2022)

Abstract

A preconditioned gradient scheme for the regularized minimization problem arising from the approximation of given data by a shallow neural network is presented. The construction of the preconditioner is based on random normal projections and is adjusted to the specific structure of the regularized problem.

The convergence of the preconditioned gradient method is investigated numerically for a synthetic problem with a known local minimizer. The method is also applied to real problems from the Proben1 benchmark set.
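
The abstract describes the approach only at a high level. As a minimal illustration, the sketch below implements a generic preconditioned gradient iteration for a Tikhonov-regularized least-squares fit of a shallow tanh network, with a preconditioner built from a Gaussian random projection (sketch) of the Jacobian, in the spirit of randomized matrix decompositions [6] and LSRN [10]. All function names, the choice of regularization, and the exact preconditioner construction are assumptions made for this example, not the authors' formulation.

```python
import numpy as np

# Illustrative sketch only: a preconditioned gradient iteration for fitting a
# shallow tanh network f(x) = w2 . tanh(W1 x + b1) to data (X, y) by minimizing
#   L(theta) = 0.5 * ||r(theta)||^2 + 0.5 * lam * ||theta||^2,  r(theta) = f(X) - y.
# The preconditioner is the R-factor of a Gaussian random projection of the
# regularized Jacobian, so that R^T R approximates J^T J + lam * I.
# Regularization form and preconditioner construction are assumptions for this
# example, not necessarily the construction used in the paper.

rng = np.random.default_rng(0)

def unpack(theta, d, m):
    """Split the flat parameter vector into (W1, b1, w2)."""
    W1 = theta[:m * d].reshape(m, d)
    b1 = theta[m * d:m * d + m]
    w2 = theta[m * d + m:]
    return W1, b1, w2

def residual_and_jacobian(theta, X, y, d, m):
    """Residual r(theta) and its dense Jacobian w.r.t. all parameters."""
    W1, b1, w2 = unpack(theta, d, m)
    A = np.tanh(X @ W1.T + b1)                     # (n, m) hidden activations
    r = A @ w2 - y                                 # (n,)  residual
    dA = (1.0 - A ** 2) * w2                       # (n, m) chain-rule factor
    J = np.concatenate(
        [(dA[:, :, None] * X[:, None, :]).reshape(len(y), -1),  # d r / d W1
         dA,                                                     # d r / d b1
         A],                                                     # d r / d w2
        axis=1)
    return r, J

def sketched_preconditioner(J, lam, k):
    """R with R^T R ~ J^T J + lam I, via a Gaussian sketch of [J; sqrt(lam) I]."""
    n, p = J.shape
    M = np.vstack([J, np.sqrt(lam) * np.eye(p)])   # (n + p, p) regularized Jacobian
    S = rng.standard_normal((k, n + p)) / np.sqrt(k)
    _, R = np.linalg.qr(S @ M)                     # (p, p) upper triangular
    return R

def preconditioned_gradient_descent(X, y, m, lam=1e-3, lr=0.5, iters=200, k=None):
    n, d = X.shape
    p = m * d + 2 * m
    k = k or 4 * p                                 # oversampled sketch size
    theta = 0.1 * rng.standard_normal(p)
    for _ in range(iters):
        r, J = residual_and_jacobian(theta, X, y, d, m)
        g = J.T @ r + lam * theta                  # gradient of the regularized loss
        R = sketched_preconditioner(J, lam, k)
        s = np.linalg.solve(R, np.linalg.solve(R.T, g))  # (R^T R)^{-1} g
        theta -= lr * s
    return theta

# Tiny usage example on synthetic 1-D data.
X = np.linspace(-1, 1, 100).reshape(-1, 1)
y = np.sin(3 * X[:, 0])
theta = preconditioned_gradient_descent(X, y, m=10)
```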

References

  1. Broyden, C.G., Dennis, J.E., Jr., Moré, J.J.: On the local and superlinear convergence of quasi-Newton methods. IMA J. Appl. Math. 12(3), 223–245 (1973)

  2. Crane, R., Roosta, F.: Invexifying regularization of non-linear least-squares problems. arXiv preprint arXiv:2111.11027 (2021)

  3. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12(7), 2121–2159 (2011)

  4. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)

  5. Gorbunov, E., Hanzely, F., Richtárik, P.: A unified theory of SGD: variance reduction, sampling, quantization and coordinate descent. In: International Conference on Artificial Intelligence and Statistics, pp. 680–690. PMLR (2020)

  6. Halko, N., Martinsson, P.G., Tropp, J.A.: Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53(2), 217–288 (2011)

  7. Hanke-Bourgeois, M.: Grundlagen der numerischen Mathematik und des wissenschaftlichen Rechnens, 3rd edn. Vieweg + Teubner, Wiesbaden (2009). https://doi.org/10.1007/978-3-8351-9020-7

  8. Herman, G.T., Lent, A., Hurwitz, H.: A storage-efficient algorithm for finding the regularized solution of a large, inconsistent system of equations. IMA J. Appl. Math. 25(4), 361–366 (1980)

  9. Lange, S., Helfrich, K., Ye, Q.: Batch normalization preconditioning for neural network training. arXiv preprint arXiv:2108.01110 (2021)

  10. Meng, X., Saunders, M.A., Mahoney, M.W.: LSRN: a parallel iterative solver for strongly over- or underdetermined systems. SIAM J. Sci. Comput. 36(2), C95–C118 (2014)

  11. Onose, A., Mossavat, S.I., Smilde, H.J.H.: A preconditioned accelerated stochastic gradient descent algorithm. In: 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (2020)

  12. Paige, C.C., Saunders, M.A.: Algorithm 583: LSQR: sparse linear equations and least squares problems. ACM Trans. Math. Softw. 8(2), 195–209 (1982)

  13. Prechelt, L.: Proben1: a set of neural network benchmark problems and benchmarking rules (1994)

  14. Qiao, Y., Lelieveldt, B.P., Staring, M.: An efficient preconditioner for stochastic gradient descent optimization of image registration. IEEE Trans. Med. Imaging 38(10), 2314–2325 (2019)

  15. Vater, N., Borzì, A.: Training artificial neural networks with gradient and coarse-level correction schemes. In: Nicosia, G., et al. (eds.) International Conference on Machine Learning, Optimization, and Data Science, pp. 473–487. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-95467-3_34

  16. Zhang, J., Fattahi, S., Zhang, R.: Preconditioned gradient descent for over-parameterized nonconvex matrix factorization. Adv. Neural Inf. Process. Syst. 34 (2021)

Author information

Correspondence to Nadja Vater.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Vater, N., Borzì, A. (2023). Preconditioned Gradient Method for Data Approximation with Shallow Neural Networks. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2022. Lecture Notes in Computer Science, vol 13811. Springer, Cham. https://doi.org/10.1007/978-3-031-25891-6_27

  • DOI: https://doi.org/10.1007/978-3-031-25891-6_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25890-9

  • Online ISBN: 978-3-031-25891-6

  • eBook Packages: Computer Science, Computer Science (R0)
