Abstract
For computing the weights of deep neural networks (DNNs), the backpropagation (BP) method has been widely used as the de-facto standard algorithm. Because the BP method is based on stochastic gradient descent using derivatives of the objective function, it can be difficult to choose appropriate hyperparameters such as the learning rate. As an alternative approach to computing the weight matrices, we recently proposed an alternating optimization method based on linear and nonlinear semi-nonnegative matrix factorizations (semi-NMFs). In this paper, we propose a parallel implementation of the nonlinear semi-NMF based method. Experimental results show that the nonlinear semi-NMF based method and its parallel implementation are competitive with conventional DNNs trained by the BP method.
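The abstract only summarizes the approach. Purely as a rough illustration of the idea of alternating between a linear semi-NMF-style step (fitting the output weights against a nonnegative hidden factor) and a nonlinear semi-NMF-style step (fitting the hidden-layer weights through a nonnegative activation such as ReLU), the following NumPy sketch trains a one-hidden-layer model. The update rules used here (pseudo-inverse least squares with projection onto the nonnegative orthant), the initialization, and the absence of any parallelization are illustrative assumptions, not the update formulas derived in the paper.

```python
# A minimal sketch only; not the authors' exact algorithm or its parallel version.
import numpy as np

def relu(A):
    return np.maximum(A, 0.0)

def alternating_semi_nmf(X, Y, hidden_dim, iters=20, seed=0):
    """Fit Y ~= W2 @ relu(W1 @ X) by alternating between a linear
    semi-NMF-style step (W2, nonnegative factor Z) and a nonlinear
    semi-NMF-style step (W1)."""
    rng = np.random.default_rng(seed)
    W1 = 0.1 * rng.standard_normal((hidden_dim, X.shape[0]))
    Z = relu(W1 @ X)                          # nonnegative hidden factor
    for _ in range(iters):
        W2 = Y @ np.linalg.pinv(Z)            # least-squares output weights
        Z = relu(np.linalg.pinv(W2) @ Y)      # update Z, project onto Z >= 0
        W1 = Z @ np.linalg.pinv(X)            # fit relu(W1 @ X) ~= Z
        Z = relu(W1 @ X)
    W2 = Y @ np.linalg.pinv(Z)
    return W1, W2

# Tiny usage example on random data
X = np.random.rand(4, 200)                    # 4 features, 200 samples
Y = np.random.rand(3, 200)                    # 3 targets
W1, W2 = alternating_semi_nmf(X, Y, hidden_dim=16)
print("relative fit error:",
      np.linalg.norm(Y - W2 @ relu(W1 @ X)) / np.linalg.norm(Y))
```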
Notes
In [11], the simplified objective function (3), which discards bias vectors and sparse regularizations, was considered. To incorporate bias vectors and sparse regularizations, we would need algorithms for solving "constrained" (nonlinear) semi-NMFs with sparse regularization, because the all-ones vector \(\mathbf{1}\) is fixed in (1). Therefore, in this paper, we also consider the simplified objective function (3). Note that we have been developing methods for solving such constrained problems.
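The equation numbers (1) and (3) refer to the full paper and are not reproduced on this page. Purely as an illustration of what such a simplified objective looks like, a one-hidden-layer instance without bias vectors and sparse regularization could be written as follows, where \(X\) is the input data, \(Y\) the target data, \(W_1, W_2\) the weight matrices, and \(f\) a nonnegative activation such as ReLU (these symbols are assumed names, not necessarily the paper's notation):

\[
\min_{W_1,\,W_2}\ \bigl\| Y - W_2\, f(W_1 X) \bigr\|_F^2,
\qquad f(\cdot) = \max(\cdot,\, 0).
\]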
References
Bengio Y, Lamblin P, Popovici D, Larochelle H (2006) Greedy layer-wise training of deep networks. Proc Adv Neural Inf Process Syst 19:153–160
Ciresan DC, Meier U, Masci J, Gambardella LM, Schmidhuber J (2011) Flexible, high performance convolutional neural networks for image classification. Proc. 22nd International joint conference on artificial intelligence, 1237–1242
Ding CHQ, Li T, Jordan MI (2010) Convex and semi-nonnegative matrix factorizations. IEEE Trans Pattern Anal Mach Intell 32:45–55
Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In International conference on artificial intelligence and statistics, 249–256
Hinton GE, Deng L, Yu D, Dahl GE, Mohamed A, Jaitly N, Senior A, Vanhoucke V (2012) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Process Mag 29:82–97
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: The international conference on learning representations (ICLR), San Diego
LeCun Y. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86:2278–2324
Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: Proc. ICML
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536
Sakurai T, Imakura A, Inoue Y, Futamura Y (2016) Alternating optimization method based on nonnegative matrix factorizations for deep neural networks. In: Proc. ICONIP 2016
Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929–1958
TensorFlow, https://www.tensorflow.org/
Additional information
This research was supported in part by JST/ACT-I (No. JPMJPR16U6), JST/CREST, MEXT KAKENHI (No. 17K12690), and the University of Tsukuba Basic Research Support Program Type A. This research used, in part, computational resources of the K computer provided by the RIKEN Advanced Institute for Computational Science through the HPCI System Research project (Project ID: hp160138) and of COMA provided by the Interdisciplinary Computational Science Program in the Center for Computational Sciences, University of Tsukuba.
About this article
Cite this article
Imakura, A., Inoue, Y., Sakurai, T. et al. Parallel Implementation of the Nonlinear Semi-NMF Based Alternating Optimization Method for Deep Neural Networks. Neural Process Lett 47, 815–827 (2018). https://doi.org/10.1007/s11063-017-9642-2