Abstract
Auto-encoders are a popular deep learning architecture for feature extraction. Because an auto-encoder has at least one bottleneck layer for feature representation and at least five layers to fit nonlinear transformations, back-propagation learning (BPL) with saturating activation functions sometimes suffers from the vanishing gradient problem, which slows convergence, and several modified methods have been proposed to mitigate it. In this work, we propose computing forward-propagated errors in parallel with back-propagated errors in the network, without modifying the activation functions or the network structure. Although this scheme for auto-encoder learning has a larger computational cost per iteration than BPL, the processing time until convergence could be reduced by parallel computing. To confirm the feasibility of the scheme, two simple problems were examined by training auto-encoders to acquire (1) identity mappings of two-dimensional points along the arc of a half-circle, extracting the central angle, and (2) handwritten digit images, extracting the labeled digits. Both results indicate that the proposed scheme requires only about half as many iterations as BPL to reduce the cost function sufficiently.
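As a concrete reference point for the first experiment, the following is a minimal NumPy sketch of the half-circle identity-mapping task trained with plain BPL. The layer sizes, learning rate, and tanh activations are illustrative assumptions rather than values taken from the paper, and the proposed parallel forward-backward propagation of errors is not reproduced here; the sketch only shows the baseline task setup against which the scheme is compared.

```python
# Baseline sketch only: a tiny auto-encoder trained with plain back-propagation
# learning (BPL) on the half-circle identity-mapping task from the abstract.
# Layer sizes, learning rate, and tanh units are assumptions for illustration;
# the paper's parallel forward-backward propagation scheme is NOT implemented.
import numpy as np

rng = np.random.default_rng(0)

# Training data: 2-D points along the arc of a half-circle; the central
# angle theta is the single latent factor the bottleneck should recover.
theta = rng.uniform(0.0, np.pi, size=(200, 1))
X = np.hstack([np.cos(theta), np.sin(theta)])            # shape (200, 2)

# Five-layer auto-encoder: 2 -> 10 -> 1 (bottleneck) -> 10 -> 2.
sizes = [2, 10, 1, 10, 2]
W = [rng.normal(0.0, 0.5, (sizes[i], sizes[i + 1])) for i in range(len(sizes) - 1)]
b = [np.zeros(sizes[i + 1]) for i in range(len(sizes) - 1)]

def forward(x):
    """Forward pass; tanh on hidden layers, linear output layer."""
    acts = [x]
    for i, (Wi, bi) in enumerate(zip(W, b)):
        z = acts[-1] @ Wi + bi
        acts.append(z if i == len(W) - 1 else np.tanh(z))
    return acts

lr = 0.05
for epoch in range(2000):
    acts = forward(X)
    err = acts[-1] - X                                   # identity-mapping error
    delta = err / len(X)                                 # output-layer error signal
    # Standard back-propagation of the error through the layers.
    for i in reversed(range(len(W))):
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            delta = (delta @ W[i].T) * (1.0 - acts[i] ** 2)   # tanh derivative
        W[i] -= lr * grad_W
        b[i] -= lr * grad_b
    if epoch % 500 == 0:
        print(f"epoch {epoch:4d}  reconstruction MSE {np.mean(err ** 2):.5f}")

# After training, the single bottleneck unit should vary monotonically with theta.
code = forward(X)[2]
print("corr(bottleneck, theta) =", np.corrcoef(code.ravel(), theta.ravel())[0, 1])
```

The one-unit bottleneck is chosen so that, if the identity mapping is learned, the code must encode the single degree of freedom of the data, i.e. the central angle of each point on the arc.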
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Ohama, Y., Yoshimura, T. (2017). A Parallel Forward-Backward Propagation Learning Scheme for Auto-Encoders. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, E.S. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol. 10635. Springer, Cham. https://doi.org/10.1007/978-3-319-70096-0_14
DOI: https://doi.org/10.1007/978-3-319-70096-0_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70095-3
Online ISBN: 978-3-319-70096-0
eBook Packages: Computer Science, Computer Science (R0)