Abstract
This paper studies the application of spiking neural networks (SNNs), also known as the third generation of neural networks, to automatic bird species recognition. SNNs have performed well on neuromorphic computing tasks such as sequence identification and character identification. The proposed architecture is a single, consistent model comprising sensory encoding, learning, and decoding stages. The decoding layer adjusts synaptic weights with an algorithm based on normalized approximate descent (NormAD). The results reported here concern spatiotemporal pattern processing, where spiking neurons offer greater computational power than perceptrons together with biological plausibility. The proposed approach segments audio frames by the energy of the bird call, separating its voiced, unvoiced, and silent portions. The segmented signal is passed to a feature extraction stage, Mel-frequency cepstral coefficients (MFCCs), and the resulting features are fed to the SNN for bird species classification. The system recognizes 14 bird species with 90.10% accuracy, compared with 78% for the perceptron-based system.
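The energy-based segmentation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact rule, so everything below is an assumption made for illustration: the function names `frame_signal` and `label_frames`, the frame and hop lengths, and the two decibel thresholds are hypothetical. Frames whose short-time energy falls far below the loudest frame are labeled silence, frames near the loudest are labeled voiced, and everything in between is labeled unvoiced.

```python
import numpy as np


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def label_frames(signal, frame_len=400, hop=200,
                 silence_db=-40.0, voiced_db=-15.0):
    """Label each frame as 'silence', 'unvoiced', or 'voiced'.

    Thresholds are in dB relative to the most energetic frame and are
    illustrative values, not taken from the paper.
    """
    frames = frame_signal(np.asarray(signal, dtype=float), frame_len, hop)
    energy = np.mean(frames ** 2, axis=1)        # short-time energy per frame
    ref = max(energy.max(), 1e-12)               # avoid division by zero
    energy_db = 10.0 * np.log10(np.maximum(energy, 1e-12) / ref)

    labels = np.full(len(energy), "unvoiced", dtype=object)
    labels[energy_db < silence_db] = "silence"
    labels[energy_db >= voiced_db] = "voiced"
    return labels, energy_db


# Synthetic demo: silence, then a quiet 1 kHz tone, then a loud 1 kHz tone,
# each 1600 samples long at an assumed 16 kHz sampling rate.
sr = 16000
t = np.arange(1600) / sr
tone = np.sin(2 * np.pi * 1000 * t)
demo = np.concatenate([np.zeros(1600), 0.05 * tone, tone])
demo_labels, demo_energy_db = label_frames(demo)
print(list(demo_labels))
```

In a full pipeline along the lines the abstract describes, only the frames kept by this step would be passed on to MFCC extraction. A practical detector would probably combine energy with a second cue such as zero-crossing rate, which this sketch omits.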
Acknowledgments
The authors thank Dr. Indira Nayak for help with data preparation. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Cite this article
Mohanty, R., Mallik, B.K. & Solanki, S.S. Normalized approximate descent used for spike based automatic bird species recognition system. Int J Speech Technol 25, 57–65 (2022). https://doi.org/10.1007/s10772-020-09735-6
DOI: https://doi.org/10.1007/s10772-020-09735-6