Abstract
Speech recognition becomes challenging when the vocabulary contains many similar-sounding words. To address this challenge, an effective speech recognition system using an artificial neural network (ANN) with an optimization technique is proposed. In this system, distinct words spoken by different people are taken as the input speech signals. The features of these input speech signals are extracted using the amplitude modulation spectrogram. The extracted features are then given as input to the ANN for training. The trained ANN is then used to predict the isolated words during testing. In this work, the default structure of the ANN is redesigned using the Levenberg–Marquardt algorithm to obtain an optimal prediction rate with accuracy. The hidden layers and the neurons in the hidden layers are further optimized using the opposition artificial bee colony optimization technique. The results demonstrate that the sensitivity, specificity, and accuracy of the proposed technique are 90.41%, 99.66%, and 99.36%, respectively, which is better than the existing methods.
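The opposition idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common opposition-based learning rule, in which a candidate x in the range [low, high] has the opposite point low + high - x, and it applies that rule to the initial population of an artificial bee colony search over the hidden-layer layout. The bounds, the function names, and the toy fitness below are hypothetical and are not taken from the paper; a real system would train the ANN for each candidate layout and use its validation error as the fitness.

```python
import random

# Hypothetical search bounds (not from the paper):
# (number of hidden layers, neurons per hidden layer).
BOUNDS = [(1, 5), (1, 50)]


def opposite(candidate, bounds):
    """Return the opposite point: x_opp = low + high - x in each dimension."""
    return [lo + hi - x for x, (lo, hi) in zip(candidate, bounds)]


def toy_fitness(candidate):
    """Stand-in cost, lower is better. A real system would train the ANN
    with this layout and return its validation error instead."""
    layers, neurons = candidate
    return (layers - 2) ** 2 + (neurons - 20) ** 2


def opposition_init(pop_size, bounds, fitness, rng):
    """Draw a random population, add each member's opposite point,
    and keep the pop_size candidates with the lowest cost."""
    population = [[rng.randint(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    combined = population + [opposite(c, bounds) for c in population]
    return sorted(combined, key=fitness)[:pop_size]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
food_sources = opposition_init(6, BOUNDS, toy_fitness, rng)
best = food_sources[0]  # lowest-cost candidate layout
```

Evaluating each random candidate together with its opposite covers the search range more evenly at the start, which is the usual motivation for adding opposition to a swarm optimizer. The employed, onlooker, and scout bee phases of the full algorithm are omitted here.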
Cite this article
Shukla, S., Jain, M. A novel system for effective speech recognition based on artificial neural network and opposition artificial bee colony algorithm. Int J Speech Technol 22, 959–969 (2019). https://doi.org/10.1007/s10772-019-09639-0
DOI: https://doi.org/10.1007/s10772-019-09639-0