Abstract
Automatic recognition of isolated spoken digits is one of the most challenging tasks in the area of Automatic Speech Recognition. This paper presents the database development and automatic speech recognition of isolated Pashto spoken digits from Sefer (0) to Naha (9). Fifty native Pashto speakers (25 male and 25 female), aged 18 to 60 years, each uttered the digits from Sefer (0) to Naha (9) separately. A Sony PCM-M10 linear recorder was used for recording in noise-free office and home environments. Adobe Audition version 1.0 was used to split the recordings into individual digits, which were saved in .wav format. Mel frequency cepstral coefficients (MFCC) are used to extract the speech features. To the authors' knowledge, the K-nearest neighbor (K-NN) classifier is used here for the first time for the Pashto language to classify the speech features, and its accuracy is compared with that of linear discriminant analysis. The experimental results are evaluated, and an overall average recognition accuracy of 76.8% is obtained.
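The classification step named in the abstract can be illustrated with a minimal K-nearest neighbor sketch. This is not the authors' implementation: the function name `knn_predict`, the value `k=3`, the Euclidean distance, and the tiny three-dimensional vectors (standing in for MFCC features) are all assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors closest to query."""
    # Euclidean distance from the query to every training vector.
    distances = [
        (math.dist(vec, query), label)
        for vec, label in zip(train_features, train_labels)
    ]
    # Keep the k nearest neighbours and let them vote on the label.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors; real MFCC vectors would have more dimensions.
train_features = [
    (1.0, 0.2, 0.1), (0.9, 0.3, 0.0), (1.1, 0.1, 0.2),  # utterances of digit 0
    (0.1, 1.0, 0.9), (0.2, 0.9, 1.1), (0.0, 1.1, 1.0),  # utterances of digit 1
]
train_labels = [0, 0, 0, 1, 1, 1]

predicted = knn_predict(train_features, train_labels, (0.95, 0.25, 0.1), k=3)
print(predicted)
```

In a full pipeline, each .wav file would first be converted to an MFCC feature vector, and those vectors would replace the toy data above.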
Cite this article
Ali, Z., Abbas, A.W., Thasleema, T.M. et al. Database development and automatic speech recognition of isolated Pashto spoken digits using MFCC and K-NN. Int J Speech Technol 18, 271–275 (2015). https://doi.org/10.1007/s10772-014-9267-z
DOI: https://doi.org/10.1007/s10772-014-9267-z