A spoken query system to access the real time agricultural commodity prices and weather information in Kannada language/dialects

G, Thimmaraja Yadava; G, Nagaraja B; S, Jayanna H; R, Shivakumar B

doi:10.1007/s11042-023-16554-9


Published: 07 September 2023

Volume 83, pages 28675–28688, (2024)

Multimedia Tools and Applications

Thimmaraja Yadava G ORCID: orcid.org/0000-0002-3266-9732¹,
Nagaraja B G²,
Jayanna H S³ &
…
Shivakumar B R⁴


Abstract

We develop two improvements over our previously proposed spectral subtraction with voice activity detection and minimum mean square error spectrum power estimator based on zero crossing (SS-VAD + MMSE-SPZC) enhancement for a real-time spoken query system (SQS). First, we introduce a time delay neural network (TDNN) based modeling technique. Second, to train the models properly, we enlarge the database by collecting Kannada speech data from an additional 500 farmers under real-time conditions. The proposed combined enhancement technique effectively removes background noise and improves speech quality. When evaluated on the updated degraded speech corpus, our proposed automatic speech recognition (ASR) system performs better than the previous framework. Experimental results show improvements of 1.32% and 1.48% in speech recognition accuracy for noisy and enhanced speech data, respectively, compared to our earlier work.
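
The abstract names the SS-VAD stage but this preview does not describe it. Purely as an illustration, magnitude spectral subtraction driven by a simple energy-based voice activity decision might be sketched as follows. The function names, frame length, hop size, VAD threshold (`vad_ratio`), and spectral floor (`floor`) are all assumptions chosen for this sketch, not values from the article. The MMSE-SPZC stage and the TDNN acoustic model are not shown.

```python
import numpy as np


def frame_signal(x, frame_len=256, hop=128):
    """Split x into overlapping frames and apply a Hann window to each."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)


def spectral_subtract(x, frame_len=256, hop=128, vad_ratio=2.0, floor=0.02):
    """Toy spectral subtraction (hypothetical parameters, illustration only).

    Assumes len(x) >= frame_len and that at least some frames contain
    noise without speech.
    """
    x = np.asarray(x, dtype=float)
    frames = frame_signal(x, frame_len, hop)
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Toy energy-based VAD: frames whose energy is close to the quietest
    # frame are treated as noise-only. A real VAD would be more robust.
    energy = (mag ** 2).sum(axis=1)
    is_noise = energy <= vad_ratio * energy.min()

    # Average magnitude spectrum of the noise-only frames.
    noise_mag = mag[is_noise].mean(axis=0)

    # Subtract the noise estimate, keeping a small spectral floor so that
    # no bin becomes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)

    # Resynthesize with the noisy phase, then overlap-add the frames.
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros(len(x))
    for i, frame in enumerate(clean):
        out[i * hop:i * hop + frame_len] += frame
    return out
```

With a Hann analysis window at 50% overlap the frames sum to approximately unit gain, so no separate synthesis window is used here. Samples past the last full frame are left at zero.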


Data Availability

https://sites.google.com/view/thimmarajayadavag/downloads

Code Availability

https://sites.google.com/view/thimmarajayadavag/downloads


Funding

This work was part of the consortium project “Speech-based Access of Agricultural Commodity Prices and Weather Information in 11 Indian Languages/Dialects”, funded by the Technology Development for Indian Languages (TDIL) programme initiated by the Department of Electronics & Information Technology (DeitY), Ministry of Communication & Information Technology (MC&IT), Govt. of India (Grant number: 11(18)/2012-HCC(TDIL)).

Author information

Authors and Affiliations

E&CE, Nitte Meenakshi Institute of Technology, Yelahanka, Bengaluru, 560064, Karnataka, India
Thimmaraja Yadava G
E&CE, Vidyavardhaka College of Engineering, Gokulam 3 stage, Mysuru, 570002, Karnataka, India
Nagaraja B G
IS&E, Siddaganga Institute of Technology, B H Road, Tumkur, 572103, Karnataka, India
Jayanna H S
E&CE, Nitte Mahalinga Adyanthaya Memorial Institute of Technology, Nitte, 574110, Karnataka, India
Shivakumar B R


Corresponding author

Correspondence to Thimmaraja Yadava G.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Nagaraja B G, Jayanna H S and Shivakumar B R contributed equally to this work.

Appendices

Appendix A: Considerations and challenges of the research approach

The following limitations should be considered when interpreting these research findings and applying them to real-world SQS and ASR applications.

This research focuses on developing improvements to the ASR system specifically for the Kannada language/dialects. As a result, the findings and conclusions may not be directly applicable to other languages or dialects, limiting the generalizability of the approach.
The challenges of real-time data collection, such as background noise variations, environmental conditions, and other contextual factors, may impact the quality and diversity of the collected data.

Table 6 Description of speech database collection


Table 7 Comprehensive comparison of ASR toolkits


Appendix B: Speech database description

Table 6 presents the speech data collected for this study, covering male and female Kannada speakers across diverse dialect regions of Karnataka state.

Appendix C: Comparison of ASR toolkits

Appendix D: List of Acronyms

Table 8 List of acronyms used in the research work


Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

G, T.Y., G, N.B., S, J.H. et al. A spoken query system to access the real time agricultural commodity prices and weather information in Kannada language/dialects. Multimed Tools Appl 83, 28675–28688 (2024). https://doi.org/10.1007/s11042-023-16554-9


Received: 30 December 2022
Revised: 01 August 2023
Accepted: 18 August 2023
Published: 07 September 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s11042-023-16554-9
