Abstract
While vision-based localization has been widely studied for small autonomous unmanned vehicles (SAUVs), sound-source localization has not yet been fully enabled on these platforms. This paper presents two novel approaches that allow SAUVs to perform three-dimensional (3D) multi-sound-source localization (MSSL) using only the inter-channel time difference (ICTD) signal generated by a self-rotating bi-microphone array. The two approaches are based on two machine learning techniques, namely the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Random Sample Consensus (RANSAC) algorithms, respectively. Their performance was tested and compared in both simulations and experiments. The results show that both approaches correctly identify the number of sound sources, along with their 3D orientations, in a reverberant environment.
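As a rough illustration of how a RANSAC-style fit could recover several sources from the ICTD signal alone, the sketch below assumes a far-field model in which a source at azimuth α and elevation φ produces an ICTD of (d/c)·cos φ·cos(θ − α) as the array rotates through angle θ. That model, the microphone spacing d, the noise level, the thresholds, and all function names are assumptions made for this example. They are not taken from the paper, and the code is not the authors' implementation. Because cos φ is even, the sketch recovers only the magnitude of the elevation.

```python
import math
import random

SPEED_OF_SOUND = 343.0                    # m/s (assumed)
MIC_SPACING = 0.30                        # m (hypothetical array size)
MAX_DELAY = MIC_SPACING / SPEED_OF_SOUND  # largest physically possible ICTD, s


def simulate_ictd(sources_deg, n=360, noise=10e-6, outlier_frac=0.15, seed=0):
    """One ICTD sample per rotation step; each sample comes from one random
    source, or is a uniform outlier (a crude stand-in for reverberation)."""
    rng = random.Random(seed)
    points = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        if rng.random() < outlier_frac:
            t = rng.uniform(-MAX_DELAY, MAX_DELAY)
        else:
            az, el = rng.choice(sources_deg)
            amp = MAX_DELAY * math.cos(math.radians(el))
            t = amp * math.cos(theta - math.radians(az)) + rng.gauss(0.0, noise)
        points.append((theta, t))
    return points


def fit_two_points(p, q):
    """Solve t = a*cos(theta) + b*sin(theta) exactly from two samples."""
    (t1, y1), (t2, y2) = p, q
    det = math.sin(t2 - t1)
    if abs(det) < 0.2:                    # rotation angles too close: ill-conditioned
        return None
    a = (y1 * math.sin(t2) - y2 * math.sin(t1)) / det
    b = (y2 * math.cos(t1) - y1 * math.cos(t2)) / det
    return a, b


def inliers_of(model, points, tol):
    a, b = model
    return [(th, y) for th, y in points
            if abs(y - a * math.cos(th) - b * math.sin(th)) < tol]


def refit(points):
    """Least-squares estimate of (a, b) via the 2x2 normal equations."""
    scc = sum(math.cos(th) ** 2 for th, _ in points)
    sss = sum(math.sin(th) ** 2 for th, _ in points)
    scs = sum(math.cos(th) * math.sin(th) for th, _ in points)
    scy = sum(math.cos(th) * y for th, y in points)
    ssy = sum(math.sin(th) * y for th, y in points)
    det = scc * sss - scs ** 2
    return (scy * sss - ssy * scs) / det, (ssy * scc - scy * scs) / det


def sequential_ransac(points, tol=30e-6, iters=500, min_inliers=40, seed=1):
    """Fit one sinusoid at a time, remove its inliers, and repeat until no
    model gathers enough support. Returns (azimuth, |elevation|) in degrees."""
    rng = random.Random(seed)
    remaining, found = list(points), []
    while len(remaining) >= min_inliers:
        best = []
        for _ in range(iters):
            model = fit_two_points(*rng.sample(remaining, 2))
            if model is None or math.hypot(*model) > 1.05 * MAX_DELAY:
                continue                  # degenerate or physically impossible
            hits = inliers_of(model, remaining, tol)
            if len(hits) > len(best):
                best = hits
        if len(best) < min_inliers:
            break                         # only noise and outliers are left
        a, b = refit(best)
        azimuth = math.degrees(math.atan2(b, a)) % 360.0
        elevation = math.degrees(math.acos(min(1.0, math.hypot(a, b) / MAX_DELAY)))
        found.append((azimuth, elevation))
        gone = set(inliers_of((a, b), remaining, tol))
        remaining = [p for p in remaining if p not in gone]
    return found


if __name__ == "__main__":
    samples = simulate_ictd([(40.0, 30.0), (200.0, 60.0)])
    for az, el in sorted(sequential_ransac(samples)):
        print(f"azimuth ~ {az:.1f} deg, |elevation| ~ {el:.1f} deg")
```

In this sketch the number of sources is not supplied in advance: it is simply the number of sinusoids that gather at least `min_inliers` supporting samples, which mirrors the abstract's claim that the source count is identified together with the orientations. A density-based clustering step such as DBSCAN could play a similar role by grouping per-sample estimates instead of fitting models sequentially.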
Availability of data and materials
The data that support the findings of this study are available from the corresponding author, Deepak Gala, upon reasonable request.
Funding
All authors confirm that there has been no significant financial support for this work that could have influenced its outcome.
Author information
Contributions
Deepak Gala developed the theoretical formalism, performed the analytical work and numerical simulations, and planned the experiments. He also wrote the first draft of the manuscript, prepared relevant materials, and analyzed the results. Nathan Lindsay contributed to the design, prototyping, and integration of hardware components for the experimental platform, and conducted the experiments and data collection. Liang Sun, as the research advisor of the first and second authors, initiated the research presented in the paper, developed the research plan for the methodologies, simulations, experiments, analysis, and data collection, and provided guidance in research discussions. All authors read and approved the revised manuscript.
Ethics declarations
Ethical Approval
No ethical approval was deemed necessary.
Consent to Participate
All authors voluntarily agreed to participate in this research study.
Consent for Publication
All authors confirm:
– that the work described has not been published before (except in the form of an abstract or as part of a published lecture, review, or thesis);
– that it is not under consideration for publication elsewhere;
– that its publication has been approved by all co-authors;
– that its publication has been approved (tacitly or explicitly) by the responsible authorities at the institution where the work is carried out.
All authors give their consent for information about themselves to be published in the Journal of Intelligent & Robotic Systems. All authors transfer their exclusive rights to the presented paper, including the right to publish the paper in the Journal of Intelligent & Robotic Systems.
Competing interests
All authors confirm that there are no known conflicts of interest associated with this publication.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Gala, D., Lindsay, N. & Sun, L. Multi-Sound-Source Localization Using Machine Learning for Small Autonomous Unmanned Vehicles with a Self-Rotating Bi-Microphone Array. J Intell Robot Syst 103, 52 (2021). https://doi.org/10.1007/s10846-021-01481-4
DOI: https://doi.org/10.1007/s10846-021-01481-4