Abstract
Conventional pitch detection algorithms are known to be highly useful for speech source analysis, but they are less reliable when processing polyphonic acoustic mixtures such as music. This investigative study examines music components such as rhythm, accompaniment, and lyrical voicing, a critical step towards targeted identification and processing of music components. Two popular music forms, Western and Hindustani Classical, make up the study dataset. For the Western cases, a preliminary comparative analysis of spectral characteristics such as harmonics and energy is carried out to characterize music-only regions against regions containing a lyrics-music mixture. \(F_{0}\) contour analysis of these regions, using autocorrelation and zero-frequency filtering, indicates the utility of the latter for identifying the onset of lyrical voicing. Short-time spectral analysis shows how the harmonic structure varies with the music polyphony. Strength of excitation is found to be informative for characterizing sounds such as bass sounds, which are prominent in percussion instruments. For Classical music, \(F_{0}\) contour analysis using the raw signal and the LP residual brings out a characteristic effect in the average pitch, which is higher in the Alaap region for female artists and in the lyrics composition regions for male artists, giving cues for applications such as Raaga identification and summarization. The analysis of excitation source features for the various music components presented in this work offers observations and clues towards effective music component processing.
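As a rough illustration of the autocorrelation-based \(F_{0}\) estimation mentioned in the abstract, the sketch below picks the strongest autocorrelation peak within a plausible pitch range for a single frame. It is not the authors' implementation: the function name `autocorr_f0`, the 80 to 500 Hz search range, the 40 ms frame, and the synthetic two-harmonic test tone are all assumptions made for this example.

```python
import numpy as np

def autocorr_f0(frame, fs, f0_min=80.0, f0_max=500.0):
    """Estimate F0 (Hz) of one frame from its autocorrelation peak.

    Hypothetical helper; the search range is an assumed default,
    not a value taken from the paper.
    """
    frame = frame - np.mean(frame)  # remove DC offset
    # Full autocorrelation; keep only non-negative lags.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Convert the pitch range into a lag range (in samples).
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), len(ac) - 1)
    # The lag of the strongest peak in that range is the pitch period.
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return fs / lag

# Synthetic test frame: a 200 Hz tone with one harmonic, 40 ms at 16 kHz.
fs = 16000
t = np.arange(int(0.04 * fs)) / fs
frame = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
f0 = autocorr_f0(frame, fs)
print(f"Estimated F0: {f0:.1f} Hz")
```

On a clean harmonic tone like this the estimate should land close to 200 Hz. In a polyphonic mixture several periodicities compete for the strongest peak, which is consistent with the abstract's remark that conventional pitch detectors are less reliable on music.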
Notes
1. Ocean of Ragas: A dedicated collection of 1800+ Ragas of Hindustani Classical Music [Online]. http://www.oceanofragas.com.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Sharma, S., Ghisingh, S., Mittal, V.K. (2018). Component Characterization of Western and Indian Classical Music. In: Thampi, S., Krishnan, S., Corchado Rodriguez, J., Das, S., Wozniak, M., Al-Jumeily, D. (eds) Advances in Signal Processing and Intelligent Recognition Systems. SIRS 2017. Advances in Intelligent Systems and Computing, vol 678. Springer, Cham. https://doi.org/10.1007/978-3-319-67934-1_6
DOI: https://doi.org/10.1007/978-3-319-67934-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67933-4
Online ISBN: 978-3-319-67934-1
eBook Packages: Engineering, Engineering (R0)