Abstract
In the last few years, a revolution has occurred in consumer audio. Much like the transition from analog to digital sound during the 1980s, a transition is now under way from 2-channel stereophonic sound to multichannel sound (e.g., 5.1 systems). Future audiovisual systems will not distinguish between watching a movie and listening to a music recording; they are envisioned to offer a realistic experience in which the user is immersed in the content and can interact with it at will. This paper proposes an encoding procedure for spot microphone signals, a step that is necessary for providing interactivity between the user and the environment. The proposed model achieves high-quality audio reproduction with side information on the order of 19 kbps for each spot microphone signal.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mouchtaris, A., Tzagkarakis, C., Tsakalides, P. (2008). Low Bitrate Coding of Spot Audio Signals for Interactive and Immersive Audio Applications. In: Tsihrintzis, G.A., Virvou, M., Howlett, R.J., Jain, L.C. (eds) New Directions in Intelligent Interactive Multimedia. Studies in Computational Intelligence, vol 142. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68127-4_16
DOI: https://doi.org/10.1007/978-3-540-68127-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68126-7
Online ISBN: 978-3-540-68127-4
eBook Packages: Engineering, Engineering (R0)