Abstract
This paper presents a scene change detection method that analyzes both the auditory and visual information sources and accounts for their inter-relations and coincidence in order to identify video scenes semantically. Audio analysis segments the audio source into three types of semantic primitives: silence, speech and music. Speech segments are processed further to locate speaker change instants. Video analysis segments the video source into shots in a way that is not affected by camera pans, zoom-ins/outs or significantly high object motion. Results from single source segmentation are in some cases suboptimal. Audio-visual interaction either enhances single source findings or extracts high level semantic information. The aim of this paper is to identify semantically meaningful video scenes by exploiting the temporal correlations of both sources, based on the observation that semantic changes are characterized by significant changes in both information sources. Experiments were carried out on a real TV serial sequence composed of many different scenes with plenty of commercials appearing in between. The results are rather promising.
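The observation that semantic changes show up as significant changes in both sources can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that shot cuts and audio changes have already been detected and are available as timestamps in seconds, and it simply keeps the shot cuts that have an audio change nearby. The function name `coincident_changes`, the `tolerance` parameter and the sample timestamps are hypothetical.

```python
from bisect import bisect_left


def coincident_changes(shot_cuts, audio_changes, tolerance=0.5):
    """Return the shot-cut times that have an audio change within
    `tolerance` seconds (a hypothetical coincidence rule, for illustration)."""
    audio_sorted = sorted(audio_changes)
    scene_changes = []
    for t in sorted(shot_cuts):
        # Index of the first audio change at or after the start of the window.
        i = bisect_left(audio_sorted, t - tolerance)
        # That audio change coincides if it also falls before the window end.
        if i < len(audio_sorted) and audio_sorted[i] <= t + tolerance:
            scene_changes.append(t)
    return scene_changes


# Made-up timestamps, in seconds, used only to exercise the function.
shot_cuts = [4.0, 12.5, 30.0, 47.0]
audio_changes = [12.25, 20.0, 46.75]
print(coincident_changes(shot_cuts, audio_changes))  # [12.5, 47.0]
```

A fixed time window is the simplest possible coincidence rule. A method of the kind described in the abstract would also need to use the type of audio change (silence, speech, music, speaker change) rather than its timing alone.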
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsekeridou, S., Krinidis, S., Pitas, I. (2001). Scene Change Detection Based on Audio-Visual Analysis and Interaction. In: Klette, R., Gimel’farb, G., Huang, T. (eds) Multi-Image Analysis. Lecture Notes in Computer Science, vol 2032. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45134-X_16
DOI: https://doi.org/10.1007/3-540-45134-X_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42122-1
Online ISBN: 978-3-540-45134-1
eBook Packages: Springer Book Archive