Abstract
Effective encoding and indexing of audiovisual documents are both key to enhancing the multimedia user experience. In this paper we propose embedding low-level content descriptors into a scalable video-coding bitstream by jointly optimizing encoding and indexing performance. This approach yields a new type of bitstream in which part of the information serves both content encoding and content description, enabling the so-called “Midstream Content Access”. To support this concept, a novel technique based on an appropriate combination of Vector Quantization and Scalable Video Coding has been developed and evaluated. More specifically, the key-pictures of each video Group Of Pictures (GOP) are encoded at a first draft level using a suitable visual-codebook, while the residual errors are encoded using a conventional approach. The same visual-codebook is used to encode all the key-pictures of a video shot, whose boundaries are dynamically estimated. In this way, the visual-codebook is freely available as an efficient visual descriptor of the considered video shot. Moreover, since a new visual-codebook is introduced every time a new shot is detected, the bitstream also provides an implicit temporal segmentation.
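The scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain k-means codebook training, the distortion threshold used to start a new shot, and the toy 4-sample blocks are all assumptions made for illustration. Residuals are kept as raw differences, standing in for the conventional residual coder that the abstract does not detail.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, codebook):
    """Index of the code vector closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))

def train_codebook(vectors, k, iters=10, seed=0):
    """Plain Lloyd (k-means) iterations; an assumed stand-in for codebook design."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old code vector when its cluster is empty
                codebook[i] = [sum(c) / len(members) for c in zip(*members)]
    return codebook

def encode_key_picture(blocks, codebook):
    """Draft level: one codebook index per block. Refinement: the residual."""
    indices = [nearest(b, codebook) for b in blocks]
    residuals = [[x - c for x, c in zip(b, codebook[i])]
                 for b, i in zip(blocks, indices)]
    return indices, residuals

def decode_key_picture(indices, residuals, codebook):
    """Add each residual back onto its code vector."""
    return [[c + r for c, r in zip(codebook[i], res)]
            for i, res in zip(indices, residuals)]

def mean_distortion(blocks, codebook):
    """Average squared error of the draft-level approximation."""
    return sum(sq_dist(b, codebook[nearest(b, codebook)]) for b in blocks) / len(blocks)

def encode_sequence(key_pictures, k=2, threshold=100.0):
    """Start a new shot (and codebook) when the current codebook fits poorly.

    The threshold test is a hypothetical boundary rule, not the paper's method.
    """
    shots, codebook = [], None
    for blocks in key_pictures:
        if codebook is None or mean_distortion(blocks, codebook) > threshold:
            codebook = train_codebook(blocks, k)
            shots.append({"codebook": codebook, "pictures": []})
        shots[-1]["pictures"].append(encode_key_picture(blocks, codebook))
    return shots

# Toy key-pictures, each a list of 4-sample blocks (made-up values).
pic_a1 = [[10, 12, 11, 9], [200, 198, 201, 199], [11, 10, 12, 10], [199, 200, 202, 198]]
pic_a2 = [[12, 11, 10, 10], [198, 199, 200, 201], [9, 10, 11, 12], [201, 200, 199, 198]]
pic_b1 = [[100, 101, 99, 100], [60, 61, 59, 60], [101, 100, 100, 99], [59, 60, 61, 60]]

shots = encode_sequence([pic_a1, pic_a2, pic_b1])
print(len(shots), "shots; pictures per shot:", [len(s["pictures"]) for s in shots])
```

In this sketch each entry of `shots` carries its own codebook, so the codebook can be read as a descriptor of the shot without decoding the residuals, and the list of shots gives the temporal segmentation as a by-product of encoding.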
Cite this article
Adami, N., Boschetti, A., Leonardi, R. et al. Embedded indexing in scalable video coding. Multimed Tools Appl 48, 105–121 (2010). https://doi.org/10.1007/s11042-009-0356-y
DOI: https://doi.org/10.1007/s11042-009-0356-y