Abstract
Displaying high-resolution video on small mobile devices remains a challenge. The main obstacles are the limited display resolution and the differing aspect ratios of handheld devices. So far, no retargeting algorithm guarantees good results for all videos. We introduce a new video retargeting approach that reduces the resolution while preserving as much of the relevant content as possible. A central component of the system selects the most suitable algorithm to adapt a given shot. We have implemented two retargeting algorithms: a technique based on regions of interest (ROI), and a fast implementation of seam carving for size adaptation of videos (FSCAV). The ROI-based retargeting detects important regions such as faces, objects, text, and contrast-based saliency regions. It then selects a rectangular window within the larger frame that defines the visible area of the target video. If several relevant regions are detected, an artificial camera motion (pan, tilt, or zoom) may change the selected view within a shot. For seam carving, we present two extensions. The first reduces the distortion of straight lines, which may otherwise become curved or disconnected. The second avoids jitter in the target video and limits the large memory requirements and computational effort of seam carving, which makes it applicable to video retargeting. In addition, we present a heuristic that estimates the visual quality of the target video. If the quality drops below a threshold, the ROI-based retargeting is used for this shot. User evaluations confirm a very high visual quality of our approach.
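To make the two building blocks named in the abstract more concrete, the following minimal sketch shows a basic dynamic-programming seam search on a single frame and a per-shot fallback rule. It is not the FSCAV implementation and includes neither the straight-line nor the jitter extension. The function names, the energy map given as nested lists, and the default threshold of 0.5 are assumptions chosen for illustration only.

```python
from typing import List


def find_vertical_seam(energy: List[List[float]]) -> List[int]:
    """Return, for each row, the column of a minimum-energy vertical seam.

    Basic seam search by dynamic programming; the seam is 8-connected,
    so the column changes by at most one between neighbouring rows.
    """
    rows, cols = len(energy), len(energy[0])
    # cost[y][x] = minimal cumulative energy of any seam ending at (y, x)
    cost = [list(energy[0])]
    for y in range(1, rows):
        prev = cost[-1]
        row = []
        for x in range(cols):
            lo, hi = max(0, x - 1), min(cols - 1, x + 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)

    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(cols), key=lambda c: cost[-1][c])
    seam = [x]
    for y in range(rows - 1, 0, -1):
        lo, hi = max(0, x - 1), min(cols - 1, x + 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(frame: List[List[float]], seam: List[int]) -> List[List[float]]:
    """Delete one pixel per row, reducing the frame width by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(frame, seam)]


def choose_algorithm(estimated_quality: float, threshold: float = 0.5) -> str:
    """Hypothetical selection rule: fall back to ROI-based retargeting
    when the estimated quality of a shot drops below the threshold."""
    return "seam_carving" if estimated_quality >= threshold else "roi"


if __name__ == "__main__":
    energy = [[1, 4, 3],
              [5, 2, 6],
              [7, 8, 1]]
    seam = find_vertical_seam(energy)
    print(seam)
    print(remove_seam(energy, seam))
    print(choose_algorithm(0.3), choose_algorithm(0.8))
```

Applied to video frame by frame, a search like this would select seams independently in each frame, which is one plausible source of the jitter mentioned in the abstract. The fallback rule only illustrates the thresholding step; how the quality estimate itself is computed is not specified here.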
Acknowledgements
The authors acknowledge the financial support granted by the Deutsche Forschungsgemeinschaft (DFG). We would like to thank the following flickr.com users for providing their images under a Creative Commons license: teoruiz (bridge.jpg), the_tahoe_guy (road.jpg) and digital_cat (construction_site.jpg). We thank Instituto Luce for providing historical films within the European research project ECHO. Furthermore, we would like to thank Sabine Olawsky for the development of the contrast-based saliency detection.
Cite this article
Kopf, S., Haenselmann, T., Kiess, J. et al. Algorithms for video retargeting. Multimed Tools Appl 51, 819–861 (2011). https://doi.org/10.1007/s11042-010-0717-6
DOI: https://doi.org/10.1007/s11042-010-0717-6