Computational Visual Attention

  • Chapter
Computer Analysis of Human Behavior

Abstract

Visual attention is one of the key mechanisms of perception; it enables humans to efficiently select the visual data of greatest potential interest. Machines face challenges similar to those of humans: they must cope with large amounts of input data and select the most promising parts. In this chapter, we explain the biological and psychophysical foundations of visual attention, show how these mechanisms can be implemented computationally, and discuss why and under what conditions machines, especially robots, profit from such a concept.

Notes

  1. The social aspect of human attention is described in Chap. 8, Sect. 8.6.4.1.

  2. Parts of this chapter have been published before in [5].

  3. The notation V1 to V5 stems from the former belief that visual processing is serial.

  4. In this chapter, we assume that the reader has basic knowledge of image processing; otherwise, a short explanation of the basic concepts can be found in the appendix of [5].

  5. While the description here is essentially the same as in [5], it includes some improvements that have been made in the meantime. The differences between VOCUS and the iNVT are described in [5].

  6. The reasonable number of levels depends on the image size as well as on the size of the objects to be detected. Larger images and a wide variety of possible object sizes require deeper pyramids. The presented approach usually works well for images of up to 400 pixels in width and height in which the objects are comparatively small, as in the example images of this chapter.

  7. Since the input is a static image, the motion channel is empty and omitted here.

  8. Entries with value 1 are ignored since they indicate that the mean saliency of the target region is exactly the same as the mean saliency of its surroundings; such a feature is completely useless for detecting the target. In practice, however, this usually occurs only when a feature is not present at all; for example, a gray-scale image contains no color, so the color weights are set to 1.

  9. Note that in human perception, bottom-up cues always play a role and thus should be considered if similarity to human perception is desired.

  10. More at http://web.me.com/john.tsotsos/Applications/Playbot.html.
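Note 8 can be illustrated with a small sketch. It assumes, as the note implies but does not spell out, that a feature weight is the ratio of the mean saliency inside the target region to the mean saliency of the surroundings. The function names, the 3x3 example maps, and the handling of an all-zero map are hypothetical choices made for illustration, not the chapter's implementation.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def feature_weight(saliency_map, target_mask):
    """Assumed weight: mean saliency in the target / mean saliency around it.

    saliency_map and target_mask are equally sized 2D lists; the mask holds
    True for target pixels and False for the surroundings.
    """
    inside, outside = [], []
    for map_row, mask_row in zip(saliency_map, target_mask):
        for value, is_target in zip(map_row, mask_row):
            (inside if is_target else outside).append(value)
    mean_in, mean_out = mean(inside), mean(outside)
    if mean_in == mean_out:
        # Covers an absent feature (all zeros): the weight is set to 1.
        return 1.0
    if mean_out == 0:
        # Illustrative choice: salient target on a completely empty surround.
        return float("inf")
    return mean_in / mean_out


def informative_weights(weights, tolerance=1e-9):
    """Ignore entries equal to 1: target and surroundings look the same."""
    return {name: w for name, w in weights.items() if abs(w - 1.0) > tolerance}


# Hypothetical example: the target is the center pixel of a 3x3 map.
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]
intensity = [[0.1, 0.1, 0.1],
             [0.1, 0.9, 0.1],
             [0.1, 0.1, 0.1]]
color = [[0.0] * 3 for _ in range(3)]  # gray-scale input: no color response

weights = {
    "intensity": feature_weight(intensity, mask),
    "color": feature_weight(color, mask),
}
useful = informative_weights(weights)
```

In this sketch the color entry receives the weight 1 and is dropped, while the intensity entry is kept because the target is clearly more salient than its surroundings.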

References

  1. Bruce, N.D.B., Tsotsos, J.K.: Saliency, attention, and visual search: An information theoretic approach. J. Vis. 9(3), 1–24 (2009)

  2. Bundesen, C., Habekost, T.: Attention. In: Lamberts, K., Goldstone, R. (eds.) Handbook of Cognition. Sage, London (2005)

  3. Douma, M.: Color Vision and Art. Retrieved Nov 2010 from http://webexhibits.org/colorart/ganglion.html (2008)

  4. Elazary, L., Itti, L.: Interesting objects are visually salient. J. Vis. 8(3), 3 (2008)

  5. Frintrop, S.: VOCUS: A Visual Attention System for Object Detection and Goal-Directed Search. Lecture Notes in Artificial Intelligence (LNAI), vol. 3899. Springer, Berlin/Heidelberg (2006)

  6. Frintrop, S., Jensfelt, P.: Attentional landmarks and active gaze control for visual SLAM. IEEE Trans. Robot. 24(5) (2008). Special Issue on Visual SLAM

  7. Frintrop, S., Rome, E., Christensen, H.I.: Computational visual attention systems and their cognitive foundations: A survey. ACM Trans. Appl. Percept. 7(1) (2010)

  8. Gao, D., Han, S., Vasconcelos, N.: Discriminant saliency, the detection of suspicious coincidences, and applications to visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 31(6) (2009)

  9. Itti, L., Baldi, P.: Bayesian surprise attracts human attention. Vis. Res. 49(10), 1295–1306 (2009)

  10. Itti, L., Koch, C.: Feature combination strategies for saliency-based visual attention systems. J. Electron. Imaging 10(1), 161–169 (2001)

  11. Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20(11), 1254–1259 (1998)

  12. James, W.: The Principles of Psychology. Dover, New York (1890)

  13. Kandel, E.R., Schwartz, J.H., Jessell, T.M.: Essentials of Neural Science and Behavior. McGraw-Hill/Appleton & Lange, New York (1996)

  14. Koch, C., Ullman, S.: Shifts in selective visual attention: towards the underlying neural circuitry. Hum. Neurobiol. 4(4), 219–227 (1985)

  15. Liu, T., Yuan, Z., Sun, J., Wang, J., Zheng, N., Tang, X., Shum, H.-Y.: Learning to detect a salient object. IEEE Trans. Pattern Anal. Mach. Intell. 33(2), 353–367 (2009)

  16. Palmer, S.E.: Vision Science: Photons to Phenomenology. MIT Press, Cambridge (1999)

  17. Pashler, H.: The Psychology of Attention. MIT Press, Cambridge (1997)

  18. Rotenstein, A., Andreopoulos, A., Fazl, E., Jacob, D., Robinson, M., Shubina, K., Zhu, Y., Tsotsos, J.K.: Towards the dream of intelligent, visually-guided wheelchairs. In: Proc. 2nd Int’l Conf. on Technology and Aging, Toronto, Canada, June 2007

  19. Torralba, A., Oliva, A., Castelhano, M., Henderson, J.: Contextual guidance of eye movements and attention in real-world scenes: The role of global features on object search. Psychol. Rev. 113(4) (2006)

  20. Treisman, A.: Preattentive processing in vision. Comput. Vis. Graph. Image Process. 31, 156–177 (1985)

  21. Treisman, A.M., Gelade, G.: A feature integration theory of attention. Cogn. Psychol. 12, 97–136 (1980)

  22. Treisman, A.M., Gormican, S.: Feature analysis in early vision: Evidence from search asymmetries. Psychol. Rev. 95(1), 15–48 (1988)

  23. Tsotsos, J.K.: A ‘complexity level’ analysis of vision. In: Proc. of International Conference on Computer Vision: Human and Machine Vision Workshop, London, England, June 1987

  24. Tsotsos, J.K.: A Computational Perspective on Visual Attention. MIT Press, Cambridge (2011)

  25. Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vis. 57(2), 137–154 (2004)

  26. Walther, D., Koch, C.: Modeling attention to salient proto-objects. Neural Networks (2006)

  27. Wolfe, J.M.: Guided search 2.0: A revised model of visual search. Psychon. Bull. Rev. 1(2), 202–238 (1994)

  28. Wolfe, J.M.: Visual search. In: Pashler, H. (ed.) Attention, pp. 13–74. Psychology Press, Hove (1998)

  29. Wolfe, J.M., Horowitz, T.S.: What attributes guide the deployment of visual attention and how do they do it? Nat. Rev. Neurosci. 5, 1–7 (2004)

  30. Zach, C., Pock, T., Bischof, H.: A duality based approach for realtime TV-L1 optical flow. In: Proc. of the Annual Meeting of the German Assoc. for Pattern Recognition (DAGM) (2007)

Author information

Corresponding author

Correspondence to Simone Frintrop.

Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Frintrop, S. (2011). Computational Visual Attention. In: Salah, A., Gevers, T. (eds) Computer Analysis of Human Behavior. Springer, London. https://doi.org/10.1007/978-0-85729-994-9_4

  • DOI: https://doi.org/10.1007/978-0-85729-994-9_4

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-993-2

  • Online ISBN: 978-0-85729-994-9

  • eBook Packages: Computer Science, Computer Science (R0)
