Abstract
In this review, we discuss the impact (or lack thereof) that biologically motivated vision has had on computer vision in recent decades. We then summarize a number of computer vision and robotics problems for which biological models can indicate how they might be addressed. Finally, we summarize important findings about the primate’s visual system and draw from them a number of conclusions for the development of algorithms.
Notes
In fact, most neurophysiological investigations have been based on the macaque’s visual system, which, however, shows a large degree of similarity to the human visual system.
An area is called retinotopically organized when it preserves the neighbourhood relations of the retina, i.e., the general arrangement of 2D positions. In particular, the cortical areas at lower levels of the visual hierarchy are retinotopic.
The receptive field of a neuron is the part of the visual field that directly influences the response of that neuron.
References
Agarwal S, Snavely N, Simon I, Seitz SM, Szeliski R (2009) Building Rome in a day. In: International Conference on Computer Vision (ICCV). pp 72–79
Aldoma A, Fäulhammer T, Vincze M (2014) Automation of “ground truth” annotation for multi-view RGB-D object instance recognition datasets. In: International Conference on Robotics and Automation (ICRA). pp 5016–5023
Ambrus R, Bore N, Folkesson J, Jensfelt P (2014) Meta-rooms: building and maintaining long term spatial models in a dynamic world. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp 1854–1861
Andreopoulos A, Tsotsos JK (2013) 50 Years of object recognition: directions forward. Comput Vision Image Underst 117(8):827–891
Arterberry ME, Yonas A, Bensen AS (1989) Self-produced locomotion and the development of responsiveness to linear perspective and texture gradients. Dev Psychol 25:976–982
Bay H, Ess A, Tuytelaars T, Van Gool L (2008) SURF: speeded up robust features. Comput Vision Image Underst 110(3):346–359
Bengio S, Deng L, Larochelle H, Lee H, Salakhutdinov R (guest eds) (2013) Special section on learning deep architectures. Pattern Anal Mach Intell IEEE Trans 35(8)
Berkes P, Wiskott L (2005) Slow feature analysis yields a rich repertoire of complex cell properties. J Vision 5(6):579–602
Borji A, Itti L (2013) State-of-the-art in visual attention modeling. Pattern Anal Mach Intell IEEE Trans 35(1):185–207
Borji A, Sihite DN, Itti L (2012) Probabilistic learning of task-specific visual attention. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp 470–477
Boyer KL, Sarkar S (1999) Perceptual organisation in computer vision: status, challenges and potential. Guest Editor Comput Vision Image Underst 76(1):1–5
Canny JF (1986) A computational approach to edge detection. Pattern Anal Mach Intell IEEE Trans 8(6):679–698
Chatfield K, Simonyan K, Vedaldi A, Zisserman A (2014) Return of the devil in the details: delving deep into convolutional nets. In: British Machine Vision Conference (BMVC)
Chuang AT, Margo CE, Greenberg PB (2014) Retinal implants: a systematic review. Br J Ophthalmol 98:852–856
Criminisi A, Blake A, Rother C, Shotton J, Torr P (2007) Efficient dense stereo with occlusions for new view-synthesis by four-state dynamic programming. Int J Comput Vision 71(1):89–110
Cummins M, Newman P (2010) Appearance-only SLAM at large scale with FAB-MAP 2.0. The International Journal of Robotics Research
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol 2. pp 886–893
Dame A, Prisacariu VA, Ren CY, Reid I (2013) Dense reconstruction using 3D object shape priors. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp 1288–1295
Dementhon DF, Davis LS (1995) Model-based object pose in 25 lines of code. Int J Comput Vision 15(1–2):123–141
Dickinson S (2009) The evolution of object categorization and the challenge of image abstraction. In: Dickinson S, Leonardis A, Schiele B, Tarr M (eds) Object categorization: computer and human vision perspectives. Cambridge University Press, pp 1–37
Dickinson S, Levinshtein A, Sala P, Sminchisescu C (2013) The role of mid-level shape priors in perceptual grouping and image abstraction. In: Dickinson S, Pizlo Z (eds) Shape perception in human and computer vision: an interdisciplinary perspective. Springer
Fang F, Boyaci H, Kersten D (2009) Border ownership selectivity in human early visual cortex and its modulation by attention. J Neurosci 29(2):460–465
Faugeras OD (1993) Three-dimensional computer vision. MIT Press, Cambridge
Fidler S, Boben M, Leonardis A (2010) A coarse-to-fine taxonomy of constellations for fast multi-class object detection. In: European Conference on Computer Vision (ECCV)
Freedman DJ, Assad JA (2006) Experience-dependent representation of visual categories in parietal cortex. Nature 443:85–88
Frintrop S, Rome E, Christensen H (2010) Computational visual attention systems and their cognitive foundations: a survey. ACM Trans Appl Percept (TAP) 7(1):1–46
Geman S, Bienenstock E, Doursat R (1992) Neural networks and the bias/variance dilemma. Neural Comput 4(1):1–58
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Gordon I, Lowe DG (2006) What and where: 3D object recognition with accurate pose. In: Ponce J, Hebert M, Schmid C, Zisserman A (eds) Toward category-level object recognition, chapter what and w. Springer, pp 67–82
Hager GD, Wegbreit B (2011) Scene parsing using a prior world model. Int J Robot Res
Hartley RI, Zisserman A (2000) Multiple view geometry in computer vision. Cambridge University Press, Cambridge
Herbst E, Ren X, Fox D (2011) RGB-D object discovery via multi-scene analysis. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Hinterstoisser S, Cagniart C, Ilic S, Sturm P, Navab N, Fua P, Lepetit V (2012) Gradient response maps for real-time detection of textureless objects. Pattern Anal Mach Intell IEEE Trans 34(5):876–888
Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554
Hochberg LR, Bacher D, Jarosiewicz B, Masse NY, Simeral JD, Vogel J, Haddadin JS, Liu J, Cash SS, van der Smagt P, Donoghue JP (2012) Reach and grasp by people with tetraplegia using a neurally controlled robotic arm. Nature 485:372–375
Hubel DH, Wiesel TN (1962) Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol 160:106–154
Hubel DH, Wiesel TN (1969) Anatomical demonstration of columns in the monkey striate cortex. Nature 221:747–750
Hummel J, Biederman I (1992) Dynamic binding in a neural network for shape recognition. Psychol Rev 99:480–517
Johnson AE, Hebert M (1999) Using spin images for efficient object recognition in cluttered 3D scenes. Pattern Anal Mach Intell IEEE Trans 21(5):433–449
Kayser C, Körding KP, König P (2004) Processing of complex stimuli and natural scenes in the visual cortex. Curr Opin Neurobiol 14(4):468–473
Kellman PJ, Arterberry ME (1998) The cradle of knowledge. MIT Press, Cambridge
Klein G, Murray D (2007) Parallel tracking and mapping for small AR workspaces. In: Sixth IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR). Nara, Japan, pp 225–234
König P, Krüger N (2006) Perspectives: symbols as self-emergent entities in an optimization process of feature extraction and predictions. Biol Cybern 94(4):325–334
Kraft D, Pugeault N, Başeski M, Popović M, Kragic D, Kalkan S, Wörgötter F, Krüger N (2009) Birth of the object: detection of objectness and extraction of object shape through object action complexes. Int J Humanoid Robot 5:247–265
Krainin M, Henry P, Ren X, Fox D (2010) Manipulator and object tracking for in-hand model acquisition. In: IEEE International Conference on Robotics and Automation (ICRA)
Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet Classification with Deep Convolutional Neural Networks. In: Advances in neural information processing systems. pp 1–9
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems 25 (NIPS 2012). Curran Associates Inc, pp 1097–1105
Krüger N, Janssen P, Kalkan S, Lappe M, Leonardis A, Piater J, Rodríguez-Sánchez AJ, Wiskott L (2013) Deep hierarchies in the primate visual cortex: what can we learn for computer vision? Pattern Anal Mach Intell IEEE Trans 35(8):1847–1871
Krüger N, von der Malsburg C (2015) A required paradigm shift in today's vision research: interview with Prof. Christoph von der Malsburg. Künstliche Intelligenz, special issue on bio-inspired vision systems
Krüger N, Wörgötter F (2004) Statistical and deterministic regularities: utilisation of motion and grouping in biological and artificial visual systems. Adv Imaging Electron Phys 131:82–147
Leung T, Malik J (1998) Contour continuity in region based image segmentation. In: European Conference on Computer Vision (ECCV). pp 544–559
Lowe DG (1987) Three-dimensional object recognition from single two-dimensional images. Artif Intell 31(3):355–395
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60(2):91–110
Marr D (1982) Vision: a computational investigation into the human representation and processing of visual information. WH Freeman, San Francisco
Mian AS, Bennamoun M, Owens R (2006) Three-dimensional model-based object recognition and segmentation in cluttered scenes. Pattern Anal Mach Intell IEEE Trans 28(10):1584–1601
Navalpakkam V, Itti L (2006) An integrated model of top-down and bottom-up attention for optimal object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). New York, pp 2049–2056
Newcombe RA, Davison AJ (2010) Live dense reconstruction with a single moving camera. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp 1498–1505
Newcombe RA, Lovegrove SJ, Davison AJ (2011) DTAM: dense tracking and mapping in real-time. In: IEEE International Conference on Computer Vision (ICCV)
Olshausen BA, Field D (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381:607–609
Osada R, Funkhouser T, Chazelle B, Dobkin D (2002) Matching 3D models with shape distributions. In: International Conference on Shape Modeling and Applications (SMI)
Pizzoli M, Forster C, Scaramuzza D (2014) REMODE: probabilistic, monocular dense reconstruction in real time. In: IEEE International Conference on Robotics and Automation (ICRA)
Pugeault N, Wörgötter F, Krüger N (2010) Visual primitives: local, condensed, and semantically rich visual descriptors and their applications in robotics. Int J Humanoid Robot 7(3):379–405
Richtsfeld A, Mörwald T, Prankl J, Zillich M, Vincze M (2014) Learning of perceptual grouping for object segmentation on RGB-D data. J Vis Commun Image Represent 25(1):64–73
Rosten E, Porter R, Drummond T (2010) Faster and better: a machine learning approach to corner detection. Pattern Anal Mach Intell IEEE Trans 32(1):105–119
Rumelhart D, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536
Russakovsky O, Deng J, Huang Z, Berg AC, Fei-Fei L (2013) Detecting avocados to zucchinis: what have we done, and where are we going? In: IEEE International Conference on Computer Vision (ICCV). pp 2064–2071
Rusu RB, Bradski G, Thibaux R, Hsu J (2010) Fast 3D recognition and pose using the Viewpoint Feature Histogram. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Sala P, Dickinson S (2010) Contour grouping and abstraction using simple part models. In: European Conference on Computer Vision (ECCV). pp 603–616
Salti S, Tombari F, Di Stefano L (2014) SHOT: unique signatures of histograms for surface and texture description. Comput Vision Image Underst 125:251–264
Sarkar S, Boyer KL (1993) Perceptual organization in computer vision: a review and a proposal for a classificatory structure. IEEE Trans Syst Man Cybern 23(2):382–399
Silberman N, Hoiem D, Kohli P, Fergus R (2012) Indoor segmentation and support inference from RGBD images. In: European Conference on Computer Vision (ECCV). pp 746–760
Sinha SN, Scharstein D, Szeliski R (2014) Efficient high-resolution stereo matching using local plane sweeps. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp 1582–1589
Stuehmer J, Gumhold S, Cremers D (2010) Real-time dense geometry from a handheld camera. In: Proceedings of the DAGM Symposium on Pattern Recognition
Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R (2014) Intriguing properties of neural networks. In: International Conference on Learning Representations (ICLR)
Tanaka K (1993) Neuronal mechanisms of object recognition. Science 262:685–688
Tola E, Lepetit V, Fua P (2010) DAISY: an efficient dense descriptor applied to wide-baseline stereo. Pattern Anal Mach Intell IEEE Trans 32(5):815–830
Tombari F, Salti S, Di Stefano L (2010) Unique signatures of histograms for local surface description. In: European Conference on Computer Vision (ECCV). Springer, pp 356–369
Tuytelaars T, Mikolajczyk K (2008) Local invariant feature detectors: a survey. Found Trends Comput Graph Vision 3(3):1–104
Ückermann A, Elbrechter C, Haschke R, Ritter H (2014) Real-time hierarchical scene segmentation and classification. In: IEEE-RAS International Conference on Humanoid Robots (Humanoids)
Vapnik VN (1998) Statistical learning theory. Adaptive and learning systems for signal processing. Wiley, New York
Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vision 57(2):137–154
von der Heydt R, Peterhans E, Baumgartner G (1984) Illusory contours and cortical neuron responses. Science 224:1260–1262
Wagemans J, Elder JH, Kubovy M, Palmer SE, Peterson MA, Singh M, von der Heydt R (2012) A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure-ground organization. Psychol Bull 138(6)
Wendel A, Maurer M, Graber G, Pock T, Bischof H (2012) Dense reconstruction on-the-fly. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp 1450–1457
Acknowledgments
Norbert Krüger was supported by the EU Cognitive Systems project XPERIENCE (FP7-ICT-270273) and the DSF project patient@home. Michael Zillich was supported by the EU projects SQUIRREL (FP7-ICT-610532) and STRANDS (FP7-ICT-600623) and by the Austrian Science Fund (FWF) grant No. TRP 139-N23 InSitu. Many thanks to Antonio Rodriguez Sanchez for his work on Fig. 1 and to Laurenz Wiskott for his work on Fig. 2. Many thanks also to IEEE for allowing us to reuse these figures from [48].
Cite this article
Krüger, N., Zillich, M., Janssen, P. et al. What We Can Learn From the Primate’s Visual System. Künstl Intell 29, 9–18 (2015). https://doi.org/10.1007/s13218-014-0345-9
DOI: https://doi.org/10.1007/s13218-014-0345-9