RGB-D Saliency Detection by Multi-stream Late Fusion Network

  • Conference paper
Computer Vision Systems (ICVS 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10528)

Abstract

In this paper, we address saliency detection on RGB-D image pairs with a multi-stream late fusion network. With the prevalence of RGB-D sensors, leveraging additional depth information to facilitate saliency detection has drawn increasing attention. However, the key challenge of how to fuse RGB and depth data optimally remains under-studied. The conventional approach simply treats depth as an undifferentiated extra channel and applies existing RGB saliency detection models to RGB-D input directly. This paradigm cannot capture representations specific to the depth modality and is ill-suited to fusing multi-modal information. We address this problem by proposing a simple yet principled late fusion strategy carried out in conjunction with convolutional neural networks (CNNs). The proposed network learns discriminative representations and exploits the complementarity between the RGB and depth modalities. Comprehensive experiments on two public datasets demonstrate the benefits of the proposed RGB-D saliency detection network.
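To make the general idea of late fusion concrete, the following is a minimal sketch and not the network described in the paper. It assumes each modality has already been processed by its own stream into a per-pixel score map, and that the two maps are combined only at the end by a weighted sum followed by a sigmoid. The function name `late_fuse`, the weights `w_rgb` and `w_depth`, and the `bias` term are hypothetical; in a trained network such weights would be learned, and the actual fusion layer may differ.

```python
import math
from typing import List

ScoreMap = List[List[float]]  # 2-D map of raw (pre-sigmoid) scores, one per pixel


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def late_fuse(rgb_scores: ScoreMap, depth_scores: ScoreMap,
              w_rgb: float = 0.5, w_depth: float = 0.5,
              bias: float = 0.0) -> ScoreMap:
    """Combine two independently computed score maps into one saliency map.

    Assumption: each stream has already produced its own score map; the
    streams interact only here, at the output stage (hence "late" fusion).
    """
    if len(rgb_scores) != len(depth_scores):
        raise ValueError("score maps must have the same height")
    fused: ScoreMap = []
    for rgb_row, depth_row in zip(rgb_scores, depth_scores):
        if len(rgb_row) != len(depth_row):
            raise ValueError("score maps must have the same width")
        # Per-pixel weighted sum of the two streams, squashed into (0, 1).
        fused.append([sigmoid(w_rgb * r + w_depth * d + bias)
                      for r, d in zip(rgb_row, depth_row)])
    return fused


if __name__ == "__main__":
    rgb = [[2.0, -1.0], [0.0, 3.0]]     # hypothetical RGB-stream scores
    depth = [[1.0, -2.0], [0.0, -3.0]]  # hypothetical depth-stream scores
    for row in late_fuse(rgb, depth):
        print([round(v, 3) for v in row])
```

By contrast, the "undifferentiated channel" approach criticized in the abstract would stack depth onto the RGB input before any processing, so no stream could learn depth-specific representations.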

Acknowledgments

This work is funded by the Research Grants Council of Hong Kong (CityU 11205015) and the National Natural Science Foundation of China (NSFC) (61673329).

Corresponding author

Correspondence to Youfu Li.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Chen, H., Li, Y., Su, D. (2017). RGB-D Saliency Detection by Multi-stream Late Fusion Network. In: Liu, M., Chen, H., Vincze, M. (eds) Computer Vision Systems. ICVS 2017. Lecture Notes in Computer Science, vol 10528. Springer, Cham. https://doi.org/10.1007/978-3-319-68345-4_41

  • DOI: https://doi.org/10.1007/978-3-319-68345-4_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68344-7

  • Online ISBN: 978-3-319-68345-4

  • eBook Packages: Computer Science, Computer Science (R0)