Exploring class-agnostic pixels for scribble-supervised high-resolution salient object detection

Yang, Qingpeng; Zhou, Yi; Chai, Xiuli; Zhang, Miaohui; Zhang, Wanjun; Wang, Jun

doi:10.1007/s00521-022-07915-w

Original Article
Published: 12 October 2022

Volume 35, pages 3469–3482, (2023)

Neural Computing and Applications

Qingpeng Yang¹, Yi Zhou¹, Xiuli Chai¹, Miaohui Zhang¹, Wanjun Zhang¹ & Jun Wang¹ (ORCID: orcid.org/0000-0002-6257-6641)

Abstract

Successful salient object detection depends largely on large-scale datasets with fine-grained annotations. However, pixel-level annotation is laborious compared with weak labelling, and little research has addressed high-resolution images. To mitigate these drawbacks, we propose a distinctive network that explores salient objects in high-resolution images under scribble supervision, and we relabel a previous high-resolution dataset with scribbles to form Scr-HRSOD, in which each image is labelled in a few seconds. Since scribble labels lack structural information about objects, a boundary structure maintenance branch with shallow layers is introduced to capture low-level spatial details. Under the constraint of the boundary branch, a lightweight contextual semantic branch processes compressed inputs to obtain high-level semantic context and iteratively propagates the partially annotated pixels to surrounding similar regions, which are then employed as pseudo-labels to supervise the network. Extensive evaluations on five datasets illustrate the effectiveness of the proposed method. On the HRSOD datasets, we achieve an F^max of 0.861 and an S_m of 0.887, outperforming the leading existing weakly supervised methods and even fully supervised methods.
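
The propagation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the similarity criterion or the update rule the authors use, so everything below is an assumption made for illustration: a grayscale image, a 4-connected neighbourhood, an absolute intensity tolerance `tol`, and the hypothetical names `propagate_scribbles`, `UNKNOWN`, `BACKGROUND` and `FOREGROUND`. It is plain region growing in NumPy, not the network-based procedure of the paper.

```python
import numpy as np

# Hypothetical label encoding, chosen for this sketch only.
UNKNOWN, BACKGROUND, FOREGROUND = -1, 0, 1


def propagate_scribbles(image, scribble, tol=0.1, n_iters=10):
    """Grow scribble labels into similar 4-connected neighbours.

    image:    (H, W) float array, e.g. intensities in [0, 1]
    scribble: (H, W) int array holding UNKNOWN / BACKGROUND / FOREGROUND
    tol:      assumed similarity threshold on absolute intensity difference
    n_iters:  maximum number of one-pixel growth steps
    """
    labels = scribble.copy()
    for _ in range(n_iters):
        updated = labels.copy()
        for axis, step in ((0, 1), (0, -1), (1, 1), (1, -1)):
            # Neighbour values shifted onto each pixel position.
            nb_labels = np.roll(labels, step, axis=axis)
            nb_image = np.roll(image, step, axis=axis)

            # np.roll wraps around, so mask the row/column that wrapped.
            valid = np.ones(labels.shape, dtype=bool)
            edge = [slice(None), slice(None)]
            edge[axis] = 0 if step == 1 else -1
            valid[tuple(edge)] = False

            # An unlabelled pixel adopts a labelled neighbour's label when
            # the two intensities are close; the first direction wins ties.
            take = (
                (updated == UNKNOWN)
                & (nb_labels != UNKNOWN)
                & valid
                & (np.abs(image - nb_image) <= tol)
            )
            updated[take] = nb_labels[take]

        if np.array_equal(updated, labels):
            break  # nothing changed, propagation has converged
        labels = updated
    return labels


if __name__ == "__main__":
    # Toy image: dark left half, bright right half, one scribble pixel each.
    img = np.full((4, 6), 0.2)
    img[:, 3:] = 0.8
    scr = np.full((4, 6), UNKNOWN)
    scr[0, 0] = FOREGROUND
    scr[3, 5] = BACKGROUND
    print(propagate_scribbles(img, scr))
```

In this toy setting the grown labels play the role of pseudo-labels; pixels that remain `UNKNOWN` after propagation would simply be excluded from the supervision signal.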

Notes

The annotation tool we employed is the scribble annotation tool of the Image Labeler app in MATLAB R2019b.
The Scr-HRSOD datasets are publicly available at https://github.com/YQP-CV/Scribble-Supervised-HRSOD, and our code will be released as open source.

Acknowledgements

This research is supported by the National Natural Science Foundation of China (Nos. 62002100, 61802111 and 62176088) and the Science and Technology Foundation of Henan Province of China (No. 212102210156).

Author information

Authors and Affiliations

School of Artificial Intelligence, Henan University, Kaifeng, 475004, China
Qingpeng Yang, Yi Zhou, Xiuli Chai, Miaohui Zhang, Wanjun Zhang & Jun Wang

Corresponding author

Correspondence to Jun Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Q., Zhou, Y., Chai, X. et al. Exploring class-agnostic pixels for scribble-supervised high-resolution salient object detection. Neural Comput & Applic 35, 3469–3482 (2023). https://doi.org/10.1007/s00521-022-07915-w

Received: 19 December 2021
Accepted: 30 September 2022
Published: 12 October 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s00521-022-07915-w
