Abstract
Both high-level semantic features and low-level detail features matter for salient object detection with fully convolutional networks (FCNs), and integrating the two more closely improves a network's ability to represent salient objects. In addition, different channels of the same feature map are not equally important for saliency detection. In this paper, we propose a residual attention learning strategy and a multistage refinement mechanism that gradually refine a coarse prediction in a scale-by-scale manner. First, a global information complementary (GIC) module is designed to integrate low-level detail features with high-level semantic features. Second, a multiscale parallel convolutional (MPC) module is employed to extract multiscale features from the same layer. Next, a residual attention mechanism module (RAM) receives the feature maps of adjacent stages, which are produced by the hybrid feature cascaded aggregation (HFCA) module. The HFCA module enhances the feature maps, which reduces the loss of spatial detail and the impact of variations in object shape, scale and position. Finally, we adopt a multiscale cross-entropy loss to guide the network in learning salient features. Experimental results on six benchmark datasets demonstrate that the proposed method significantly outperforms 15 state-of-the-art methods under various evaluation metrics.
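To make the multiscale cross-entropy loss concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not say how the per-scale terms are weighted or how the ground-truth mask is resized to each prediction, so the sketch assumes equal weights by default and average pooling for resizing. The function names `bce`, `downsample` and `multiscale_bce` are hypothetical.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two equally sized 2D maps."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def downsample(mask, factor):
    """Average-pool a 2D mask by an integer factor (assumed resizing rule)."""
    h, w = len(mask) // factor, len(mask[0]) // factor
    return [
        [
            sum(
                mask[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ) / factor ** 2
            for j in range(w)
        ]
        for i in range(h)
    ]


def multiscale_bce(preds, mask, weights=None):
    """Weighted sum of per-scale cross-entropy terms.

    preds: list of 2D saliency maps with values in [0, 1]; each map's side
           length must divide the side length of the full-resolution mask.
    weights: optional per-scale weights (assumed equal when omitted).
    """
    weights = weights or [1.0] * len(preds)
    loss = 0.0
    for pred, w in zip(preds, weights):
        factor = len(mask) // len(pred)
        target = mask if factor == 1 else downsample(mask, factor)
        loss += w * bce(pred, target)
    return loss


# Toy example: a 4x4 mask with one fine and one coarse prediction.
mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
fine = [[0.9, 0.8, 0.2, 0.1], [0.9, 0.7, 0.1, 0.1],
        [0.2, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.1]]
coarse = [[0.8, 0.2], [0.1, 0.1]]
loss = multiscale_bce([fine, coarse], mask)
```

Supervising every scale gives the coarse stages a direct training signal, which fits the scale-by-scale refinement described above. In a real network the same idea would normally be written with a deep learning framework's built-in cross-entropy and resizing operations.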
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Nos. 62002100 and 61802111) and the Science and Technology Foundation of Henan Province of China (No. 212102210156).
Ethics declarations
Conflict of Interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Wang, J., Zhao, Z., Yang, S. et al. Global contextual guided residual attention network for salient object detection. Appl Intell 52, 6208–6226 (2022). https://doi.org/10.1007/s10489-021-02713-8
DOI: https://doi.org/10.1007/s10489-021-02713-8