SA-DETR: Saliency Attention-based DETR for salient object detection

  • Original Paper
  • Published in: Pattern Analysis and Applications

Abstract

Research on the Salient Object Detection (SOD) task has advanced considerably with deep learning methods. However, most methods focus on predicting a fine mask rather than on finding the most salient objects. Most SOD datasets likewise evaluate pixel-wise accuracy rather than “saliency”. In this study, we use the Salient Objects in Clutter (SOC) dataset to conduct research that focuses more on the saliency of objects. We propose an architecture that extends the cross-attention mechanism of the Transformer to the DETR architecture to learn the relationship between the global image semantics and the objects. We further add a Saliency Attention (SA) module to the network, yielding SA-DETR, which detects salient objects based on object-level saliency. The proposed method, with its cross- and saliency-attention, shows superior results in detecting salient objects among multiple objects compared with other methods. We demonstrate its effectiveness by showing that it outperforms the existing state-of-the-art SOD method by 4.7% and 0.2% in MAE and mean E-measure, respectively.
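
For readers unfamiliar with the MAE figure quoted above, the following is a minimal sketch of how mean absolute error is commonly computed in SOD evaluation: the average per-pixel absolute difference between a predicted saliency map and a binary ground-truth mask, both scaled to [0, 1]. The function name `mean_absolute_error` and the 2x2 values are hypothetical illustrations and are not taken from the paper or its evaluation code.

```python
def mean_absolute_error(pred, gt):
    """Average per-pixel absolute difference between two same-sized 2D maps.

    `pred` and `gt` are lists of rows with values in [0, 1].
    This is an illustrative helper, not the authors' evaluation code.
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    count = sum(len(row) for row in pred)
    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    total = sum(
        abs(p - g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )
    return total / count


# Hypothetical 2x2 saliency map and binary mask (made-up values).
pred = [[0.9, 0.1], [0.8, 0.0]]
gt = [[1.0, 0.0], [1.0, 0.0]]
score = mean_absolute_error(pred, gt)  # lower is better
```

A lower score means the predicted map is closer to the mask, so an improvement in MAE corresponds to a reduction in this value.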

Data availability

All data generated or analysed during this study are included in the published article [9], and the authors confirm that the datasets are indicated in the reference list.

References

  1. Achanta R, Hemami S, Estrada F et al (2009) Frequency-tuned salient region detection. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 1597–1604

  2. Brahim K, Kalboussi R, Abdellaoui M et al (2019) Spatio-temporal saliency detection using objectness measure. Signal Image Video Process 13:1055–1062

  3. Carion N, Massa F, Synnaeve G et al (2020) End-to-end object detection with transformers. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I 16. Springer, pp 213–229

  4. Chen Q, Wang J, Han C et al (2022) Group detr v2: Strong object detector with encoder-decoder pretraining. arXiv preprint arXiv:2211.03594

  5. Cheng MM, Zhang Z, Lin WY et al (2014) Bing: Binarized normed gradients for objectness estimation at 300fps. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3286–3293

  6. Dosovitskiy A, Beyer L, Kolesnikov A et al (2020) An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929

  7. Fan DP, Cheng MM, Liu Y et al (2017) Structure-measure: a new way to evaluate foreground maps. In: Proceedings of the IEEE international conference on computer vision, pp 4548–4557

  8. Fan DP, Gong C, Cao Y et al (2018) Enhanced-alignment measure for binary foreground map evaluation. arXiv preprint arXiv:1805.10421

  9. Fan DP, Zhang J, Xu G et al (2022) Salient objects in clutter. IEEE Trans Pattern Anal Mach Intell 45(2):2344–2366

  10. Fang Y, Wang W, Xie B et al (2022) Eva: Exploring the limits of masked visual representation learning at scale. arXiv preprint arXiv:2211.07636

  11. Harel J, Koch C, Perona P (2006) Graph-based visual saliency. Advances in neural information processing systems 19

  12. Hou Q, Cheng MM, Hu X et al (2019) Deeply supervised salient object detection with short connections. IEEE TPAMI 41(4):815–828. https://doi.org/10.1109/TPAMI.2018.2815688

  13. Hou X, Zhang L (2007) Saliency detection: A spectral residual approach. In: 2007 IEEE Conference on computer vision and pattern recognition. IEEE, pp 1–8

  14. Itti L, Koch C, Niebur E (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20(11):1254–1259

  15. Li G, Yu Y (2016) Deep contrast learning for salient object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 478–487

  16. Li Y, Hou X, Koch C et al (2014) The secrets of salient object segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 280–287

  17. Liu JJ, Hou Q, Cheng MM et al (2019) A simple pooling-based design for real-time salient object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3917–3926

  18. Liu N, Zhang N, Wan K et al (2021) Visual saliency transformer. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 4722–4732

  19. Liu Y, Cheng MM, Hu X et al (2017) Richer convolutional features for edge detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3000–3009

  20. Liu Z, Lin Y, Cao Y et al (2021) Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 10012–10022

  21. Luo Z, Mishra A, Achkar A et al (2017) Non-local deep features for salient object detection. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 6609–6617

  22. Nguyen T (2015) Salient object detection via objectness proposals. In: Proceedings of the AAAI Conference on Artificial Intelligence

  23. Pan J, Sayrol E, Nieto XG et al (2017) Salgan: Visual saliency prediction with adversarial networks. In: CVPR scene understanding workshop (SUNw)

  24. Perazzi F, Krähenbühl P, Pritch Y et al (2012) Saliency filters: contrast based filtering for salient region detection. In: 2012 IEEE conference on computer vision and pattern recognition. IEEE, pp 733–740

  25. Qin X, Zhang Z, Huang C et al (2019) Basnet: Boundary-aware salient object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7479–7489

  26. Qin X, Zhang Z, Huang C et al (2020) U2-net: Going deeper with nested u-structure for salient object detection. Pattern Recognit 106:107404

  27. Srivatsa RS, Babu RV (2015) Salient object detection via objectness measure. In: 2015 IEEE international conference on image processing (ICIP). IEEE, pp 4481–4485

  28. Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. Advances in neural information processing systems 30

  29. Wang L, Lu H, Wang Y et al (2017) Learning to detect salient objects with image-level supervision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 136–145

  30. Wei J, Wang S, Huang Q (2020) \(\text{F}^3\)net: fusion, feedback and focus for salient object detection. In: Proceedings of the AAAI conference on artificial intelligence, pp 12321–12328

  31. Wu Z, Su L, Huang Q (2019) Stacked cross refinement network for edge-aware salient object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 7264–7273

  32. Yang C, Zhang L, Lu H et al (2013) Saliency detection via graph-based manifold ranking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3166–3173

  33. Zaidi SSA, Ansari MS, Aslam A et al (2022) A survey of modern deep learning based object detection models. Digit Signal Process 126:103514

  34. Zhang J, Fan DP, Dai Y et al (2021) Uncertainty inspired rgb-d saliency detection. IEEE Trans Pattern Anal Mach Intell 44(9):5761–5779

  35. Zhang P, Wang D, Lu H et al (2017) Learning uncertain convolutional features for accurate saliency detection. In: Proceedings of the IEEE International Conference on computer vision, pp 212–221

  36. Zhao JX, Liu JJ, Fan DP et al (2019) Egnet: Edge guidance network for salient object detection. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 8779–8788

  37. Zhuge M, Fan DP, Liu N et al (2022) Salient object detection via integrity learning. IEEE Trans Pattern Anal Mach Intell 45(3):3738–3752

  38. Zong Z, Song G, Liu Y (2022) Detrs with collaborative hybrid assignments training. arXiv preprint arXiv:2211.12860

Acknowledgements

This work was supported by the Soongsil University Research Fund (New Professor Support Research) of 2021.

Funding

Soongsil University, New Professor Support Research of 2021, Minyoung Chung.

Author information

Contributions

Kwangwoon Nam: Methodology, Software, Investigation, Data curation, Writing - original draft. Jeeheon Kim: Conceptualization, Supervision, Writing - review. Heeyeon Kim: Experiments. Minyoung Chung: Conceptualization, Resources, Writing - review & editing, Supervision, Project administration, Funding acquisition.

Corresponding author

Correspondence to Minyoung Chung.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical and informed consent for data used

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nam, K., Kim, J., Kim, H. et al. SA-DETR: Saliency Attention-based DETR for salient object detection. Pattern Anal Applic 28, 5 (2025). https://doi.org/10.1007/s10044-024-01379-5

  • DOI: https://doi.org/10.1007/s10044-024-01379-5