MAS-Net: Multi-Attention Hybrid Network for Superpixel Segmentation
Abstract
1. Introduction
- We propose a strategy that combines a multi-attention hybrid mechanism with superpixel segmentation. Through three-stage multi-attention fusion, it achieves fine-grained feature extraction, efficient reconstruction of deep semantic maps, and semantic feature enhancement during upsampling, addressing the detail loss that arises in the encoding–decoding stage of existing superpixel algorithms.
- Our multi-attention hybrid network for superpixel segmentation attends to both the semantic and the spatial information in the input image, generating superpixels with stronger semantic awareness.
- Experimental results on datasets from several visual tasks demonstrate the strong performance of the proposed method in superpixel segmentation, particularly in generating superpixels with better boundary adherence.
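The parameterless attention used in the encoder (Section 3.1) builds on SimAM [Yang et al., ICML 2021]. As a rough illustration of that underlying mechanism — not the authors' exact PAR ResBlock — the SimAM weighting can be sketched in NumPy: each activation is gated by a sigmoid of its inverse "energy", which grows when the activation stands out from its channel's mean, so salient detail is emphasized without adding any learned parameters.

```python
import numpy as np

def simam(x, lam=1e-4):
    """SimAM-style parameter-free attention over a (C, H, W) feature map.

    This is an illustrative sketch of the SimAM formulation; the PAR
    ResBlock in the paper wraps a mechanism like this inside a residual
    block, whose exact layout is not reproduced here.
    """
    c, h, w = x.shape
    n = h * w - 1                                 # spatial positions minus the target one
    mu = x.mean(axis=(1, 2), keepdims=True)       # per-channel spatial mean
    d = (x - mu) ** 2                             # squared deviation of each activation
    v = d.sum(axis=(1, 2), keepdims=True) / n     # per-channel variance estimate
    e_inv = d / (4.0 * (v + lam)) + 0.5           # inverse energy: high for standout activations
    return x * (1.0 / (1.0 + np.exp(-e_inv)))     # sigmoid gate, no learnable weights
```

Because the gate is computed directly from channel statistics, the module adds attention at zero parameter cost, which is consistent with the "parameterless" label in Section 3.1.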
2. Related Work
2.1. Traditional Superpixel Methods
2.2. Deep Superpixel Methods
3. Methodology
3.1. Encoder with Parameterless Attention ResBlock
3.2. Feature Map Reconstruction Based on Global Semantic Fusion Self-Attention
3.3. Decoder with CAM/SAM
4. Experiments and Results
4.1. Datasets
4.2. Evaluation Metrics
4.3. Implementation Details
4.4. Comparison with the State-of-the-Art Methods
4.5. Efficiency Analysis
4.6. Ablation Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Ren, X.; Malik, J. Learning a classification model for segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Nice, France, 13–16 October 2003.
- Kim, S.; Park, D.; Shim, B. Semantic-aware superpixel for weakly supervised semantic segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Washington, DC, USA, 7–14 February 2023.
- Lei, T.; Jia, X.; Zhang, Y.; Liu, S.; Meng, H.; Nandi, A.K. Superpixel-based fast fuzzy C-means clustering for color image segmentation. IEEE Trans. Fuzzy Syst. 2019, 27, 1753–1766.
- Zhang, S.; Ma, Z.; Zhang, G.; Lei, T.; Zhang, R.; Cui, Y. Semantic image segmentation with deep convolutional neural networks and quick shift. Symmetry 2020, 12, 427.
- Liu, M.; Chen, S.; Lu, F.; Xing, M.; Wei, J. Realizing target detection in SAR images based on multiscale superpixel fusion. Sensors 2021, 21, 1643.
- Huang, C.; Zong, Y.; Ding, Y.; Luo, X.; Clawson, K.; Peng, Y. A new deep learning approach for the retinal hard exudates detection based on superpixel multi-feature extraction and patch-based CNN. Neurocomputing 2021, 452, 521–533.
- Mu, C.; Dong, Z.; Liu, Y. A two-branch convolutional neural network based on multi-spectral entropy rate superpixel segmentation for hyperspectral image classification. Remote Sens. 2022, 14, 1569.
- Wei, W.; Chen, W.; Xu, M. Co-saliency detection of RGBD image based on superpixel and hypergraph. Symmetry 2022, 14, 2393.
- Rout, R.; Parida, P.; Alotaibi, Y.; Alghamdi, S.; Khalaf, O.I. Skin lesion extraction using multiscale morphological local variance reconstruction based watershed transform and fast fuzzy C-means clustering. Symmetry 2021, 13, 2085.
- Liu, M.-Y.; Tuzel, O.; Ramalingam, S.; Chellappa, R. Entropy rate superpixel segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 20–25 June 2011.
- Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. 2012, 34, 2274–2282.
- Machairas, V.; Faessel, M.; Cardenas-Pena, D.; Chabardes, T.; Walter, T.; Decenciere, E. Waterpixels. IEEE Trans. Image Process. 2015, 24, 3707–3716.
- Jampani, V.; Sun, D.; Liu, M.-Y.; Yang, M.-H.; Kautz, J. Superpixel sampling networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Yang, F.; Sun, Q.; Jin, H.; Zhou, Z. Superpixel segmentation with fully convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020.
- Wang, Y.; Wei, Y.; Qian, X.; Zhu, L.; Yang, Y. AINet: Association implantation for superpixel segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021.
- Xu, S.; Wei, S.; Ruan, T.; Zhao, Y. ESNet: An efficient framework for superpixel segmentation. IEEE Trans. Circ. Syst. Vid. 2023, 34, 5389–5399.
- Arbeláez, P.; Maire, M.; Fowlkes, C.; Malik, J. Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. 2011, 33, 898–916.
- Silberman, N.; Hoiem, D.; Kohli, P.; Fergus, R. Indoor segmentation and support inference from RGBD images. In Proceedings of the European Conference on Computer Vision (ECCV), Firenze, Italy, 7–13 October 2012.
- Felzenszwalb, P.F.; Huttenlocher, D.P. Efficient graph-based image segmentation. Int. J. Comput. Vision 2004, 59, 167–181.
- Li, Z.; Chen, J. Superpixel segmentation using linear spectral clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015.
- Liu, Y.-J.; Yu, C.-C.; Yu, M.-J.; He, Y. Manifold SLIC: A fast method to compute content-sensitive superpixels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Yao, J.; Boben, M.; Fidler, S.; Urtasun, R. Real-time coarse-to-fine topologically preserving segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015.
- Yuan, Y.; Zhu, Z.; Yu, H.; Zhang, W. Watershed-based superpixels with global and local boundary marching. IEEE Trans. Image Process. 2020, 29, 7375–7388.
- Tu, W.-C.; Liu, M.-Y.; Jampani, V.; Sun, D.; Chien, S.-Y.; Yang, M.-H.; Kautz, J. Learning superpixels with segmentation-aware affinity loss. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018.
- Zhao, T.; Peng, B.; Sun, Y.; Yang, D.; Zhang, Z.; Wu, X. Rethinking superpixel segmentation from biologically inspired mechanisms. Appl. Soft Comput. 2024, 156, 111467.
- Xu, S.; Wei, S.; Ruan, T.; Liao, L. Learning invariant inter-pixel correlations for superpixel generation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Vancouver, BC, Canada, 20–27 February 2024.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Yang, L.; Zhang, R.-Y.; Li, L.; Xie, X. SimAM: A simple, parameter-free attention module for convolutional neural networks. In Proceedings of the International Conference on Machine Learning (ICML), Virtual, 18–24 July 2021.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017.
- Katharopoulos, A.; Vyas, A.; Pappas, N.; Fleuret, F. Transformers are RNNs: Fast autoregressive transformers with linear attention. In Proceedings of the International Conference on Machine Learning (ICML), Virtual, 12–18 July 2020.
- Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Gould, S.; Fulton, R.; Koller, D. Decomposing a scene into geometric and semantically consistent regions. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Kyoto, Japan, 29 September–2 October 2009.
- Abu Alhaija, H.; Mustikovela, S.K.; Mescheder, L.; Geiger, A.; Rother, C. Augmented reality meets computer vision: Efficient data generation for urban driving scenes. Int. J. Comput. Vision 2018, 126, 961–972.
- Ji, S.; Wei, S.; Lu, M. Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set. IEEE Trans. Geosci. Remote Sens. 2019, 57, 574–586.
- Staal, J.; Abramoff, M.D.; Niemeijer, M.; Viergever, M.A.; Van Ginneken, B. Ridge-based vessel segmentation in color images of the retina. IEEE Trans. Med. Imaging 2004, 23, 501–509.
- Stutz, D.; Hermans, A.; Leibe, B. Superpixels: An evaluation of the state-of-the-art. Comput. Vis. Image Und. 2018, 166, 1–27.
Method | Pub., Year | Implementation | Category | Citations (July 2024) |
---|---|---|---|---|
NC [1] | ICCV, 2003 | MATLAB/C | Graph | 2367 |
FH [19] | IJCV, 2004 | C/C++ | Graph | 8455 |
ERS [10] | CVPR, 2011 | C/C++ | Graph | 1193 |
SLIC [11] | TPAMI, 2012 | C/C++ | Clustering | 10,408 |
LSC [20] | CVPR, 2015 | C/C++ | Clustering | 568 |
MSLIC [21] | CVPR, 2016 | MATLAB/C | Clustering | 157 |
Waterpixels [12] | TIP, 2015 | Python | Gradient | 124 |
ETPS [22] | CVPR, 2015 | MATLAB/C | Gradient | 162 |
WSGL [23] | TIP, 2020 | C/C++ | Gradient | 19 |
SEAL [24] | CVPR, 2018 | Python | Deep learning | 135 |
SSN [13] | ECCV, 2018 | Python | Deep learning | 261 |
SCN [14] | CVPR, 2020 | Python | Deep learning | 228 |
AINet [15] | ICCV, 2021 | Python | Deep learning | 36 |
ESNet [16] | TCSVT, 2023 | Python | Deep learning | 2 |
BINet [25] | ASC, 2024 | Python | Deep learning | 1 |
CDS [26] | AAAI, 2024 | Python | Deep learning | 0 |
Method | Pub., Year | ASA (%) ↑ | BP (%) ↑ | BR (%) ↑ | UE (%) ↓ | CO (%) ↑ |
---|---|---|---|---|---|---|
SLIC [11] | TPAMI, 2012 | 95.60 | 11.17 | 82.91 | 4.40 | 27.65 |
LSC [20] | CVPR, 2015 | 96.67 | 8.99 | 90.77 | 3.33 | 20.99 |
ETPS [22] | CVPR, 2015 | 96.58 | 8.17 | 95.38 | 3.42 | 13.09 |
SEAL [24] | CVPR, 2018 | 97.06 | 8.68 | 90.08 | 2.94 | 23.14 |
SSN [13] | ECCV, 2018 | 96.95 | 12.03 | 85.97 | 3.05 | 38.09 |
SCN [14] | CVPR, 2020 | 96.92 | 12.54 | 84.83 | 3.08 | 39.05 |
AINet [15] | ICCV, 2021 | 97.07 | 12.74 | 86.90 | 2.93 | 34.71 |
ESNet [16] | TCSVT, 2023 | 97.21 | 13.08 | 87.90 | 2.79 | 37.28 |
CDS [26] | AAAI, 2024 | 97.25 | 12.80 | 88.17 | 2.74 | 35.59 |
MAS-Net (ours) | - | 97.29 | 13.14 | 88.96 | 2.71 | 34.55 |
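For readers unfamiliar with the metrics above, achievable segmentation accuracy (ASA) measures the fraction of pixels that would be correctly labeled if every superpixel were assigned to the ground-truth segment it overlaps most; in this table, undersegmentation error (UE) appears to equal 100 − ASA up to rounding. A minimal sketch, assuming label maps of integer IDs (the function name `asa` and the toy labels are illustrative, not from the paper):

```python
import numpy as np

def asa(sp_labels, gt_labels):
    """Achievable Segmentation Accuracy for integer label maps.

    Fraction of pixels preserved when each superpixel is assigned
    to its best-overlapping ground-truth segment.
    """
    sp = np.asarray(sp_labels).ravel()
    gt = np.asarray(gt_labels).ravel()
    # joint histogram: overlap count of every (superpixel, GT segment) pair
    hist = np.zeros((sp.max() + 1, gt.max() + 1), dtype=np.int64)
    np.add.at(hist, (sp, gt), 1)
    # each superpixel contributes its largest single-segment overlap
    return hist.max(axis=1).sum() / sp.size

def ue(sp_labels, gt_labels):
    """Undersegmentation error under the UE = 1 - ASA convention
    that the table's numbers are consistent with."""
    return 1.0 - asa(sp_labels, gt_labels)
```

For example, on a 4 × 4 image split into two ground-truth halves, a superpixel map that leaks a single pixel across the boundary scores ASA = 15/16, since that one pixel cannot be recovered by any superpixel-to-segment assignment.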
Method | Params (M) | Iterative | Time (ms) | ASA (%) | Device |
---|---|---|---|---|---|
SLIC [11] | - | Yes | 105 | 96.23 | CPU |
LSC [20] | - | Yes | 96 | 96.52 | CPU |
ETPS [22] | - | Yes | 299 | 96.50 | CPU |
SEAL [24] | 0.89 † | Yes | 1691 | 97.06 | CPU and GPU |
SSN [13] | 0.21 † | Yes | 2316 | 97.10 | GPU |
SCN [14] | 2.27 | No | 19 | 96.92 | GPU |
AINet [15] | 5.59 | No | 41 | 96.95 | GPU |
ESNet [16] | 0.44 | No | 7 | 97.21 | GPU |
CDS [26] | 0.40 | No | 7 | 97.25 | GPU |
MAS-Net (ours) | 6.58 | No | 46 | 97.29 | GPU |
Module | ASA (%) ↑ | BP (%) ↑ | BR (%) ↑ | UE (%) ↓ | CO (%) ↑ |
---|---|---|---|---|---|
Ori ResBlock | 97.25 | 11.98 | 88.18 | 2.75 | 36.88 |
PAR | 97.29 | 13.14 | 88.96 | 2.71 | 34.55 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yan, G.; Wei, C.; Jia, X.; Li, Y.; Chang, W. MAS-Net: Multi-Attention Hybrid Network for Superpixel Segmentation. Symmetry 2024, 16, 1000. https://doi.org/10.3390/sym16081000