Abstract
Semantic segmentation plays an important role in many fields because it classifies every pixel efficiently and accurately, but it relies on a large number of manual annotations. In many cases, such as medical image segmentation, annotations are scarce and expensive. To address this problem, researchers have in recent years paid increasing attention to building efficient deep learning algorithms from coarse label information, with weakly supervised semantic segmentation being one such approach. Currently, most weakly supervised semantic segmentation methods rely on prototype learning to obtain the correlation between pixels; when images of different categories are similar or indistinguishable, the extracted prototypes are not representative enough to guide model training. Inspired by metric learning, we construct pixel-level pairwise samples and propose a new self-supervised contrastive loss based on them, which makes full use of the class activation maps to reduce intra-class differences and increase inter-class differences. We also propose a novel prototype loss based on a superpixel-guided clustering method that mines valuable information in the image by gathering similar feature vectors, so that prototypes are obtained more accurately. Comparative experiments are carried out on PASCAL VOC 2012 and MS COCO 2014: the segmentation mIoU reaches 69.5% on the PASCAL VOC 2012 test set and 40.6% on the MS COCO 2014 test set. The experimental results demonstrate that our method achieves new state-of-the-art performance, which verifies the effectiveness and feasibility of the proposed method.
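To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a supervised-contrastive-style formulation for the pairwise pixel loss, with pseudo-labels taken from the class activation maps, and it replaces the superpixel-guided clustering with a simple per-superpixel average followed by a majority vote. The function names, the `temperature` parameter, and the array layouts are all hypothetical.

```python
import numpy as np


def pairwise_pixel_contrastive_loss(features, labels, temperature=0.1):
    """Contrastive loss over pixel pairs (assumed SupCon-style form).

    features: (N, D) non-zero pixel embeddings.
    labels:   (N,) integer pseudo-labels, e.g. the argmax of the CAMs.
    Pixels sharing a label are pulled together; all others are pushed apart.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = f @ f.T / temperature
    self_mask = np.eye(len(labels), dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)  # a pixel is not its own pair

    # Numerically stable log-softmax over every other pixel.
    row_max = logits.max(axis=1, keepdims=True)
    shifted = logits - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    positives = (labels[:, None] == labels[None, :]) & ~self_mask
    has_positive = positives.any(axis=1)
    summed = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -summed[has_positive] / positives.sum(axis=1)[has_positive]
    return per_anchor.mean()


def superpixel_prototypes(features, labels, superpixels, num_classes):
    """Class prototypes built from superpixel means (simplified assumption).

    Each superpixel contributes its mean feature to the class that most of
    its pixels are pseudo-labelled with; a prototype is the average of those
    contributions. Classes with no superpixel get a zero vector.
    """
    sums = np.zeros((num_classes, features.shape[1]))
    counts = np.zeros(num_classes)
    for sp in np.unique(superpixels):
        member = superpixels == sp
        majority = np.bincount(labels[member], minlength=num_classes).argmax()
        sums[majority] += features[member].mean(axis=0)
        counts[majority] += 1
    return sums / np.maximum(counts, 1)[:, None]
```

In this sketch, pseudo-labels that agree with the feature geometry give a small loss, while labels that pair dissimilar pixels give a large one, which is the intra-class and inter-class behaviour the abstract describes.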
Data Availability
The authors declare that all other data supporting the findings of this study are available within the article.
Funding
This work was funded by the National Natural Science Foundation of China under Grant 51774219 and Key R&D Projects in Hubei Province under Grant 2020BAB098.
Ethics declarations
Ethical Approval
This study is non-human subject research and is exempt from ethical approval by the corresponding author’s university.
Conflict of Interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xie, L., Li, W. & Zhao, Y. Pairwise-Pixel Self-Supervised and Superpixel-Guided Prototype Contrastive Loss for Weakly Supervised Semantic Segmentation. Cogn Comput 16, 936–948 (2024). https://doi.org/10.1007/s12559-024-10277-1
DOI: https://doi.org/10.1007/s12559-024-10277-1