Abstract
Self-distillation methods transfer knowledge within a network itself to enhance its generalization ability. However, because they lack spatially refined knowledge representations, current self-distillation methods can hardly be applied directly to object segmentation tasks. In this paper, we propose a novel self-distillation framework via pyramid knowledge representation and transfer for object segmentation. First, a lightweight inference network is built to perform pixel-wise prediction rapidly. Second, a novel self-distillation method is proposed: to derive refined pixel-wise knowledge representations, an auxiliary self-distillation network with multi-level pyramid representation branches is built and appended to the inference network. A synergy distillation loss, which utilizes top-down and consistency knowledge transfer paths, forces more discriminative knowledge to be distilled into the inference network, thereby improving its performance. Experimental results on five object segmentation datasets demonstrate that the proposed self-distillation method enables our inference network to achieve better segmentation effectiveness and efficiency than nine recent object segmentation networks. Furthermore, the proposed method outperforms typical self-distillation methods. The source code is publicly available at https://github.com/xfflyer/SKDforSegmentation.
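The abstract describes distilling knowledge from auxiliary pyramid branches into a lightweight inference network via pixel-wise soft targets. As a rough illustration of that idea only, the sketch below implements a generic per-pixel, temperature-scaled KL distillation term; the function names and the exact form of the loss are assumptions for illustration, not the paper's actual synergy distillation loss.

```python
import math

def softmax(logits, t=1.0):
    """Temperature-scaled softmax over the class logits of one pixel."""
    exps = [math.exp(x / t) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def pixelwise_kd_loss(student_logits, teacher_logits, t=2.0):
    """Mean KL(teacher || student) over pixels, scaled by t^2.

    Each argument is a list of per-pixel class-logit lists. The "teacher"
    here stands in for an auxiliary pyramid branch, the "student" for the
    lightweight inference network.
    """
    total = 0.0
    for s_l, t_l in zip(student_logits, teacher_logits):
        p = softmax(t_l, t)  # soft targets from the auxiliary branch
        q = softmax(s_l, t)  # inference-network predictions
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return (t * t) * total / len(student_logits)
```

When teacher and student agree exactly, the loss is zero; any disagreement yields a positive penalty, which is the gradient signal that pushes the inference network toward the refined branch predictions.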
Data availability statement
All data generated or analysed during this study are included in this published article. The datasets used in this paper are public.
Funding
This research was supported by the Natural Science Foundation of China (Nos. 61801512, 62071484) and Natural Science Foundation of Jiangsu Province (No. BK20180080).
Author information
Authors and Affiliations
Contributions
YZ contributed to the model designing and implementing, and paper writing. MS contributed to the model designing and paper writing. XW contributed to the data analysis and paper writing. TC contributed to the model designing and data analysis. XZ contributed to model designing and data analysis. LX contributed to the data analysis. ZF contributed to the model implementing and paper writing. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Ethical approval
Not applicable.
Consent for publication
Not applicable.
Consent to participate
Not applicable.
Code availability
The source code will be publicly available at https://github.com/xfflyer/SKDforSegmentation.
Additional information
Communicated by Y. Kong.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zheng, Y., Sun, M., Wang, X. et al. Self-distillation object segmentation via pyramid knowledge representation and transfer. Multimedia Systems 29, 2615–2631 (2023). https://doi.org/10.1007/s00530-023-01121-x