TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2021 (MICCAI 2021)

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12901))

Abstract

Medical image segmentation, a prerequisite for numerous clinical needs, has advanced significantly with recent progress in convolutional neural networks (CNNs). However, CNNs are generally limited in modeling explicit long-range relations, and existing remedies, which resort to building deep encoders with aggressive downsampling operations, lead to redundantly deep networks and loss of localized details. Hence, the segmentation task awaits a better solution that improves the efficiency of modeling global context while maintaining a strong grasp of low-level details. In this paper, we propose a novel parallel-in-branch architecture, TransFuse, to address this challenge. TransFuse combines Transformers and CNNs in a parallel style, so that both global dependencies and low-level spatial details can be captured efficiently in a much shallower manner. In addition, a novel fusion technique, the BiFusion module, is introduced to efficiently fuse the multi-level features from both branches. Extensive experiments demonstrate that TransFuse achieves new state-of-the-art results on both 2D and 3D medical image sets, including polyp, skin lesion, hip, and prostate segmentation, with a significant decrease in parameters and improvement in inference speed.
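As a rough illustration of the parallel-branch idea described in the abstract, the toy sketch below runs a local (CNN-like) branch and a global (attention-like) branch on the same 1D input and then merges them. Everything here is an assumption made for illustration: the function names, the mean filter standing in for convolution, the projection-free self-attention, and especially the `fuse` step, which is a plain weighted average and not the BiFusion module, whose internals the abstract does not specify.

```python
import math

def local_branch(x, k=3):
    """CNN-like branch: each output sees only a k-wide neighborhood (mean filter)."""
    r = k // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - r): i + r + 1]
        out.append(sum(window) / len(window))
    return out

def global_branch(x):
    """Transformer-like branch: single-head self-attention over all positions.
    Queries, keys and values are the raw scalars (no learned projections)."""
    out = []
    for q in x:
        scores = [q * key for key in x]
        m = max(scores)                      # subtract max for a stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / z)
    return out

def fuse(local, glob, alpha=0.5):
    """Placeholder fusion (hypothetical): weighted average, NOT the paper's BiFusion."""
    return [alpha * l + (1 - alpha) * g for l, g in zip(local, glob)]

def parallel_forward(x):
    # Both branches read the same input independently; their outputs are then fused.
    return fuse(local_branch(x), global_branch(x))

signal = [0.0, 0.0, 1.0, 0.0, 0.0]
fused = parallel_forward(signal)
```

For the impulse input above, the local branch yields zero at the first position because the impulse lies outside its window, while the global branch yields a nonzero value there because it attends to every position. That contrast is the motivation the abstract gives for running the two branches side by side instead of stacking a deeper encoder.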

Y. Zhang and H. Liu—These authors contributed equally to this work.

Notes

  1. Another similar dataset, ISIC2018, was not used because its test set annotations are missing, which makes a fair comparison with existing works hard to achieve.

  2. All data are from different patients and were collected with ethics approval. The dataset consists of 267 patients with Avascular Necrosis, 182 patients with Osteoarthritis, 71 patients with Femur Neck Fracture, 33 patients with Pelvis Fracture, 26 patients with Developmental Dysplasia of the Hip, and 62 patients with other diseases.

  3. https://github.com/MIC-DKFZ/nnUNet.

Acknowledgement

We gratefully thank Weijun Wang, MD, Zhefeng Chen, MD, Chuan He, MD, Zhengyu Xu, and Huaikun Xu for serving as our medical advisors on the hip segmentation project.

Author information

Corresponding author

Correspondence to Huiye Liu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, Y., Liu, H., Hu, Q. (2021). TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12901. Springer, Cham. https://doi.org/10.1007/978-3-030-87193-2_2

  • DOI: https://doi.org/10.1007/978-3-030-87193-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87192-5

  • Online ISBN: 978-3-030-87193-2

  • eBook Packages: Computer Science (R0)