
G-UNeXt: a lightweight MLP-based network for reducing semantic gap in medical image segmentation

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

In recent years, deep learning-based medical image segmentation has become central to disease diagnosis and treatment planning in clinical medicine, with U-Net and its many derivatives leading the field. This paper proposes G-UNeXt, an end-to-end lightweight MLP-based medical image segmentation network that addresses the high computational complexity, large parameter counts, and slow inference of existing segmentation networks. First, we propose a new skip connection, Ghost path, which reduces the semantic gap between features while combining features from different levels and exploits feature redundancy, improving the model's ability to segment fine details. Second, we design a cheap and effective G-S block that uses low-cost linear operations to mine latent ghost features beyond the intrinsic features. Compared with traditional convolution, the G-S block reduces both the parameter count and the computational complexity; it also adaptively recalibrates channel feature responses, enhancing the network's representational capacity and yielding a performance gain at low computational cost. Finally, we build G-UNeXt, a lightweight MLP-based network that combines Ghost path and the G-S block for real-time medical image segmentation. On the benchmark medical image segmentation datasets BUSI and ISIC2018, G-UNeXt reduces parameters by 33% and computational complexity by 23.7% compared with UNeXt, while achieving faster inference and higher segmentation accuracy.
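The abstract names the paper's two building blocks but does not reproduce their layer definitions, so the following PyTorch sketch is only a minimal illustration of the kind of design it describes: a GhostNet-style module (ref. 19) that derives cheap "ghost" features from intrinsic ones via a depthwise convolution, followed by a squeeze-and-excitation gate (ref. 51) for the channel recalibration the abstract mentions. The class name GSBlock and the ratio and reduction parameters are hypothetical, not the authors' actual implementation.

import torch
import torch.nn as nn


class GSBlock(nn.Module):
    """Illustrative sketch only: intrinsic features from a standard
    convolution, ghost features from a cheap depthwise convolution,
    then channel recalibration with a squeeze-and-excitation gate."""

    def __init__(self, in_ch, out_ch, ratio=2, reduction=4):
        super().__init__()
        intrinsic_ch = out_ch // ratio      # produced by the costly convolution
        ghost_ch = out_ch - intrinsic_ch    # produced by the cheap linear operation
        # Primary convolution: generates the intrinsic feature maps.
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, intrinsic_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(intrinsic_ch),
            nn.ReLU(inplace=True),
        )
        # Cheap operation: depthwise 3x3 convolution over the intrinsic maps
        # (one filter per channel). Assumes out_ch is even and ratio=2 so
        # that ghost_ch is divisible by the number of groups.
        self.cheap = nn.Sequential(
            nn.Conv2d(intrinsic_ch, ghost_ch, 3, padding=1,
                      groups=intrinsic_ch, bias=False),
            nn.BatchNorm2d(ghost_ch),
            nn.ReLU(inplace=True),
        )
        # Squeeze-and-excitation gate: adaptively rescales channel responses.
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch // reduction, out_ch, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        intrinsic = self.primary(x)
        ghost = self.cheap(intrinsic)
        out = torch.cat([intrinsic, ghost], dim=1)  # full feature set
        return out * self.se(out)                   # channel recalibration


x = torch.randn(1, 32, 64, 64)
print(GSBlock(32, 64)(x).shape)  # -> torch.Size([1, 64, 64, 64])

With ratio=2, half of the output channels come from the depthwise convolution rather than a full convolution, which is where parameter and FLOP savings of the kind the abstract reports would originate.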

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

  1. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)

  2. He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  3. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241. Springer, Cham (2015)

  4. Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., et al.: Unet++: a nested u-net architecture for medical image segmentation. In: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp. 3–11. Springer, Cham (2018)

  5. Huang, H., Lin, L., Tong, R., et al.: Unet 3+: a full-scale connected unet for medical image segmentation. In: ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1055–1059. IEEE (2020)

  6. Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., et al.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 424–432. Springer, Cham (2016)

  7. Diakogiannis, F.I., Waldner, F., Caccetta, P., et al.: ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data. ISPRS J. Photogramm. Remote Sens. 162, 94–114 (2020)

  8. Li, R., Zheng, S., Duan, C., et al.: Multistage attention ResU-Net for semantic segmentation of fine-resolution remote sensing images. IEEE Geosci. Remote Sens. Lett. 19, 1–5 (2021)

  9. Valanarasu, J.M.J., Sindagi, V.A., Hacihaliloglu, I., et al.: Kiu-net: towards accurate segmentation of biomedical images using over-complete representations. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 363–373. Springer, Cham (2020)

  10. Milletari, F., Navab, N., Ahmadi, S.A.: V-net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE (2016)

  11. Chen, J., Lu, Y., Yu, Q., et al.: Transunet: transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306 (2021)

  12. Valanarasu, J.M.J., Oza, P., Hacihaliloglu, I., et al.: Medical transformer: gated axial-attention for medical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 36–46. Springer, Cham (2021)

  13. Wang, W., Chen, C., Ding, M., et al.: Transbts: multimodal brain tumor segmentation using transformer. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 109–119. Springer, Cham (2021)

  14. Cao, H., Wang, Y., Chen, J., et al.: Swin-unet: Unet-like pure transformer for medical image segmentation. arXiv preprint arXiv:2105.05537 (2021)

  15. Hatamizadeh, A., Tang, Y., Nath, V., et al.: Unetr: transformers for 3d medical image segmentation. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 574–584 (2022)

  16. Howard, A.G., Zhu, M., Chen, B., et al.: Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)

  17. Zhang, X., Zhou, X., Lin, M., et al.: Shufflenet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6848–6856 (2018)

  18. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017)

  19. Han, K., Wang, Y., Tian, Q., et al.: Ghostnet: more features from cheap operations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1580–1589 (2020)

  20. Valanarasu, J.M.J., Patel, V.M.: UNeXt: MLP-based rapid medical image segmentation network. arXiv preprint arXiv:2203.04967 (2022)

  21. Liu, Z., Han, K., Wang, Z., et al.: Automatic liver segmentation from abdominal CT volumes using improved convolution neural networks. Multimed. Syst. 27(1), 111–124 (2021)

  22. Wang, D., Hu, G., Lyu, C.: Frnet: an end-to-end feature refinement neural network for medical image segmentation. Vis. Comput. 37(5), 1101–1112 (2021)

  23. Li, X., Huang, H., Zhao, H., et al.: Learning a convolutional neural network for propagation-based stereo image segmentation. Vis. Comput. 36(1), 39–52 (2020)

  24. Desai, M., Shah, M.: An anatomization on breast cancer detection and diagnosis employing multi-layer perceptron neural network (MLP) and convolutional neural network (CNN). Clin. eHealth 4, 1–11 (2021)

  25. Shorfuzzaman, M.: An explainable stacked ensemble of deep learning models for improved melanoma skin cancer detection. Multimed. Syst. 28(4), 1309–1323 (2022)

  26. Zhu, L., Wang, S., Zhao, Z., et al.: CED-Net: contextual encoder–decoder network for 3D face reconstruction. Multimed. Syst. 28(5), 1713–1722 (2022)

  27. Cheng, Z., Qu, A., He, X.: Contour-aware semantic segmentation network with spatial attention mechanism for medical image. Vis. Comput. 38(3), 749–762 (2022)

  28. Xie, B., Milam, G., Ning, B., et al.: DXM-TransFuse U-net: dual cross-modal transformer fusion u-net for automated nerve identification. Comput. Med. Imaging Graph. 99, 102090 (2022)

  29. Chen, H., Liu, Y., Shi, Z.: FPF-Net: feature propagation and fusion based on attention mechanism for pancreas segmentation. Multimed. Syst. 29(2), 525–538 (2022)

  30. Tian, X., Jin, Y., Tang, X.: Local-global transformer neural network for temporal action segmentation. Multimed. Syst. 29(2), 615–626 (2022)

  31. Bappy, D.M., Hong, A., Choi, E., et al.: Automated three-dimensional vessel reconstruction based on deep segmentation and bi-plane angiographic projections. Comput. Med. Imaging Graph. 92, 101956 (2021)

  32. He, D., Xie, C.: Semantic image segmentation algorithm in a deep learning computer network. Multimed. Syst. 28(6), 2065–2077 (2020)

  33. Feng, P., Tang, Z.: A survey of visual neural networks: current trends, challenges and opportunities. Multimed. Syst. 29, 673–724 (2022)

  34. Jin, Y., Hu, Y., Jiang, Z., et al.: Polyp segmentation with convolutional MLP. Vis. Comput. (2022). https://doi.org/10.1007/s00371-022-02630-y

  35. Tolstikhin, I.O., Houlsby, N., Kolesnikov, A., et al.: Mlp-mixer: an all-mlp architecture for vision. Adv. Neural. Inf. Process. Syst. 34, 24261–24272 (2021)

  36. Touvron, H., Bojanowski, P., Caron, M., et al.: Resmlp: feedforward networks for image classification with data-efficient training. IEEE Trans. Pattern Anal. Mach. Intell. (2022). https://doi.org/10.1109/TPAMI.2022.3206148

  37. Lian, D., Yu, Z., Sun, X., et al.: As-mlp: an axial shifted mlp architecture for vision. arXiv preprint arXiv:2107.08391 (2021)

  38. Yu, T., Li, X., Cai, Y., et al.: S2-mlp: spatial-shift mlp architecture for vision. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 297–306 (2022)

  39. Huang, G., Liu, Z., Van Der Maaten, L., et al.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)

  40. Ibtehaz, N., Rahman, M.S.: MultiResUNet: rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Netw. 121, 74–87 (2020)

  41. Li, C., Tan, Y., Chen, W., et al.: ANU-Net: attention-based nested U-Net to exploit full resolution features for medical image segmentation. Comput. Graph. 90, 11–20 (2020)

  42. Kushnure, D.T., Talbar, S.N.: MS-UNet: a multi-scale UNet with feature recalibration approach for automatic liver and tumor segmentation in CT images. Comput. Med. Imaging Graph. 89, 101885 (2021)

  43. Liu, Z., Lin, Y., Cao, Y., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021)

  44. Gholami, A., Kwon, K., Wu, B., et al.: Squeezenext: hardware-aware neural network design. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1638–1647 (2018)

  45. Ma, N., Zhang, X., Zheng, H.T., et al.: Shufflenet v2: practical guidelines for efficient cnn architecture design. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 116–131 (2018)

  46. Szegedy, C., Vanhoucke, V., Ioffe, S., et al.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)

  47. Sandler, M., Howard, A., Zhu, M., et al.: Mobilenetv2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)

  48. Howard, A., Sandler, M., Chu, G., et al.: Searching for mobilenetv3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314–1324 (2019)

  49. He, X., Zhao, K., Chu, X.: AutoML: a survey of the state-of-the-art. Knowl.-Based Syst. 212, 106622 (2021)

  50. Guyon, I., Sun-Hosoya, L., Boullé, M., et al.: Analysis of the automl challenge series. In: Hutter, F., et al. (eds.) Automated Machine Learning: Methods, Systems, Challenges, pp. 177–219. Springer International Publishing, Cham (2019)

  51. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)

  52. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456. PMLR (2015)

  53. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814 (2010)

  54. Ho, J., Kalchbrenner, N., Weissenborn, D., et al.: Axial attention in multidimensional transformers. arXiv preprint arXiv:1912.12180 (2019)

  55. Codella, N.C.F., Gutman, D., Celebi, M.E., et al.: Skin lesion analysis toward melanoma detection: a challenge at the 2017 international symposium on biomedical imaging (isbi), hosted by the international skin imaging collaboration (isic). In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 168–172. IEEE (2018)

  56. Al-Dhabyani, W., Gomaa, M., Khaled, H., et al.: Dataset of breast ultrasound images. Data Brief 28, 104863 (2020)

  57. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

Acknowledgements

This research is partially supported by the Natural Science Foundation of Hebei Province (F2022201013, F2022201055) and the Startup Foundation for Advanced Talents of Hebei University (No. 521100221003).

Author information

Contributions

Xin Zhang and Xiaotian Cao contributed to the conception of the study; Xiaotian Cao and Lei Wan performed the experiment; Xin Zhang and Jun Wang contributed significantly to analysis and manuscript preparation; Xiaotian Cao and Lei Wan performed the data analyses and wrote the manuscript; Jun Wang, Xiaotian Cao and Lei Wan helped perform the analysis with constructive discussions. All authors reviewed the manuscript.

Corresponding author

Correspondence to Jun Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Communicated by R. Huang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, X., Cao, X., Wang, J. et al. G-UNeXt: a lightweight MLP-based network for reducing semantic gap in medical image segmentation. Multimedia Systems 29, 3431–3446 (2023). https://doi.org/10.1007/s00530-023-01173-z

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00530-023-01173-z
