Abstract
In recent years, deep-learning-based medical image segmentation has become important for disease diagnosis and treatment planning in clinical medicine. U-Net and the networks derived from it have led research in this area. This paper proposes G-UNeXt, an end-to-end lightweight MLP-based medical image segmentation network, to address the high computational complexity, large parameter counts and slow inference of existing medical image segmentation networks. First, we propose a new skip connection method, the Ghost path. It reduces the semantic gap between features while combining features from different levels, and it exploits feature redundancy, which improves the model's ability to segment details. Second, we design a cheap and effective G-S block, which uses low-cost linear operations to mine potential ghost features beyond the intrinsic features. The G-S block reduces the number of parameters and the computational complexity compared with traditional convolution. It also adaptively calibrates channel feature responses, which enhances the representational capability of the network and brings some performance improvement at low computational cost. Finally, we build the lightweight MLP-based network G-UNeXt from the Ghost path and the G-S block for real-time segmentation of medical images. Results on the benchmark medical image segmentation datasets BUSI and ISIC2018 show that G-UNeXt reduces the number of parameters by 33% and the computational complexity by 23.7% compared with UNeXt, while also achieving faster inference and higher segmentation accuracy.
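The abstract does not give the internals of the G-S block, so the following is only a rough illustration of why "ghost" features are cheap. It assumes a GhostNet-style module: a primary convolution produces a reduced set of intrinsic feature maps, and a depthwise convolution derives the remaining ghost maps from them. The function names and the parameters `ratio` and `cheap_k` are hypothetical choices for this sketch, and bias terms are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weight count of a GhostNet-style module (bias ignored).

    Assumption for this sketch: a primary k x k convolution yields
    c_out // ratio intrinsic maps, and each intrinsic map spawns
    (ratio - 1) ghost maps through a depthwise cheap_k x cheap_k
    convolution with one filter per ghost map.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    standard = conv_params(64, 128, 3)
    ghost = ghost_params(64, 128, 3)
    print("standard conv weights:", standard)
    print("ghost module weights: ", ghost)
    print("relative saving:      ", round(1 - ghost / standard, 3))
```

With `ratio=2`, roughly half of the output channels come from the depthwise step, so the weight count falls to about half that of the standard convolution. The savings reported for G-UNeXt as a whole depend on the full architecture and are not reproduced by this calculation.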











Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
This research was partially supported by the Natural Science Foundation of Hebei Province (F2022201013, F2022201055) and the Startup Foundation for Advanced Talents of Hebei University (No. 521100221003).
Author information
Contributions
Xin Zhang and Xiaotian Cao contributed to the conception of the study; Xiaotian Cao and Lei Wan performed the experiments; Xin Zhang and Jun Wang contributed significantly to the analysis and manuscript preparation; Xiaotian Cao and Lei Wan performed the data analyses and wrote the manuscript; Jun Wang, Xiaotian Cao and Lei Wan supported the analysis through constructive discussions. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Communicated by R. Huang.
About this article
Cite this article
Zhang, X., Cao, X., Wang, J. et al. G-UNeXt: a lightweight MLP-based network for reducing semantic gap in medical image segmentation. Multimedia Systems 29, 3431–3446 (2023). https://doi.org/10.1007/s00530-023-01173-z
DOI: https://doi.org/10.1007/s00530-023-01173-z