Abstract
Deep learning models, e.g. supervised Encoder-Decoder style networks, exhibit promising performance in medical image segmentation, but come with a high labelling cost. We propose TriSegNet, a semi-supervised semantic segmentation framework. It uses triple-view feature learning on a limited amount of labelled data and a large amount of unlabelled data. The triple-view architecture consists of three pixel-level classifiers and a low-level shared-weight learning module. The model is first initialized with labelled data. Label processing, including data perturbation, confidence label voting and unconfident label detection for annotation, enables the model to train on labelled and unlabelled data simultaneously. The confidence of each model is improved through the other two views of the feature learning. This process is repeated until each model reaches the same confidence level as its counterparts. This strategy enables triple-view learning of generic medical image datasets. Bespoke overlap-based and boundary-based loss functions are tailored to the different stages of the training. The segmentation results are evaluated on four publicly available benchmark datasets including Ultrasound, CT, MRI, and Histology images. Repeated experiments demonstrate the effectiveness of the proposed network compared against other semi-supervised algorithms, across a large set of evaluation measures.
Appendices
A Algorithm of TriSegNet
The training of TriSegNet consists of four stages, which are briefly illustrated in Algorithm 1. The code of TriSegNet will be made publicly available (footnote 2).
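Algorithm 1 is not reproduced in this text. As a rough illustration of the confidence label voting and unconfident label detection steps named in the abstract, the minimal sketch below combines the per-pixel class predictions of three views. The function name `vote_labels`, the `UNCONFIDENT` marker and the two-out-of-three agreement rule are assumptions made for illustration only; they are not taken from the authors' implementation.

```python
from collections import Counter

UNCONFIDENT = -1  # hypothetical marker for pixels without a majority label


def vote_labels(view_a, view_b, view_c):
    """Majority-vote three 2D label maps (lists of lists of class ids).

    Returns (voted, unconfident_count). A pixel receives the label that at
    least two views agree on; otherwise it is marked UNCONFIDENT.
    The agreement rule is an assumption, not the paper's exact criterion.
    """
    voted = []
    unconfident_count = 0
    for row_a, row_b, row_c in zip(view_a, view_b, view_c):
        voted_row = []
        for labels in zip(row_a, row_b, row_c):
            # Most frequent label among the three views and its vote count.
            label, votes = Counter(labels).most_common(1)[0]
            if votes >= 2:
                voted_row.append(label)
            else:
                voted_row.append(UNCONFIDENT)
                unconfident_count += 1
        voted.append(voted_row)
    return voted, unconfident_count
```

In this sketch, pixels marked `UNCONFIDENT` would be the candidates for further annotation, while the voted labels could serve as pseudo-labels for the unlabelled data.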
B The CNN Architecture of Multi-view Learning
To encourage diversity among the three views of feature learning for dense prediction, TriSegNet varies not only the data feed and the parameter initialization but also the architecture: each view uses a different advanced CNN. The three CNNs rely on three different techniques, i.e. skip connections, efficient passing of feature information through residual learning, and multi-scale feature learning. The parameters of the three classifiers are summarized in Table 4, and the source code has been released online (footnote 3).
C Evaluation Methods, Qualitative, and Quantitative Results
Table 2 reports a direct comparison of TriSegNet with other algorithms using several strict and novel quantitative evaluation metrics. These metrics measure the extent to which the boundaries of the machine segmentation (MS) match those of the ground truth (GT): the Directed Boundary Dice relative to GT (DBD\(_G\)), the Directed Boundary Dice relative to MS (DBD\(_M\)) and the Symmetric Boundary Dice (SBD).
The measures are computed in a von Neumann neighbourhood \(N_x\) of each pixel x on the boundary \(\partial G\) of the ground truth, where Dice is \(Dice(N_x) = \frac{2 | G(N_x) \cap M(N_y)|}{| G(N_x)| + | M(N_y)|} \). The symmetric average is brought down by DBD\(_G\) when the latter reflects isolated areas of false negative labels. These measures penalise mislabelled areas in the machine segmentation.
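The displayed definitions of the three measures are not reproduced in this text, so the sketch below rests on stated assumptions rather than on the paper's exact formulas. It assumes that DBD\(_G\) averages the local Dice over the boundary pixels of GT, that DBD\(_M\) does the same over the boundary pixels of MS, and that SBD pools both sums over both boundaries. It also assumes a neighbourhood radius of 1, evaluates both masks in the same neighbourhood, and treats a foreground pixel as a boundary pixel when one of its four neighbours is background or outside the image. All function names are hypothetical.

```python
def neighbourhood(p, shape, radius=1):
    """Von Neumann neighbourhood of pixel p, clipped to the image."""
    r0, c0 = p
    rows, cols = shape
    return [
        (r, c)
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1))
        if abs(r - r0) + abs(c - c0) <= radius
    ]


def boundary(mask):
    """Foreground pixels with a background or out-of-image 4-neighbour."""
    rows, cols = len(mask), len(mask[0])
    result = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols) or not mask[rr][cc]:
                    result.append((r, c))
                    break
    return result


def local_dice(gt, ms, p, radius=1):
    """Dice overlap of gt and ms restricted to the neighbourhood of p."""
    cells = neighbourhood(p, (len(gt), len(gt[0])), radius)
    g = sum(1 for r, c in cells if gt[r][c])
    m = sum(1 for r, c in cells if ms[r][c])
    both = sum(1 for r, c in cells if gt[r][c] and ms[r][c])
    # g + m > 0 because p is a foreground pixel of at least one mask.
    return 2.0 * both / (g + m)


def boundary_dice(gt, ms, radius=1):
    """Return (DBD_G, DBD_M, SBD) under the assumed averaging definitions."""
    dice_g = [local_dice(gt, ms, p, radius) for p in boundary(gt)]
    dice_m = [local_dice(gt, ms, p, radius) for p in boundary(ms)]
    if not dice_g or not dice_m:
        raise ValueError("both masks need at least one foreground pixel")
    dbd_g = sum(dice_g) / len(dice_g)
    dbd_m = sum(dice_m) / len(dice_m)
    sbd = (sum(dice_g) + sum(dice_m)) / (len(dice_g) + len(dice_m))
    return dbd_g, dbd_m, sbd
```

For example, with `gt = [[1, 1, 1]]` and `ms = [[1, 0, 0]]` the sketch gives DBD\(_G\) = 7/18, DBD\(_M\) = 2/3 and SBD = 11/24: the false negative pixels lower DBD\(_G\), which in turn pulls the symmetric value down.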
Example qualitative results on the MRI Cardiac test set are shown in Fig. 4. Eight images are selected from the MRI test set; the first row shows the raw images. The remaining rows show the MS produced by each semi-supervised algorithm against the GT, where yellow, green, red, and black represent true positive, false negative, false positive and true negative pixels, respectively. The proposed method yields fewer false positive and false negative pixels, as well as a significantly lower Hausdorff distance (HD), because TriSegNet benefits from the different views of its high-level pixel-level classifiers and from the proposed mixed boundary- and overlap-based loss function.
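The colour coding used in Fig. 4 can be reproduced with a simple per-pixel comparison of the two binary masks. The sketch below is a minimal illustration: the colour assignment follows the description above, while the function name `colour_overlay` and the list-of-lists mask representation are assumptions.

```python
# Keys are (ground truth is foreground, machine segmentation is foreground).
COLOURS = {
    (True, True): "yellow",   # true positive
    (True, False): "green",   # false negative: GT foreground missed by MS
    (False, True): "red",     # false positive: MS foreground absent from GT
    (False, False): "black",  # true negative
}


def colour_overlay(gt, ms):
    """Map each pixel of two equally sized binary masks to a colour name."""
    return [
        [COLOURS[(bool(g), bool(m))] for g, m in zip(gt_row, ms_row)]
        for gt_row, ms_row in zip(gt, ms)
    ]
```

Calling `colour_overlay([[1, 1], [0, 0]], [[1, 0], [1, 0]])` returns `[["yellow", "green"], ["red", "black"]]`.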
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Z., Voiculescu, I. (2022). Triple-View Feature Learning for Medical Image Segmentation. In: Xu, X., Li, X., Mahapatra, D., Cheng, L., Petitjean, C., Fu, H. (eds) Resource-Efficient Medical Image Analysis. REMIA 2022. Lecture Notes in Computer Science, vol 13543. Springer, Cham. https://doi.org/10.1007/978-3-031-16876-5_5
DOI: https://doi.org/10.1007/978-3-031-16876-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16875-8
Online ISBN: 978-3-031-16876-5
eBook Packages: Computer Science, Computer Science (R0)