
Transfer learning for surgical instrument segmentation in open surgery videos: a modified u-net approach with channel amplification

  • Original Paper
  • Published in Signal, Image and Video Processing

Abstract

Minimally invasive surgeries reduce blood loss and shorten recovery times for patients compared to open surgeries. Equipped with high-definition 3D cameras, robotic surgical systems offer an enhanced visual perspective, empowering surgeons to make informed decisions and reduce damage to healthy tissue around the surgical site. Precise localization of surgical instruments is therefore essential in robotic-assisted surgery. The primary aim of this paper is to achieve accurate localization of surgical tools within registered tissue, so that clear marginal boundaries can be established, tissue tearing prevented, and the risk of human error reduced through precise, steady movements. We propose a binary segmentation model built on a customized U-Net architecture for precise surgical instrument segmentation, with accuracy further enhanced by red channel amplification and Otsu thresholding. The effectiveness of the proposed Modified U-Net is validated on the publicly available MICCAI 2017 EndoVis Challenge (EndoVis2017) surgical instrument segmentation dataset and on the Neurosurgical Tools (NST) dataset, which comprises both brain and spine tumor removal procedures. The results provide compelling evidence of the superior performance attained by integrating the modified U-Net with red channel amplification and Otsu thresholding: 99.5% accuracy and a Dice score of 0.9846 in binary segmentation, 99.41% accuracy and a Dice score of 0.9751 in part segmentation, and 99.35% accuracy and a Dice score of 0.9726 in type segmentation. We present both quantitative and qualitative inference results for NST dataset samples, obtained through a transfer learning approach in which the model was trained on the EndoVis2017 dataset for twelve training iterations. The inference results demonstrate that combining red channel amplification with Otsu thresholding significantly improves surgical instrument segmentation on the NST dataset.
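The preprocessing and evaluation steps named in the abstract (red channel amplification, Otsu thresholding, and the Dice score) can be illustrated with a minimal sketch. The snippet below is not the authors' released implementation: the amplification gain, the channel weighting applied before thresholding, and the use of scikit-image's `threshold_otsu` are assumptions made only for illustration.

```python
# Minimal sketch (not the paper's code) of red channel amplification followed by
# Otsu thresholding, plus the Dice overlap metric used for evaluation.
# Assumptions: frames are float RGB arrays in [0, 1]; gain and channel weights
# are hypothetical values chosen only to illustrate the idea.
import numpy as np
from skimage.filters import threshold_otsu


def red_channel_amplified_mask(rgb: np.ndarray, gain: float = 1.5) -> np.ndarray:
    """Return a binary mask from a float RGB frame in [0, 1]."""
    amplified = rgb.copy()
    # Amplify the red channel and clip back into the valid range.
    amplified[..., 0] = np.clip(amplified[..., 0] * gain, 0.0, 1.0)

    # Collapse to one channel, emphasising the amplified red component
    # (the 0.5 / 0.25 / 0.25 weighting is an illustrative choice).
    gray = 0.5 * amplified[..., 0] + 0.25 * amplified[..., 1] + 0.25 * amplified[..., 2]

    # Global Otsu threshold separates instrument-like pixels from background.
    t = threshold_otsu(gray)
    return (gray > t).astype(np.uint8)


def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```

In the paper's pipeline such a mask presumably complements the U-Net prediction rather than replacing it; here it is shown only as a standalone preprocessing step, and `dice_score` reproduces the standard overlap metric quoted in the abstract.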


Data availability

The EndoVis2017 dataset is available at https://endovissub2017-roboticinstrumentsegmentation.grandchallenge.org. The NST dataset is available at https://medicis.univ-rennes1.fr/software.

Code availability

The code that supports the research findings of this study is available upon request from the authors. The NST dataset is available at https://medicis.univ-rennes1.fr/software.


Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information


Contributions

K. Bakiya and S. Nickolas contributed to the study conception and design. Material preparation, data collection, and analysis were performed by K. Bakiya and S. Nickolas. The first draft of the manuscript was written by K. Bakiya, and S. Nickolas commented on previous versions of the manuscript. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to K. Bakiya.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethics approval and consent to participate

This study, "Transfer Learning for Surgical Instrument Segmentation," used publicly available images from the EndoVis2017 and Neurosurgical Tools (NST) databases, which are released for research purposes; therefore, no additional ethics approval or individual consent was required.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Bakiya, K., Savarimuthu, N. Transfer learning for surgical instrument segmentation in open surgery videos: a modified u-net approach with channel amplification. SIViP 18, 8061–8076 (2024). https://doi.org/10.1007/s11760-024-03451-3

