
Transfer learning for surgical instrument segmentation in open surgery videos: a modified u-net approach with channel amplification

  • Original Paper
  • Published in Signal, Image and Video Processing

Abstract

Minimally invasive surgeries reduce blood loss and shorten recovery times for patients compared to open surgeries. Equipped with high-definition 3D cameras, robotic surgical systems offer an enhanced visual perspective, empowering surgeons to make informed decisions and reduce damage to healthy tissue around the surgical site. Precise localization of surgical instruments is therefore essential in robotic-assisted surgery. The primary aim of this paper is to achieve accurate localization of surgical tools within registered tissue, so that clear marginal boundaries can be established, tissue tearing prevented, and the risk of human error reduced through precise, steady movements. We propose a binary segmentation model built on a customized U-Net architecture for precise surgical instrument segmentation, with accuracy further enhanced by red channel amplification and Otsu thresholding. The effectiveness of the proposed Modified U-Net is validated on the publicly available MICCAI 2017 EndoVis Challenge (EndoVis2017) surgical instrument segmentation dataset and on the Neurosurgical Tools (NST) dataset, which comprises both brain and spine tumor removal procedures. The results provide compelling evidence of the superior performance attained by integrating the modified U-Net with red channel amplification and Otsu thresholding: 99.5% accuracy and a Dice score of 0.9846 in binary segmentation, 99.41% accuracy and a Dice score of 0.9751 in part segmentation, and 99.35% accuracy and a Dice score of 0.9726 in type segmentation. We present both quantitative and qualitative inference results for NST dataset samples, obtained through a transfer learning approach in which the model was trained on the EndoVis2017 dataset for twelve training iterations. The inference results demonstrate that combining red channel amplification with Otsu thresholding significantly improves surgical instrument segmentation on the NST dataset.
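The preprocessing and evaluation steps named in the abstract (red channel amplification, Otsu thresholding, and the Dice score) can be illustrated with a minimal sketch. The snippet below is not the authors' released implementation: the amplification gain, the channel weighting applied before thresholding, and the use of scikit-image's `threshold_otsu` are assumptions made only for illustration.

```python
# Minimal sketch (not the paper's code) of red channel amplification followed by
# Otsu thresholding, plus the Dice overlap metric used for evaluation.
# Assumptions: frames are float RGB arrays in [0, 1]; gain and channel weights
# are hypothetical values chosen only to illustrate the idea.
import numpy as np
from skimage.filters import threshold_otsu


def red_channel_amplified_mask(rgb: np.ndarray, gain: float = 1.5) -> np.ndarray:
    """Return a binary mask from a float RGB frame in [0, 1]."""
    amplified = rgb.copy()
    # Amplify the red channel and clip back into the valid range.
    amplified[..., 0] = np.clip(amplified[..., 0] * gain, 0.0, 1.0)

    # Collapse to one channel, emphasising the amplified red component
    # (the 0.5 / 0.25 / 0.25 weighting is an illustrative choice).
    gray = 0.5 * amplified[..., 0] + 0.25 * amplified[..., 1] + 0.25 * amplified[..., 2]

    # Global Otsu threshold separates instrument-like pixels from background.
    t = threshold_otsu(gray)
    return (gray > t).astype(np.uint8)


def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```

In the paper's pipeline such a mask presumably complements the U-Net prediction rather than replacing it; here it is shown only as a standalone preprocessing step, and `dice_score` reproduces the standard overlap metric quoted in the abstract.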


Data availability

The EndoVis2017 dataset is available at https://endovissub2017-roboticinstrumentsegmentation.grandchallenge.org. The NST dataset is available at https://medicis.univ-rennes1.fr/software.

Code availability

The code that supports the research findings of this study is available upon request from the authors. The NST dataset is available at https://medicis.univ-rennes1.fr/software.


Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information


Contributions

K. Bakiya and S. Nickolas contributed to the study conception and design. Material preparation, data collection, and analysis were performed by K. Bakiya and S. Nickolas. The first draft of the manuscript was written by K. Bakiya, and S. Nickolas commented on previous versions of the manuscript. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to K. Bakiya.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethics approval and consent to participate

This study, "Transfer Learning for Surgical Instrument Segmentation," used publicly available images from the EndoVis2017 and Neurosurgical Tools (NST) databases, which are released for research purposes; therefore, no additional ethics approval or individual consent was required.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Bakiya, K., Savarimuthu, N. Transfer learning for surgical instrument segmentation in open surgery videos: a modified u-net approach with channel amplification. SIViP 18, 8061–8076 (2024). https://doi.org/10.1007/s11760-024-03451-3

