Abstract
Minimally invasive surgeries reduce blood loss and shorten recovery times compared to open surgeries. Equipped with high-definition 3D cameras, robotic surgical systems offer an enhanced visual perspective, empowering surgeons to make informed decisions and reduce damage to healthy tissue around the surgical site. Precise localization of surgical instruments in robotic-assisted surgeries is therefore of utmost importance. The primary aim of this paper is to achieve accurate localization of surgical tools within registered tissue, establishing clear marginal boundaries and preventing tissue tearing so that surgeons can reduce the risk of human error through extremely precise and steady movements. We propose a binary segmentation model built on a customized U-Net architecture to ensure precise inference in surgical instrument segmentation, and we further enhance accuracy by incorporating red channel amplification and Otsu thresholding. The effectiveness of the proposed modified U-Net is validated through experiments on both the publicly available surgical instrument segmentation dataset from the MICCAI 2017 EndoVis Challenge (EndoVis2017) and the Neurosurgical Tools (NST) dataset, which covers both brain and spine tumor removal procedures. The results provide compelling evidence of the superior performance attained by integrating the modified U-Net with red channel amplification and Otsu thresholding, achieving 99.5% accuracy and a Dice score of 0.9846 in binary segmentation, 99.41% accuracy and a Dice score of 0.9751 in part segmentation, and 99.35% accuracy and a Dice score of 0.9726 in type segmentation. We present both quantitative and qualitative inference results for NST dataset samples, obtained with a transfer learning approach in which the model was trained on the EndoVis2017 dataset for twelve training iterations. Inference results demonstrate that combining red channel amplification with Otsu thresholding significantly improves surgical instrument segmentation on the NST dataset.
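The abstract describes red channel amplification followed by Otsu thresholding as the preprocessing step that complements the modified U-Net. The short Python sketch below (using OpenCV and NumPy) illustrates one plausible form of that step; the function name, the gain factor, and the BGR channel handling are illustrative assumptions, not the authors' published implementation.

    import cv2
    import numpy as np

    def amplify_red_and_otsu(bgr_image, gain=1.5):
        # Minimal sketch of the preprocessing described in the abstract;
        # the gain value is an assumed, illustrative parameter.
        b, g, r = cv2.split(bgr_image)  # OpenCV stores frames in BGR order
        # Amplify the red channel and clip back to the valid 8-bit range.
        r_amp = np.clip(r.astype(np.float32) * gain, 0, 255).astype(np.uint8)
        # Otsu's method picks the threshold that minimizes intra-class variance.
        _, mask = cv2.threshold(r_amp, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return mask

    # Hypothetical usage on a single endoscopic frame:
    # frame = cv2.imread("endovis2017_frame.png")
    # instrument_mask = amplify_red_and_otsu(frame)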
Data availability
The EndoVis 2017 dataset is available at https://endovissub2017-roboticinstrumentsegmentation.grandchallenge.org. The NST dataset is available at https://medicis.univ-rennes1.fr/software.
Code availability
The code that supports the research findings of this study is available from the authors upon request.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Author information
Authors and Affiliations
Contributions
Bakiya K. and S. Nickolas contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Bakiya K. and S. Nickolas. The first draft of the manuscript was written by Bakiya K., and S. Nickolas commented on previous versions of the manuscript. Both authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Ethics approval and Consent to participate
This study, “Transfer Learning for Surgical Instrument Segmentation,” used publicly available images from the EndoVis 2017 and Neurosurgical Tools (NST) databases; because these datasets are released for research use, no additional ethics approval or individual consent was required.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bakiya, K., Savarimuthu, N. Transfer learning for surgical instrument segmentation in open surgery videos: a modified u-net approach with channel amplification. SIViP 18, 8061–8076 (2024). https://doi.org/10.1007/s11760-024-03451-3
DOI: https://doi.org/10.1007/s11760-024-03451-3