Abstract
This paper reports a CPU-level real-time stereo matching method for surgical images (10 Hz on \(640 \times 480\) images with a single core of an i5-9400). The proposed method is built on the fast Lucas-Kanade (LK) algorithm, which estimates the disparity of the stereo images patch-wise and in a coarse-to-fine manner. We propose a Bayesian framework to evaluate the probability of the optimized patch disparity at different scales. Moreover, we introduce a spatial Gaussian mixed probability distribution to address the pixel-wise probability within the patch. In-vivo and synthetic experiments show that our method can handle ambiguities resulting from textureless surfaces and the photometric inconsistency caused by non-Lambertian reflectance. Our Bayesian method correctly balances the probability of the patch for stereo images at different scales. Experiments indicate that the estimated depth has similar accuracy to, and fewer outliers than, the baseline methods in the surgical scenario, while running in real time. The code and dataset are available at https://github.com/JingweiSong/BDIS.git.
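To make the idea of a spatially weighted, patch-wise probability concrete, the following is a minimal Python sketch of how overlapping patch disparities could be blended into a per-pixel disparity map. It is an illustration under stated assumptions, not the released BDIS implementation: the function name `fuse_patch_disparities`, the tuple layout, the parameter `sigma`, and the scalar `confidence` (a stand-in for a patch-level probability) are all hypothetical, and the weighting rule is a generic Gaussian-weighted average rather than the paper's exact formulation.

```python
import math


def fuse_patch_disparities(patches, width, height, sigma):
    """Blend overlapping patch disparities into a per-pixel disparity map.

    patches: list of (cx, cy, half, disparity, confidence) tuples.
        `confidence` is a hypothetical stand-in for a patch-level probability.
    Returns a height x width list of lists; uncovered pixels are None.
    """
    num = [[0.0] * width for _ in range(height)]
    den = [[0.0] * width for _ in range(height)]
    for cx, cy, half, disparity, confidence in patches:
        for y in range(max(0, cy - half), min(height, cy + half + 1)):
            for x in range(max(0, cx - half), min(width, cx + half + 1)):
                # Spatial Gaussian: pixels near the patch centre count more.
                r2 = (x - cx) ** 2 + (y - cy) ** 2
                w = confidence * math.exp(-r2 / (2.0 * sigma ** 2))
                num[y][x] += w * disparity
                den[y][x] += w
    return [
        [num[y][x] / den[y][x] if den[y][x] > 0.0 else None
         for x in range(width)]
        for y in range(height)
    ]


# Two overlapping 5x5 patches on one row, with different confidences.
patches = [(2, 2, 2, 10.0, 0.9), (4, 2, 2, 14.0, 0.3)]
disp = fuse_patch_disparities(patches, width=8, height=5, sigma=1.5)
```

In this toy setup, a pixel covered by only one patch simply takes that patch's disparity, while a pixel in the overlap is pulled toward the patch that is both closer and more confident. Pixels covered by no patch are left as `None` rather than guessed.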
Notes
- 1. Readers are encouraged to watch the attached video and test the code.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Song, J., Zhu, Q., Lin, J., Ghaffari, M. (2022). Bayesian Dense Inverse Searching Algorithm for Real-Time Stereo Matching in Minimally Invasive Surgery. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13437. Springer, Cham. https://doi.org/10.1007/978-3-031-16449-1_32
DOI: https://doi.org/10.1007/978-3-031-16449-1_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16448-4
Online ISBN: 978-3-031-16449-1
eBook Packages: Computer Science, Computer Science (R0)