Self-Supervised Video Desmoking for Laparoscopic Surgery

Wu, Renlong; Zhang, Zhilu; Zhang, Shuohao; Gou, Longfei; Chen, Haobin; Zhang, Lei; Chen, Hao; Zuo, Wangmeng

doi:10.1007/978-3-031-73220-1_18

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15130)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Because real paired data are difficult to collect, most existing desmoking methods train their models on synthesized smoke and generalize poorly to real surgical scenarios. Although a few works have explored single-image real-world desmoking with unpaired learning, they still struggle with dense smoke. In this work, we address both issues by introducing self-supervised surgery video desmoking (SelfSVD). On the one hand, we observe that the frame captured before the activation of high-energy devices is generally clear (we call it the pre-smoke frame, or PS frame); it can therefore serve as supervision for the other smoky frames, making real-world self-supervised video desmoking practically feasible. On the other hand, to enhance desmoking performance, we further feed the valuable information from the PS frame into the model, and present a masking strategy and a regularization term to avoid trivial solutions. In addition, we construct a real surgery video dataset for desmoking that covers a variety of smoky scenes. Extensive experiments on the dataset show that SelfSVD removes smoke more effectively and efficiently, while recovering more photo-realistic details, than state-of-the-art methods. The dataset, code, and pre-trained models are available at https://github.com/ZcsrenlongZ/SelfSVD.
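
The self-supervision idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the SelfSVD implementation. The function names, the choice of the frame immediately before device activation as the PS frame, the patch-wise random masking, and the L1 loss are all assumptions made for illustration. The abstract does not specify the actual masking strategy or the regularization term, so the latter is not sketched here. The masking step reflects one plausible reading of "trivial solutions": if the PS frame is both an input and the target, a model could simply copy it.

```python
import numpy as np

def make_training_pairs(frames, activation_idx):
    """Pair each frame from device activation onward with the PS frame.

    frames: array of shape (T, H, W, C).
    activation_idx: hypothetical index of the first frame after the
    high-energy device is switched on (assumed known for this sketch).
    """
    if not 0 < activation_idx < len(frames):
        raise ValueError("need at least one PS frame and one smoky frame")
    ps_frame = frames[activation_idx - 1]  # assumed to be smoke-free
    return [(smoky, ps_frame) for smoky in frames[activation_idx:]]

def mask_reference(ps_frame, patch=8, drop_ratio=0.5, rng=None):
    """Zero out random square patches of the PS frame before it is fed
    to a model as an extra input (illustrative masking only)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = ps_frame.shape[:2]
    grid_h = -(-h // patch)  # ceiling division
    grid_w = -(-w // patch)
    keep = (rng.random((grid_h, grid_w)) >= drop_ratio).astype(ps_frame.dtype)
    # Expand the patch grid to pixel resolution, then crop to the frame size.
    mask = np.repeat(np.repeat(keep, patch, axis=0), patch, axis=1)[:h, :w]
    return ps_frame * mask[..., None], mask

def l1_loss(prediction, target):
    """Mean absolute error between a desmoked output and the PS target."""
    return float(np.mean(np.abs(prediction - target)))
```

In a training loop built on this sketch, a model would receive a smoky frame together with the masked PS frame, and its output would be compared against the unmasked PS frame with `l1_loss`.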

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant No. 62371164 and No. U22B2035.

Author information

Authors and Affiliations

Harbin Institute of Technology, Harbin, China
Renlong Wu, Zhilu Zhang, Shuohao Zhang & Wangmeng Zuo
Southern Medical University, Guangzhou, China
Longfei Gou, Haobin Chen & Hao Chen
Hong Kong Polytechnic University, Hung Hom, China
Lei Zhang

Corresponding author

Correspondence to Hao Chen.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 14720 KB)

About this paper

Cite this paper

Wu, R. et al. (2025). Self-Supervised Video Desmoking for Laparoscopic Surgery. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15130. Springer, Cham. https://doi.org/10.1007/978-3-031-73220-1_18

DOI: https://doi.org/10.1007/978-3-031-73220-1_18
Published: 03 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73219-5
Online ISBN: 978-3-031-73220-1
eBook Packages: Computer Science, Computer Science (R0)
