Self-Supervised Video Desmoking for Laparoscopic Surgery

  • Conference paper
Computer Vision – ECCV 2024 (ECCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15130)

Abstract

Because real paired data are difficult to collect, most existing desmoking methods train models on synthesized smoke and generalize poorly to real surgical scenarios. Although a few works have explored single-image real-world desmoking in an unpaired learning manner, they still struggle with dense smoke. In this work, we address both issues by introducing self-supervised surgery video desmoking (SelfSVD). On the one hand, we observe that the frame captured before the activation of high-energy devices is generally clear (named the pre-smoke frame, or PS frame); it can therefore serve as supervision for other smoky frames, making real-world self-supervised video desmoking practically feasible. On the other hand, to enhance desmoking performance, we further feed the valuable information from the PS frame into the models, and present a masking strategy and a regularization term to avoid trivial solutions. In addition, we construct a real surgery video dataset for desmoking that covers a variety of smoky scenes. Extensive experiments on the dataset show that SelfSVD removes smoke more effectively and efficiently while recovering more photo-realistic details than state-of-the-art methods. The dataset, code, and pre-trained models are available at https://github.com/ZcsrenlongZ/SelfSVD.
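
To make the role of the PS frame and the masking strategy concrete, the following is a minimal sketch and not the authors' implementation. It assumes grayscale frames stored as nested lists, an L1 supervision loss, and random block masking. The names `mask_reference`, `l1_loss`, `self_supervised_loss`, and `copy_model` are hypothetical. The abstract does not specify the form of the regularization term, so it is left out.

```python
import random


def mask_reference(ps_frame, patch=2, ratio=0.5, seed=0):
    """Return a copy of the PS frame with random patch x patch blocks zeroed."""
    rng = random.Random(seed)
    height, width = len(ps_frame), len(ps_frame[0])
    masked = [row[:] for row in ps_frame]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            if rng.random() < ratio:  # drop this block from the reference input
                for y in range(top, min(top + patch, height)):
                    for x in range(left, min(left + patch, width)):
                        masked[y][x] = 0.0
    return masked


def l1_loss(pred, target):
    """Mean absolute error between two equally sized grayscale frames."""
    total = sum(abs(p - t)
                for p_row, t_row in zip(pred, target)
                for p, t in zip(p_row, t_row))
    return total / (len(pred) * len(pred[0]))


def self_supervised_loss(model, smoky_frame, ps_frame, ratio=0.5, seed=0):
    """Supervise with the clean PS frame, but feed the model only a masked copy."""
    reference = mask_reference(ps_frame, ratio=ratio, seed=seed)
    prediction = model(smoky_frame, reference)
    return l1_loss(prediction, ps_frame)


def copy_model(smoky_frame, reference):
    """Hypothetical degenerate model that ignores the smoky frame entirely."""
    return reference


# Toy 2x2 frames (assumed values, for illustration only).
ps = [[0.2, 0.4], [0.6, 0.8]]
smoky = [[0.7, 0.8], [0.9, 1.0]]

loss_unmasked = self_supervised_loss(copy_model, smoky, ps, ratio=0.0)
loss_masked = self_supervised_loss(copy_model, smoky, ps, ratio=1.0)
print(loss_unmasked, loss_masked)
```

In this toy setting, the copying model reaches zero loss when the reference is unmasked and a clearly positive loss once the reference is masked. This is the kind of trivial solution that masking is meant to discourage: the model can no longer simply reproduce its reference input and must draw on the smoky frame instead.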


Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant No. 62371164 and No. U22B2035.

Author information

Corresponding author

Correspondence to Hao Chen.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 14720 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, R. et al. (2025). Self-Supervised Video Desmoking for Laparoscopic Surgery. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15130. Springer, Cham. https://doi.org/10.1007/978-3-031-73220-1_18

  • DOI: https://doi.org/10.1007/978-3-031-73220-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73219-5

  • Online ISBN: 978-3-031-73220-1

  • eBook Packages: Computer Science (R0)
