Abstract
Purpose
For many applications in the field of computer-assisted surgery, such as providing the position of a tumor, suggesting the tool the surgeon will most likely need next or estimating the remaining duration of surgery, methods for surgical workflow analysis are a prerequisite. Machine learning-based approaches often serve as the basis for analyzing the surgical workflow. In general, machine learning algorithms, such as convolutional neural networks (CNNs), require large amounts of labeled data. While data is often available in abundance, many tasks in surgical workflow analysis need annotations by domain experts, making it difficult to obtain a sufficient amount of annotations.
Methods
The aim of using active learning to train a machine learning model is to reduce the annotation effort. Active learning methods determine which unlabeled data points would provide the most information according to some metric, such as prediction uncertainty. Experts are then asked to annotate only these data points. The model is retrained with the new data and used to select further data for annotation. Recently, active learning has been applied to CNNs by means of deep Bayesian networks (DBNs). These networks make it possible to assign uncertainties to predictions. In this paper, we present a DBN-based active learning approach adapted for image-based surgical workflow analysis tasks. Furthermore, by using a recurrent architecture, we extend this network to video-based surgical workflow analysis. To decide which data points should be labeled next, we explore and compare different metrics for expressing uncertainty.
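The selection step described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the DBN's uncertainty is approximated by T stochastic (MC-dropout) forward passes over the unlabeled pool, and the function names are purely illustrative.

```python
import numpy as np

def acquisition_scores(mc_probs, metric="entropy"):
    """Score unlabeled samples from T stochastic forward passes.

    mc_probs: array of shape (T, N, C) -- softmax outputs of T
    dropout passes over N unlabeled samples with C classes.
    """
    mean_probs = mc_probs.mean(axis=0)  # (N, C) predictive distribution
    if metric == "entropy":
        # Predictive entropy: high when the averaged prediction is spread out.
        return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)
    if metric == "variation_ratio":
        # Fraction of passes that disagree with the modal (most voted) class.
        votes = mc_probs.argmax(axis=2)  # (T, N) hard class votes per pass
        modal_count = np.max(
            [(votes == c).sum(axis=0) for c in range(mc_probs.shape[2])],
            axis=0)
        return 1.0 - modal_count / mc_probs.shape[0]
    raise ValueError(f"unknown metric: {metric}")

def select_for_annotation(mc_probs, k, metric="entropy"):
    """Return indices of the k highest-uncertainty samples to label next."""
    scores = acquisition_scores(mc_probs, metric)
    return np.argsort(scores)[::-1][:k]
```

In an active learning round, the expert would annotate the returned indices, the model would be retrained on the enlarged labeled set, and the scoring would be repeated on the remaining pool.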
Results
We evaluate these approaches and compare different metrics on the Cholec80 dataset by performing instrument presence detection and surgical phase segmentation. Here we show that using a DBN-based active learning approach for selecting which data points to annotate next can significantly outperform a baseline that selects data points at random. In particular, metrics such as entropy and variation ratio perform consistently across the different tasks.
Conclusion
We show that DBN-based active learning strategies make it possible to selectively annotate data, thereby reducing the required amount of labeled training data in surgical workflow-related tasks.



Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
For this type of study, formal consent is not required.
Informed consent
This article contains patient data from publicly available datasets.
Cite this article
Bodenstedt, S., Rivoir, D., Jenke, A. et al. Active learning using deep Bayesian networks for surgical workflow analysis. Int J CARS 14, 1079–1087 (2019). https://doi.org/10.1007/s11548-019-01963-9