Stochastic Uncertainty Quantification Techniques Fail to Account for Inter-analyst Variability in White Matter Hyperintensity Segmentation

  • Conference paper
Medical Image Understanding and Analysis (MIUA 2024)

Abstract

White Matter Hyperintensities (WMH) are important neuroradiological markers of small vessel disease in brain MRI, and their automatic segmentation is essential in research and clinical settings for understanding their role in individuals’ health. However, accurate segmentation of WMH is difficult due to their heterogeneous shape, intensity, size and location. Furthermore, image analysts working on different studies have adopted different approaches for providing accurate WMH segmentations, resulting in high inter-analyst variability. We assess the effectiveness of stochastic uncertainty quantification (UQ) techniques for bridging the variability in approaches and criteria in WMH segmentation. We first train six such techniques on an in-house dataset with two segmentation approaches, and then evaluate performance across three studies unseen by the model during training: Mild Stroke Study 3, the Lothian Birth Cohort 1936 and the WMH Challenge dataset. To aid our analysis, we introduce two metrics: Uncertainty Inter Rater Overlap (UIRO) and Joint Uncertainty Error Overlap (JUEO). Our results show that changes in analyst policy between datasets dominate the uncertainty in the WMH segmentation task. Crucially, the distribution of segmentations predicted by stochastic models can fail to match the distribution of segmentations provided by analysts who follow approaches that differ from those used during training. We further suggest how to modify the task and cost function to overcome these difficulties.
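
The kind of comparison the abstract describes, between where a stochastic model is uncertain and where analysts disagree, can be illustrated with a minimal sketch. This page does not give the definitions of UIRO or JUEO, so the code below is not an implementation of either metric. It is a hypothetical stand-in that assumes flattened binary masks, per-voxel binary entropy over sampled segmentations as the uncertainty measure, and a Dice overlap between thresholded uncertainty and two-analyst disagreement. All function names and the 0.5 threshold are illustrative assumptions.

```python
import math


def voxelwise_entropy(samples):
    """Binary entropy (bits) per voxel, from the foreground frequency across sampled masks."""
    n = len(samples)
    entropies = []
    for voxel_values in zip(*samples):
        p = sum(voxel_values) / n
        if p in (0.0, 1.0):
            entropies.append(0.0)  # all samples agree, so no uncertainty
        else:
            entropies.append(-(p * math.log2(p) + (1 - p) * math.log2(1 - p)))
    return entropies


def disagreement_mask(mask_a, mask_b):
    """1 where two analysts label a voxel differently, else 0."""
    return [int(a != b) for a, b in zip(mask_a, mask_b)]


def dice(mask_a, mask_b):
    """Dice overlap of two binary masks; defined as 1.0 when both are empty."""
    intersection = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 1.0 if total == 0 else 2 * intersection / total


def uncertainty_disagreement_overlap(samples, analyst_a, analyst_b, threshold=0.5):
    """Hypothetical score (NOT the paper's UIRO): Dice between voxels whose
    entropy exceeds `threshold` and voxels where the two analysts disagree."""
    uncertain = [int(h > threshold) for h in voxelwise_entropy(samples)]
    return dice(uncertain, disagreement_mask(analyst_a, analyst_b))


# Toy data: four sampled segmentations of a four-voxel image, two analyst masks.
samples = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]
analyst_a = [1, 1, 0, 0]
analyst_b = [1, 0, 0, 1]
print(uncertainty_disagreement_overlap(samples, analyst_a, analyst_b))
```

In this toy case the model is uncertain about the second and third voxels while the analysts disagree on the second and fourth, so the overlap is only partial. This is the type of mismatch the abstract reports when analyst policy differs from the one seen in training.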

Notes

  1. https://github.com/BenjaminPhi5/UQforWMH-vs-InterAnalystVariability

Acknowledgements

BP was funded by the United Kingdom Research and Innovation Centre for Doctoral Training in Biomedical AI Programme scholarships (grant EP/S02431X/1). For the purpose of open access, the author has applied a creative commons attribution (CC BY) licence to any author accepted manuscript version arising. Funding from Row Fogo Charitable Trust (Ref No: AD.ROW4.35. BRO-D.FID3668413), and the UK Medical Research Council (UK Dementia Research Institute at the University of Edinburgh, award number UK DRI-4002;G0700704/84698) are also gratefully acknowledged. M.O.B. gratefully acknowledges funding from: Foundation Leducq Transatlantic Network of Excellence (17 CVD 03); EPSRC grant no. EP/X025705/1; British Heart Foundation and The Alan Turing Institute Cardiovascular Data Science Award (C-10180357); Diabetes UK (20/0006221); Fight for Sight (5137/5138); the SCONe projects funded by Chief Scientist Office, Edinburgh & Lothians Health Foundation, Sight Scotland, the Royal College of Surgeons of Edinburgh, the RS Macdonald Charitable Trust, and Fight For Sight; the Neurii initiative which is a partnership among Eisai Co., Ltd, Gates Ventures, LifeArc and HDR UK. Data collection and processing in the primary studies that provided data were funded by the Wellcome Trust (grant 088134/Z/09/A), the European Union Horizon 2020, PHC-03-15, project No. 666881 SVDs@Target, the Fondation Leducq Transatlantic Network of Excellence for the Study of Perivascular Spaces in Small Vessel Disease, ref no. 16 CVD 05, the Stroke Association, The Alzheimer’s Society UK, the UKRI, and the Scottish Chief Scientist Office through the NHS Lothian Research and Development Department.

Author information

Corresponding author

Correspondence to Ben Philps.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2901 KB)

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Philps, B. et al. (2024). Stochastic Uncertainty Quantification Techniques Fail to Account for Inter-analyst Variability in White Matter Hyperintensity Segmentation. In: Yap, M.H., Kendrick, C., Behera, A., Cootes, T., Zwiggelaar, R. (eds) Medical Image Understanding and Analysis. MIUA 2024. Lecture Notes in Computer Science, vol 14859. Springer, Cham. https://doi.org/10.1007/978-3-031-66955-2_3

  • DOI: https://doi.org/10.1007/978-3-031-66955-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-66954-5

  • Online ISBN: 978-3-031-66955-2

  • eBook Packages: Computer Science (R0)
