Abstract
Deep learning models have achieved high performance in medical applications; however, their adoption in clinical practice is hindered by their black-box nature. Using explainable AI (XAI) in high-stakes medical decisions could increase their usability in clinical settings. Self-explainable models, such as prototype-based models, can be especially beneficial, as they are interpretable by design. However, if the learned prototypes are of low quality, a prototype-based model is no better than a black box; high-quality prototypes are a prerequisite for a truly interpretable model. In this work, we propose the Prototype Evaluation Framework for Coherence (PEF-Coh) for quantitatively evaluating the quality of prototypes based on domain knowledge, and we demonstrate its use in the context of breast cancer prediction from mammography. Existing work on prototype-based models for breast cancer prediction from mammography has focused on improving classification performance relative to black-box models and has evaluated prototype quality only through anecdotal evidence. We are the first to go beyond anecdotal evidence and systematically evaluate the quality of mammography prototypes using PEF-Coh. Specifically, we apply three state-of-the-art prototype-based models, ProtoPNet, BRAIxProtoPNet++ and PIP-Net, to mammography images for breast cancer prediction and evaluate these models with respect to i) classification performance and ii) prototype quality, on three public datasets. Our results show that prototype-based models are competitive with black-box models in terms of classification performance and achieve higher scores in detecting regions of interest (ROIs). However, the quality of the prototypes is not yet sufficient and can be improved with respect to relevance, purity, and the variety of prototypes learned.
We call on the XAI community to systematically evaluate the quality of prototypes, in order to assess their true usability in high-stakes decisions and to improve such models further.
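As an illustration of the kind of quantitative prototype evaluation the abstract calls for, the sketch below computes a simple purity-style score: the fraction of a prototype's activated region that falls inside an annotated ROI. This is a minimal, hypothetical example assuming a prototype activation map and a binary ROI mask are available; the function name and threshold are our own, and the actual PEF-Coh metrics are defined in the paper itself.

```python
import numpy as np

def prototype_roi_overlap(activation_map, roi_mask, threshold=0.5):
    """Fraction of the prototype's activated region that lies inside the
    annotated ROI. A simple purity-style score for illustration only,
    not the paper's exact PEF-Coh definition."""
    # Binarize the activation map relative to its peak activation.
    activated = activation_map >= threshold * activation_map.max()
    if not activated.any():
        return 0.0
    # Count activated pixels that fall inside the ROI mask.
    return float(np.logical_and(activated, roi_mask).sum() / activated.sum())

# Toy example: a 4x4 prototype activation map and an ROI mask.
act_map = np.array([
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.6, 0.1],
    [0.0, 0.1, 0.1, 0.0],
])
roi = np.zeros((4, 4), dtype=bool)
roi[1:3, 1:3] = True  # annotated lesion region

print(prototype_roi_overlap(act_map, roi))  # 1.0: all activation inside the ROI
```

A score near 1.0 indicates the prototype activates almost exclusively on clinically annotated tissue, whereas a low score flags a prototype that responds to irrelevant image regions.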
Notes
While ROI is most commonly used in the medical domain, we use it as a generic term to refer to any annotated part of an image, including object-part annotations.
References
Barnett, A.J., et al.: A case-based interpretable deep learning model for classification of mass lesions in digital mammography. Nat. Mach. Intell. 3(12), 1061–1070 (2021)
Chen, C., Li, O., Tao, D., Barnett, A., Rudin, C., Su, J.K.: This looks like that: deep learning for interpretable image recognition. Adv. Neural Inf. Process. Syst. 32 (2019)
Cui, C., et al.: The Chinese mammography database (CMMD): an online mammography database with biopsy confirmed types for machine diagnosis of breast. (version 1) [data set] (2021). https://doi.org/10.7937/tcia.eqde-4b16. The Cancer Imaging Archive. Accessed 08 Sept 2023
Gautam, S., Höhne, M.M.C., Hansen, S., Jenssen, R., Kampffmeyer, M.: This looks more like that: enhancing self-explaining models by prototypical relevance propagation. Pattern Recogn. 136, 109172 (2023)
Kim, E., Kim, S., Seo, M., Yoon, S.: Xprotonet: diagnosis in chest radiography with global and local explanations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 15719–15728 (2021)
Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S.: A convnet for the 2020s. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11976–11986 (2022)
Nauta, M., Schlötterer, J., van Keulen, M., Seifert, C.: Pip-net: patch-based intuitive prototypes for interpretable image classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2744–2753 (2023)
Nauta, M., Seifert, C.: The co-12 recipe for evaluating interpretable part-prototype image classifiers. In: Longo, L. (ed.) xAI 2023. CCIS, vol. 1901, pp. 397–420. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-44064-9_21
Nauta, M., et al.: From anecdotal evidence to quantitative evaluation methods: a systematic review on evaluating explainable AI. ACM Comput. Surv. 55(13s), 1–42 (2023)
Nauta, M., Van Bree, R., Seifert, C.: Neural prototype trees for interpretable fine-grained image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14933–14943 (2021)
Nguyen, H.T., et al.: Vindr-mammo: a large-scale benchmark dataset for computer-aided diagnosis in full-field digital mammography. medRxiv (2022). https://doi.org/10.1101/2022.03.07.22272009
Oh, Y., Park, S., Ye, J.C.: Deep learning Covid-19 features on CXR using limited training data sets. IEEE Trans. Med. Imaging 39(8), 2688–2700 (2020)
Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019)
Rymarczyk, D., Struski, Ł., Górszczak, M., Lewandowska, K., Tabor, J., Zieliński, B.: Interpretable image classification with differentiable prototypes assignment. In: Avidan, S., Brostow, G., Cisse, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13672, pp. 351–368. Springer, Heidelberg (2022). https://doi.org/10.1007/978-3-031-19775-8_21
Rymarczyk, D., Struski, Ł., Tabor, J., Zieliński, B.: Protopshare: prototypical parts sharing for similarity discovery in interpretable image classification. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 1420–1430 (2021)
Sacha, M., Jura, B., Rymarczyk, D., Struski, Ł., Tabor, J., Zieliński, B.: Interpretability benchmark for evaluating spatial misalignment of prototypical parts explanations. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 19, pp. 21563–21573 (2024). https://doi.org/10.1609/aaai.v38i19.30154
Sawyer-Lee, R., Gimenez, F., Hoogi, A., Rubin, D.: Curated breast imaging subset of digital database for screening mammography (cbis-ddsm) (version 1) [data set] (2016). https://doi.org/10.7937/K9/TCIA.2016.7O02S9CY. Accessed 28 Apr 2022
Shen, L., Margolies, L.R., Rothstein, J.H., Fluder, E., McBride, R., Sieh, W.: Deep learning to improve breast cancer detection on screening mammography. Sci. Rep. 9(1), 1–12 (2019)
Shen, Y., et al.: An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization. Med. Image Anal. 68, 101908 (2021)
Sickles, E.A., et al.: ACR BI-RADS® mammography. In: ACR BI-RADS® Atlas, Breast Imaging Reporting and Data System, 5th edn. American College of Radiology, Reston (2013)
Tan, M., Le, Q.: Efficientnet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019)
Wang, C., et al.: An interpretable and accurate deep-learning diagnosis framework modelled with fully and semi-supervised reciprocal learning. IEEE Trans. Med. Imaging 43, 392–404 (2023)
Wang, C., et al.: Knowledge distillation to ensemble global and interpretable prototype-based mammogram classification models. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds.) MICCAI 2022. LNCS, vol. 13433, pp. 14–24. Springer, Heidelberg (2022). https://doi.org/10.1007/978-3-031-16437-8_2
Wang, J., Liu, H., Wang, X., Jing, L.: Interpretable image recognition by constructing transparent embedding space. In: Proceedings of the IEEE/CVF international Conference on Computer Vision, pp. 895–904 (2021)
Wu, J., et al.: Expert identification of visual primitives used by cnns during mammogram classification. In: Medical Imaging 2018: Computer-Aided Diagnosis, vol. 10575, pp. 633–641. SPIE (2018)
Xu-Darme, R., Quénot, G., Chihani, Z., Rousset, M.C.: Sanity checks for patch visualisation in prototype-based image classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3690–3695 (2023)
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Pathak, S., Schlötterer, J., Veltman, J., Geerdink, J., van Keulen, M., Seifert, C. (2024). Prototype-Based Interpretable Breast Cancer Prediction Models: Analysis and Challenges. In: Longo, L., Lapuschkin, S., Seifert, C. (eds) Explainable Artificial Intelligence. xAI 2024. Communications in Computer and Information Science, vol 2153. Springer, Cham. https://doi.org/10.1007/978-3-031-63787-2_2
Print ISBN: 978-3-031-63786-5
Online ISBN: 978-3-031-63787-2