Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead | Nature Machine Intelligence

  • Perspective

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead

A preprint version of the article is available at arXiv.

Abstract

Black box machine learning models are currently being used for high-stakes decision making throughout society, causing problems in healthcare, criminal justice and other domains. Some people hope that creating methods for explaining these black box models will alleviate some of the problems, but trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practice and can potentially cause great harm to society. The way forward is to design models that are inherently interpretable. This Perspective clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare and computer vision.

Fig. 1: A fictional depiction of the accuracy–interpretability trade-off.
Fig. 2: Saliency does not explain anything except where the network is looking.
Fig. 3: Image from the authors of ref. 48, indicating that parts of the test image on the left are similar to prototypical parts of training examples.

References

  1. Wexler, R. When a computer program keeps you in jail: how computers are harming criminal justice. New York Times (13 June 2017); https://www.nytimes.com/2017/06/13/opinion/how-computers-are-harming-criminal-justice.html

  2. McGough, M. How bad is Sacramento’s air, exactly? Google results appear at odds with reality, some say. Sacramento Bee (7 August 2018); https://www.sacbee.com/news/state/california/fires/article216227775.html

  3. Varshney, K. R. & Alemzadeh, H. On the safety of machine learning: cyber-physical systems, decision sciences and data products. Big Data 10, 5 (2016).

  4. Freitas, A. A. Comprehensible classification models: a position paper. ACM SIGKDD Explorations Newsletter 15, 1–10 (2014).

  5. Kodratoff, Y. The comprehensibility manifesto. KDD Nugget Newsletter https://www.kdnuggets.com/news/94/n9.txt (1994).

  6. Huysmans, J., Dejaeger, K., Mues, C., Vanthienen, J. & Baesens, B. An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models. Decision Support Syst. 51, 141–154 (2011).

  7. Rüping, S. Learning Interpretable Models. PhD thesis, Univ. Dortmund (2006).

  8. Gupta, M. et al. Monotonic calibrated interpolated look-up tables. J. Mach. Learn. Res. 17, 1–47 (2016).

  9. Lou, Y., Caruana, R., Gehrke, J. & Hooker, G. Accurate intelligible models with pairwise interactions. In Proceedings of 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 623–631 (ACM, 2013).

  10. Miller, G. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol. Rev. 63, 81–97 (1956).

  11. Cowan, N. The magical mystery four: How is working memory capacity limited, and why? Curr. Dir. Psychol. Sci. 19, 51–57 (2010).

  12. Wang, J., Oh, J., Wang, H. & Wiens, J. Learning credible models. In Proceedings of 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2417–2426 (ACM, 2018).

  13. Rudin, C. Please stop explaining black box models for high stakes decisions. In Proceedings of NeurIPS 2018 Workshop on Critiquing and Correcting Trends in Machine Learning (NIPS, 2018).

  14. Holte, R. C. Very simple classification rules perform well on most commonly used datasets. Mach. Learn. 11, 63–91 (1993).

  15. Fayyad, U., Piatetsky-Shapiro, G. & Smyth, P. From data mining to knowledge discovery in databases. AI Magazine 17, 37–54 (1996).

  16. Chapman, P. et al. CRISP-DM 1.0—Step-by-Step Data Mining Guide (SPSS, 2000).

  17. Agrawal, D. et al. Challenges and Opportunities with Big Data: A White Paper Prepared for the Computing Community Consortium Committee of the Computing Research Association (CCC, 2012); http://cra.org/ccc/resources/ccc-led-whitepapers/

  18. Defense Advanced Research Projects Agency. Broad Agency Announcement, Explainable Artificial Intelligence (XAI), DARPA-BAA-16-53 (DARPA, 2016); https://www.darpa.mil/attachments/DARPA-BAA-16-53.pdf

  19. Hand, D. Classifier technology and the illusion of progress. Statist. Sci. 21, 1–14 (2006).

  20. Rudin, C. et al. A process for predicting manhole events in Manhattan. Mach. Learn. 80, 1–31 (2010).

  21. Rudin, C. & Ustun, B. Optimized scoring systems: toward trust in machine learning for healthcare and criminal justice. Interfaces 48, 399–486 (2018). Special Issue: 2017 Daniel H. Wagner Prize for Excellence in Operations Research Practice September–October 2018.

  22. Chen, C. et al. An interpretable model with globally consistent explanations for credit risk. In Proceedings of NeurIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy (NIPS, 2018).

  23. Mittelstadt, B., Russell, C. & Wachter, S. Explaining explanations in AI. In Proceedings of Fairness, Accountability, and Transparency (FAT*) (ACM, 2019).

  24. Flores, A. W., Lowenkamp, C. T. & Bechtel, K. False positives, false negatives, and false analyses: a rejoinder to ‘machine bias: there’s software used across the country to predict future criminals’. Fed. Probat. J. 80, 38–46 (2016).

  25. Angwin, J., Larson, J., Mattu, S. & Kirchner, L. Machine bias. ProPublica (2016); https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  26. Larson, J., Mattu, S., Kirchner, L. & Angwin, J. How we analyzed the COMPAS recidivism algorithm. ProPublica (2016); https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

  27. Rudin, C., Wang, C. & Coker, B. The age of secrecy and unfairness in recidivism prediction. Preprint at https://arxiv.org/abs/1811.00731 (2018).

  28. Brennan, T., Dieterich, W. & Ehret, B. Evaluating the predictive validity of the COMPAS risk and needs assessment system. Crim. Justice Behav. 36, 21–40 (2009).

  29. Zeng, J., Ustun, B. & Rudin, C. Interpretable classification models for recidivism prediction. J. R. Stat. Soc. A Stat. Soc. 180, 689–722 (2017).

  30. Tollenaar, N. & van der Heijden, P. G. M. Which method predicts recidivism best? A comparison of statistical, machine learning and data mining predictive models. J. R. Stat. Soc. Ser. A Stat. Soc. 176, 565–584 (2013).

  31. Mannshardt, E. & Naess, L. Air quality in the USA. Significance 15, 24–27 (October, 2018).

  32. Zech, J. R. et al. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study. PLoS Med. 15, e1002683 (2018).

  33. Chang, A., Rudin, C., Cavaretta, M., Thomas, R. & Chou, G. How to reverse-engineer quality rankings. Mach. Learn. 88, 369–398 (2012).

  34. Goodman, B. & Flaxman, S. EU regulations on algorithmic decision-making and a ‘right to explanation’. AI Magazine 38, 3 (2017).

  35. Wachter, S., Mittelstadt, B. & Russell, C. Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harvard Journal of Law & Technology 1 (2018).

  36. Quinlan, J. R. C4.5: Programs for Machine Learning Vol. 1 (Morgan Kaufmann, 1993).

  37. Breiman, L., Friedman, J., Stone, C. J. & Olshen, R. A. Classification and Regression Trees (CRC Press, 1984).

  38. Auer, P., Holte, R. C. & Maass, W. Theory and applications of agnostic PAC-learning with small decision trees. In Proceedings of 12th International Conference on Machine Learning 21–29 (Morgan Kaufmann, 1995).

  39. Angelino, E., Larus-Stone, N., Alabi, D., Seltzer, M. & Rudin, C. Certifiably optimal rule lists for categorical data. J. Mach. Learn. Res. 19, 1–79 (2018).

  40. Wang, F. & Rudin, C. Falling rule lists. In Proceedings of Machine Learning Research Vol. 38: Artificial Intelligence and Statistics 1013–1022 (PMLR, 2015).

  41. Chen, C. & Rudin, C. An optimization approach to learning falling rule lists. In Proceedings of Machine Learning Research Vol. 84: Artificial Intelligence and Statistics 604–612 (PMLR, 2018).

  42. Hu, X. (S.), Rudin, C. & Seltzer, M. Optimal sparse decision trees. Preprint at https://arxiv.org/abs/1904.12847 (2019).

  43. Burgess, E. W. Factors Determining Success or Failure on Parole (Illinois Committee on Indeterminate-Sentence Law and Parole, 1928).

  44. Carrizosa, E., Martín-Barragán, B. & Morales, D. R. Binarized support vector machines. INFORMS J. Comput. 22, 154–167 (2010).

  45. Sokolovska, N., Chevaleyre, Y. & Zucker, J. D. A provable algorithm for learning interpretable scoring systems. In Proceedings of Machine Learning Research Vol. 84: Artificial Intelligence and Statistics 566–574 (PMLR, 2018).

  46. Ustun, B. & Rudin, C. Optimized risk scores. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, 2017).

  47. Ustun, B. et al. The World Health Organization adult attention-deficit/hyperactivity disorder self-report screening scale for DSM-5. JAMA Psychiatr. 74, 520–526 (2017).

  48. Chen, C. et al. This looks like that: deep learning for interpretable image recognition. Preprint at https://arxiv.org/abs/1806.10574 (2018).

  49. Li, O., Liu, H., Chen, C. & Rudin, C. Deep learning for case-based reasoning through prototypes: a neural network that explains its predictions. In Proceedings of AAAI Conference on Artificial Intelligence 3530–3537 (AAAI, 2018).

  50. Gallagher, N. et al. Cross-spectral factor analysis. In Proceedings of Advances in Neural Information Processing Systems 30 (NeurIPS) 6842–6852 (Curran Associates, 2017).

  51. Wang, F., Rudin, C., McCormick, T. H. & Gore, J. L. Modeling recovery curves with application to prostatectomy. Biostatistics https://doi.org/10.1093/biostatistics/kxy002 (2018).

  52. Lou, Y., Caruana, R. & Gehrke, J. Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, 2012).

Acknowledgements

The author thanks F. Wang, T. Wang, C. Chen, O. Li, A. Barnett, T. Dietterich, M. Seltzer, E. Angelino, N. Larus-Stone, E. Mannshardt, M. Gupta and several others who helped my thought processes in various ways, and particularly B. Ustun, R. Parr, R. Holte and my father, S. Rudin, who went to considerable efforts to provide thoughtful comments and discussion. The author acknowledges funding from the Laura and John Arnold Foundation, NIH, NSF, DARPA, the Lord Foundation of North Carolina and MIT-Lincoln Laboratory.

Author information

Corresponding author

Correspondence to Cynthia Rudin.

Ethics declarations

Competing interests

The author declares no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell 1, 206–215 (2019). https://doi.org/10.1038/s42256-019-0048-x

  • DOI: https://doi.org/10.1038/s42256-019-0048-x
