Interpretable Machine Learning
A Guide for Making Black Box Models Explainable
About the Book
Summary
Machine learning is part of our products, processes, and research. But computers usually don’t explain their predictions, which can cause many problems, ranging from trust issues to undetected bugs. This book is about making machine learning models and their decisions interpretable.
After exploring the concepts of interpretability, you will learn about simple, interpretable models such as decision trees and linear regression. The focus of the book is on model-agnostic methods for interpreting black box models. Some model-agnostic methods like LIME and Shapley values can be used to explain individual predictions, while others like permutation feature importance and accumulated local effects provide insights into the more general relationships between features and predictions. In addition, the book presents methods specific to deep neural networks.
All interpretation methods are explained in depth and discussed critically. How do they work? What are their strengths and weaknesses? How do you interpret them? This book will enable you to select and correctly apply the interpretation method that is most suitable for your machine learning application. Reading the book is recommended for machine learning practitioners, data scientists, statisticians, and anyone else interested in making machine learning models interpretable.
Why I wrote the book
This book began as a side project while I was working as a statistician in clinical research. On my free day, I explored topics that interested me, and interpretable machine learning eventually became my focus. Expecting plenty of resources on interpreting machine learning models, I was surprised to find only scattered research papers and blog posts, with no comprehensive guide. This motivated me to create the resource I wished existed: a book to deepen my own understanding and share insights with others. Today this book is a go-to resource for interpreting machine learning models. Researchers have cited it thousands of times, students have messaged me that it was essential to their theses, instructors use it in their classes, and data scientists in industry rely on it for their daily work and recommend it to their colleagues. The book has also been the foundation of my own career: first, it inspired me to do a PhD on interpretable machine learning, and later it encouraged me to become a self-employed writer, educator, and consultant.
Who this book is for
This book is for practitioners looking for an overview of techniques to make machine learning models more interpretable. It’s also valuable for students, teachers, researchers, and anyone else interested in the topic. A basic understanding of machine learning and of university-level math will help you follow the theory, but the intuitive explanations at the start of each chapter should be accessible without any math background.
What’s new in the 3rd edition?
The 3rd edition is both a small and a big update. It’s a small update because I added only two new method chapters, namely LOFO and Ceteris Paribus, and two introductory chapters, Methods Overview and Goals of Interpretability. However, I also made bigger changes that are more subtle but which I believe improve the book. I reorganized the introduction part so that it’s leaner, yet more insightful. Further, I gave the data examples more depth (e.g., studying correlations and doing more nuanced analyses), and replaced the cancer dataset with the more accessible Palmer Penguins dataset. To make the book more practical, I introduced tip and warning boxes that help you interpret machine learning models the right way. Another big change in the 3rd edition was cleaning up the book’s repository and rendering the book with Quarto instead of bookdown. For you as the reader, this is visible only in the appearance of the web version, but it also makes the book much easier for me to maintain, which will benefit the Interpretable Machine Learning book in the long run. I also fixed a lot of small things, which you can see in the “Changelog” section of the book repository’s README.