Understanding LIME in 5 steps
Leveraging the power of explanations to build trustworthy ML models
7 minute read · March 1, 2022, 1:09 PM
Being able to produce explainable predictions lies at the center of trustworthy machine learning (ML). LIME, which is short for Local Interpretable Model-agnostic Explanations, is a technique that was created by Marco Tulio Ribeiro et al. at the University of Washington and can be used to explain the predictions made by any black-box classifier.
In this post, we start by discussing the use of explanations in ML. Then, we go through the five steps followed by LIME to explain a particular model prediction. Finally, we illustrate a simple use of LIME in fraud detection and give pointers to references for further exploration, if this is a topic you are interested in.
If you want to start understanding your models right away, feel free to skip this post and go ahead to check out Unbox!
Why are explanations useful?
When choosing between different ML models that solve a specific task, there is often a trade-off between interpretability and predictive performance.
The consequences of such a trade-off reflect directly in the types of ML approaches chosen by the industry. In highly regulated fields, for example, where there must always be a way to justify the model’s predictions, interpretable models, such as shallow decision trees or linear models with sparse features, are often chosen over more powerful alternatives.
LIME is a tool that is applicable to any black-box classifier. Its usefulness is based on the fact that instead of having to work only with intrinsically interpretable models and their limited capabilities, one can use any classifier and then generate explanations for individual predictions afterward, thus benefiting from the best of both worlds.
LIME in five steps
To construct the explanations, LIME feeds the black-box model with small variations of the original data sample and probes how the model’s predictions change. From these variations, it learns an interpretable model which approximates the black-box classifier in the vicinity of the original data sample. Locally, the interpretable model provides a faithful approximation of the black-box model, even though it is likely not a globally good approximator.
Don’t worry if you didn’t understand everything by just reading the previous description. We now go through the five steps followed by LIME to generate an explanation.
Step 1: Choosing the sample to be explained
Consider a binary classification task with two features. Furthermore, let’s say we have the black-box classifier illustrated below, where class 0 is in red and class 1 is in blue.
The first step using LIME is choosing the data sample to be explained. For example, this could be a point from your validation set that your model mispredicted or, really, any sample that you are particularly interested in investigating further.
In this case, we are interested in understanding why the black-box classifier predicted that a specific sample, x', belongs to class 0.
Step 2: Perturbing the sample
Now, the actual LIME algorithm kicks in. It starts by randomly perturbing the data sample x', which results in multiple new samples, depicted as the black dots in the image below.
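To make the perturbation step concrete, here is a minimal sketch in Python. It assumes that perturbing means adding Gaussian noise to each feature; the function name `perturb_sample`, the noise scale, and the coordinates of x' are illustrative assumptions, not the exact sampling scheme used by the LIME library.

```python
import numpy as np

def perturb_sample(x, num_samples=500, scale=1.0, seed=0):
    """Draw perturbed copies of x by adding Gaussian noise to each feature.

    Assumption: isotropic Gaussian noise with a fixed scale. This is a
    simplification chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=scale, size=(num_samples, x.shape[0]))
    return x + noise

# Hypothetical sample to be explained (two features, as in the figures).
x_prime = np.array([0.5, 1.0])
Z = perturb_sample(x_prime, num_samples=500)
print(Z.shape)  # (500, 2)
```

Each row of `Z` plays the role of one black dot in the figure.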
Step 3: Labelling the perturbed samples with the black-box model
Then, these perturbed samples are fed to the black-box model, to see how it would predict them. These labeled samples will serve as a dataset to train an interpretable model.
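The labelling step can be sketched as follows. The function `black_box_predict_proba` is a made-up stand-in for a real model (any function that maps samples to class probabilities would do), and the perturbed samples are regenerated here with Gaussian noise so that the snippet runs on its own.

```python
import numpy as np

def black_box_predict_proba(X):
    """Hypothetical black-box: probability of class 1 from a nonlinear rule."""
    score = X[:, 0] ** 2 + X[:, 1] - 2.0
    return 1.0 / (1.0 + np.exp(-score))

rng = np.random.default_rng(0)
x_prime = np.array([0.5, 1.0])             # hypothetical sample, predicted as class 0
Z = x_prime + rng.normal(size=(500, 2))    # perturbed samples (Gaussian noise, an assumption)

p1 = black_box_predict_proba(Z)            # black-box probability of class 1 per sample
labels = (p1 >= 0.5).astype(int)           # hard class labels, if they are needed

# (Z, p1) is now a small dataset describing how the black box behaves around x'.
print(Z.shape, p1.shape, labels.shape)
```

Note that LIME only ever queries the model for predictions, which is why it does not need access to the model's internals.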
Step 4: Weighting
There is an additional catch. Since one of the premises that LIME builds upon is that the interpretable model can be a good local approximation, we are mostly interested in the perturbed samples that are close to x'. Therefore, to learn the interpretable model, LIME weights the samples according to their proximity to x', so samples close to x' are given a large weight and samples far away from x' are given a small weight.
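As a rough illustration of the weighting, the sketch below uses an exponential kernel on the Euclidean distance to x', which is the kernel form given in the original paper. The function name `proximity_weights`, the kernel width of 0.75, and the three example points are assumptions chosen for this example.

```python
import numpy as np

def proximity_weights(Z, x, kernel_width=0.75):
    """Exponential kernel on Euclidean distance to x.

    Samples identical to x get weight 1; the weight decays towards 0
    as the distance grows. kernel_width is an illustrative choice.
    """
    distances = np.linalg.norm(Z - x, axis=1)
    return np.exp(-(distances ** 2) / kernel_width ** 2)

x_prime = np.array([0.5, 1.0])   # hypothetical sample being explained
Z = np.array([[0.5, 1.0],        # identical to x'
              [0.6, 1.1],        # close to x'
              [2.5, 3.0]])       # far from x'
print(proximity_weights(Z, x_prime))
```

A smaller `kernel_width` makes the explanation more local, at the cost of relying on fewer effective samples.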
Step 5: Learning the interpretable model
Now, LIME is ready to learn the interpretable model!
The default interpretable model used by LIME is a linear model. It is learned from the weighted samples and explains a particular prediction, in this case, the one for x'. Even though the black-box model might be nonlinear, we expect that close to x', the decision boundary can be reasonably approximated by a line.
A linear model with sparse features, such as the one learned by LIME, is intrinsically interpretable. The model’s weights can be inspected and seen as proxies for feature importance. Therefore, what LIME returns as the feature scores is actually the linear model’s weights.
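The final step can be sketched as a weighted least-squares fit. Everything here is a simplified illustration under stated assumptions: the black box is a made-up function, the targets are its predicted probabilities, the perturbation is Gaussian noise, and no feature selection is performed (with only two features, sparsity is not a concern).

```python
import numpy as np

def fit_local_linear_model(Z, targets, weights):
    """Weighted least squares; returns (intercept, coefficients)."""
    A = np.hstack([np.ones((Z.shape[0], 1)), Z])   # prepend an intercept column
    sw = np.sqrt(weights)                          # sqrt-weights turn WLS into ordinary LS
    beta, *_ = np.linalg.lstsq(A * sw[:, None], targets * sw, rcond=None)
    return beta[0], beta[1:]

def black_box_predict_proba(X):
    """Hypothetical black-box: probability of class 1 from a nonlinear rule."""
    score = X[:, 0] ** 2 + X[:, 1] - 2.0
    return 1.0 / (1.0 + np.exp(-score))

rng = np.random.default_rng(0)
x_prime = np.array([0.5, 1.0])                     # hypothetical sample being explained
Z = x_prime + rng.normal(size=(1000, 2))           # perturbed samples
p1 = black_box_predict_proba(Z)                    # labels from the black box
weights = np.exp(-np.sum((Z - x_prime) ** 2, axis=1) / 0.75 ** 2)  # proximity weights

intercept, coefs = fit_local_linear_model(Z, p1, weights)
for name, c in zip(["feature_0", "feature_1"], coefs):
    print(f"{name}: {c:+.3f}")                     # local feature scores
```

The sign of each coefficient indicates whether increasing that feature pushes the local prediction towards class 1 or class 0, and its magnitude indicates how strongly.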
For the sake of simplicity, in this post, we focused on giving a high-level overview of how LIME works for tabular data with only two features, avoiding the mathematical details of the problem formulation. The generalization from two features to many is direct; we just wouldn’t be able to draw nice figures.
Moreover, LIME works not only for tabular data but also for text and image data types. There are a few modifications required so that the data representation over which the explanations are provided remains easily interpretable to humans. All of these modifications are discussed in more detail in the Interpretable Data Representations section from the original paper and the interested reader is encouraged to check it out.
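For text, one interpretable representation described in the original paper is a binary vector that marks the presence or absence of each word. The sketch below shows how perturbed samples could be generated under that representation by randomly dropping words; the function name `perturb_text` and the example sentence are illustrative assumptions, not the LIME library's API.

```python
import numpy as np

def perturb_text(text, num_samples=5, seed=0):
    """Return binary word masks and the corresponding perturbed texts."""
    words = text.split()
    rng = np.random.default_rng(seed)
    # 1 keeps a word, 0 removes it.
    masks = rng.integers(0, 2, size=(num_samples, len(words)))
    masks[0, :] = 1  # keep the original text as the first sample
    texts = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]
    return masks, texts

masks, texts = perturb_text("this transaction looks highly suspicious")
for m, t in zip(masks, texts):
    print(m, repr(t))
```

The perturbed texts are fed to the black-box model, while the linear model is fit on the binary masks, so each learned weight refers to a word that a human can read.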
Fraud detection with LIME
LIME is a groundbreaking library that changed the landscape of ML interpretability. Therefore, we decided to offer it as one of the built-in explainability options at Unbox. We had to perform optimizations, ranging from mathematical to algorithmic modifications, so that users can have a performant experience while enjoying the interpretability quality provided by LIME.
Let’s go through a simple example that illustrates the usefulness of explainability in a real-world situation.
Imagine you have a model that detects credit card fraud. Your model might have flagged a few transactions as fraudulent when, in fact, they were normal, and this is annoying some of your users, who are frustrated that they can’t shop online without friction. Hence, you are interested in understanding a few of your model’s mispredictions.
You can upload your model and validation set to Unbox and browse through your dataset, looking for samples that exhibit this behavior.
We found one! Let’s click on it to start understanding why the model predicted this transaction as fraudulent, when in fact, it wasn’t.
What we see above are the explainability scores computed with LIME. As we can observe, for this particular example, the feature that contributed the most to the misprediction (shown in strong red) is the amount of the transaction (named amt).
This makes sense! Fraudulent transactions are usually associated with higher amounts, like the one observed in this particular sample.
Just to make sure our hypothesis makes sense, we can quickly filter our dataset according to the transaction amount (amt) and look only at the data samples with high amounts, for example, between 600 and 10,000.
Indeed! There are only 4 samples with amounts within the selected range and the model mistakenly thought all of them were fraudulent transactions!
From this insight, there are different paths that you might choose to follow. One possibility is accepting that your model is doing a reasonable thing; after all, in a fraud detection system, it is much cheaper to deal with a few false positives than with false negatives. Another option is concluding that this type of transaction is underrepresented in your dataset, so you might want to collect or generate more data that looks like it to boost your model’s performance.
The possibilities are endless, but notice that actionable insights arise naturally once we leverage the power of explanations. Moreover, explanations build trust that your model is behaving as expected and not simply over-indexing to certain features, for instance.
Interpretability and explainability are active areas of research. Even though the current techniques are already proving to be useful, there are still several open questions to be resolved. At Unbox, we also offer SHAP (which stands for SHapley Additive exPlanations) as an alternative to LIME. In this blog post, we explain its foundations.
In our white paper, we also discuss the use of global and local explanations as a way to build trust in ML models.
The book “Interpretable machine learning: a guide for making black-box models explainable” by Christoph Molnar is a great resource online that covers many practical aspects surrounding the topic. If you are feeling more philosophical and want to start thinking from first principles about interpretability, Zachary Lipton’s “The mythos of model interpretability” is a fantastic place to start.