Atharva Joshi Data Scientist | AWS Certified Machine Learning Specialist | Finalist at the 9th CSI National Student Project Awards 2020 | Entrepreneur

A Comprehensive Guide to Machine Learning Interpretability


Machine learning has advanced from classical ML and image classification to currently trending domains such as text analytics, Natural Language Understanding (NLU), and Natural Language Processing (NLP).

Despite all this research and development, machine learning is still widely seen as a black box: little is known about why a trained model behaves the way it does. How the machine predicts can be answered by explaining the mathematical formulae applied behind the scenes; why it predicts in a particular way, however, is still the least understood part.

As autonomous machines and black-box algorithms begin making decisions previously entrusted to humans, it becomes necessary for these mechanisms to explain themselves. Despite their success in a broad range of tasks including advertising, movie and book recommendations, and mortgage qualification, there is general mistrust about their results.

Machine learning interpretability usually takes place after the model is trained and fit. Hence, most interpretability functions take a fitted model as their primary, mandatory argument.

This article is the outcome of about two weeks of googling and research for my data science project at work. I hope it is useful and gives a kickstart to my fellow data scientists and machine learning enthusiasts (and may even save them a lot of time 😉).

Important libraries for ML Interpretability

There are numerous libraries that prove to be immensely useful for beginning to understand interpretability.


ELI5

ELI5 stands for Explain Like I am 5. Sounds interesting, right? This library gives you an insight into the importance and weight your model gives to the different features in your dataset. It has different functions and methods for different kinds of machine learning models. For example, if the model being trained is a RandomForestRegressor, your code for interpretability might look like the following snippet.

import eli5
from random import randint

# show the prediction for one test row, together with its feature values
eli5.show_prediction(model, X_test.iloc[10], show_feature_values=True)

# explain a tree regressor's prediction for a randomly chosen training row
eli5.sklearn.explain_prediction.explain_prediction_tree_regressor(
    model, doc=X_train.values[randint(0, 100)], feature_names=X_train.columns.tolist()
)

Here, ‘model’ is any trained and fitted model, and X_test.iloc[10] is any row of the test set. The idea is that you can inspect the prediction for a particular test-set entry and see which features mattered most to the trained model for that prediction.
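To build some intuition for what such a per-prediction explanation contains, here is a minimal sketch in plain Python. As far as I understand it, ELI5 explains tree models by following the decision path and crediting each change in node value to the feature that was split on. The tree, its numbers, and the function name explain_tree_prediction below are all made up for illustration; this is not ELI5's code.

```python
# Hypothetical hand-built regression tree. Each node's "value" stands for the
# mean target of the training rows reaching that node (numbers are made up).
TREE = {
    "value": 20.0, "feature": "rooms", "threshold": 3,
    "left": {"value": 12.0},
    "right": {
        "value": 28.0, "feature": "age", "threshold": 10,
        "left": {"value": 32.0},
        "right": {"value": 24.0},
    },
}


def explain_tree_prediction(tree, row):
    """Split one prediction into a bias term plus per-feature contributions."""
    contributions = {"<BIAS>": tree["value"]}  # root value = starting point
    node = tree
    while "feature" in node:  # leaves have no "feature" key
        feature = node["feature"]
        child = node["left"] if row[feature] <= node["threshold"] else node["right"]
        # Credit the change in node value to the feature that was split on.
        delta = child["value"] - node["value"]
        contributions[feature] = contributions.get(feature, 0.0) + delta
        node = child
    return node["value"], contributions


prediction, contributions = explain_tree_prediction(TREE, {"rooms": 4, "age": 5})
print(prediction)
print(contributions)
```

The bias and the feature contributions always add up to the prediction itself, which is what makes this kind of breakdown easy to read for a single row.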

Refer to the ELI5 documentation for detailed code snippets.

An example of the output of explain_prediction() method for a dataset.


PDPBox

With the tools discussed so far, we can tell from the importance scores which features significantly influence the outcome, but, frustratingly, not in which direction they influence it. In most real cases the effect is also non-monotonic. We need more powerful tools to understand the complex relationship between the predictors and the model's prediction, and PDPBox provides exactly that. It is an extremely useful tool for measuring how the model's prediction changes as the value of a particular feature varies.

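from pdpbox import pdp

# compute the partial dependence of the prediction on one feature, then plot it
pdp_goals = pdp.pdp_isolate(model=self.model, dataset=self.X_train, model_features=self.base_features, feature=b_feature)
pdp.pdp_plot(pdp_goals, b_feature)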

Here, b_feature is any feature from the list of base (independent, non-target) features.
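The idea behind a partial dependence curve can be sketched by hand: force one feature to each value on a grid for every row, and average the predictions. The snippet below is only an illustration under my own assumptions. toy_model, the four data rows, and the helper partial_dependence are hypothetical and are not part of PDPBox.

```python
from statistics import mean


def toy_model(row):
    """Hypothetical stand-in for a fitted model's predict function."""
    x1, x2 = row
    return (x1 - 2) ** 2 + 0.5 * x2  # deliberately non-monotonic in x1


def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the chosen feature in every row; keep the others as they are.
        modified = [row[:feature_index] + (value,) + row[feature_index + 1:] for row in rows]
        curve.append(mean(predict(row) for row in modified))
    return curve


rows = [(0, 1.0), (1, 2.0), (3, 3.0), (4, 2.0)]  # made-up dataset of (x1, x2)
grid = [0, 1, 2, 3, 4]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=grid)
for value, average in zip(grid, curve):
    print(value, average)
```

The averaged prediction falls until x1 reaches 2 and rises again afterwards. A single importance score cannot show that kind of direction change, but the curve does.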

Refer to the PDPBox documentation for details.


SHAP

SHAP plots are built on the Shapley values of the different features. Shapley values are a widely used approach from cooperative game theory that comes with desirable properties: each feature's value is its average marginal contribution to the prediction. Once they are computed, we can draw many kinds of plots, including but not limited to beeswarm, waterfall, and decision plots.

Each of these plots has its own significance in the model interpretation process. Refer to the SHAP documentation for details.

Fig: Different SHAP plots from the dataset (clockwise from top left: beeswarm, heatmap, bar, and scatter plots).
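To make the Shapley values behind these plots more concrete, here is a small sketch that computes them exactly by enumerating every coalition of features. It is only feasible for a handful of features and is not how the SHAP library computes them in practice. The model, the input x, and the all-zero baseline are my own assumptions for illustration; a feature that is "absent" from a coalition is simply set to its baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, players):
    """Exact Shapley values by enumerating every coalition (tiny inputs only)."""
    n = len(players)
    result = {}
    for player in players:
        others = [p for p in players if p != player]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                members = frozenset(coalition)
                # Marginal contribution of the player to this coalition.
                total += weight * (value_fn(members | {player}) - value_fn(members))
        result[player] = total
    return result


# Hypothetical model f = 2*x1 + 3*x2 + x1*x3, explained at x against a baseline.
x = {"x1": 1.0, "x2": 2.0, "x3": 4.0}
baseline = {"x1": 0.0, "x2": 0.0, "x3": 0.0}


def model(v):
    return 2 * v["x1"] + 3 * v["x2"] + v["x1"] * v["x3"]


def value_fn(coalition):
    """Model output when only the features in the coalition take their real values."""
    mixed = {name: (x[name] if name in coalition else baseline[name]) for name in x}
    return model(mixed)


phi = shapley_values(value_fn, list(x))
print(phi)
```

The three values add up to the gap between the model output at x and at the baseline, and the interaction term x1*x3 is shared equally between x1 and x3. This additivity is one of the desirable properties mentioned above.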


Yellowbrick

Yellowbrick is another interpretability tool in the arsenal. It extends the Scikit-Learn API to make model selection and hyperparameter tuning easier, and it uses Matplotlib under the hood. It supports the visualizations needed for different interpretation approaches, depending on the model being analyzed, with separate visualizations for regression and classification models that help analyze their respective behavior.

Different visualizations from the yellowbrick library.

Why interpretability is, and should be, an important part of the process

Until some time ago, even I used to wonder why interpretability is considered so essential and is a topic of interest to so many data scientists and machine learning engineers. What difference does it make to know why the machine predicts the way it does? Why not just let it be a black box? I found the answers only after working extensively with models and datasets.

From the difference I have seen in the final product, the answer to all these questions is simple: accuracy and business value addition.

Even though we place a major part of our predictions in the hands of machine learning code, it is important to know what actually benefits the prediction and what doesn’t. But finding out what is happening under the hood is tedious and time-consuming. Interpretability tools peel back one layer of the process and let us peek one level deeper.

This, in turn, accelerates product development, because the developer now has a view of what helps the process and what doesn’t. That creates scope for better and faster development, which in turn can bring in more business revenue and add significant business value.

If you liked reading this, please share it forward.

For the detailed codes from my project, refer to my Github profile:

Follow me on LinkedIn here:
