What is it about?

Machine learning models are widely used for decision making. Sometimes the reasoning behind a decision is not so important. For example, when you ask a voice assistant to set a timer, you do not need to know how it interpreted your voice command. However, models are also used in high-stakes domains, where predictions determine whether you can get a loan or whether you are flagged as a likely fraudster. In those cases, it is essential that we can check how the decision was made.

To analyse or explain such models, various explanation techniques exist. One set of techniques is called feature importance, which provides a score for each characteristic of the model input (a "feature"). With these scores, you can examine which characteristics of the data are important for the model’s decision. For example, a feature importance explanation can indicate that the amount of a bank transaction was an important factor in predicting that the transaction is fraudulent.

One major problem is that feature importance is an umbrella term covering two types of techniques, gradient-based and ablation-based feature importance, that have completely different mechanics. As a result, the scores these techniques provide can differ significantly, and they are easy to misinterpret when you do not consider which technique was used. It is therefore essential that data science professionals are aware of both techniques, as well as their mechanics and affordances.

We examined whether data science professionals are aware of the techniques, what they expect of feature importance in general, and whether they subscribe more to one of the techniques. We found that our participants had some expectations that can be detrimental, some that were incompatible with other expectations, and some that are not encoded in current techniques.
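To make the difference between gradient-based and ablation-based feature importance concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `model`, the feature values, the all-zero baseline used for ablation, and the helper names `gradient_importance` and `ablation_importance`. Here, gradient-based importance is taken to mean "how much does the prediction change for a tiny change in the feature", and ablation-based importance "how much does the prediction change when the feature is replaced by a baseline value".

```python
import math


def model(x):
    """Hypothetical toy score: saturates in feature 0, linear in feature 1."""
    return math.tanh(3.0 * x[0]) + 0.5 * x[1]


def gradient_importance(f, x, eps=1e-6):
    """Local sensitivity: central finite-difference estimate of df/dx_i."""
    scores = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        scores.append((f(up) - f(down)) / (2 * eps))
    return scores


def ablation_importance(f, x, baseline):
    """Change in prediction when feature i is replaced by its baseline value."""
    scores = []
    for i in range(len(x)):
        ablated = list(x)
        ablated[i] = baseline[i]
        scores.append(f(x) - f(ablated))
    return scores


x = [2.0, 1.0]          # assumed input, e.g. (scaled amount, scaled hour)
baseline = [0.0, 0.0]   # assumed "feature absent" reference values

grad_scores = gradient_importance(model, x)
abl_scores = ablation_importance(model, x, baseline)

print("gradient-based:", grad_scores)
print("ablation-based:", abl_scores)
```

In this toy setup the two techniques rank the features in opposite order. The model has saturated with respect to feature 0, so a tiny change barely moves the prediction and the gradient-based score is close to zero, while removing the feature entirely changes the prediction a lot, so the ablation-based score is large. Reading one kind of score as if it were the other would give the wrong picture of what the model relies on.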

Why is it important?

Our findings show that what data science professionals expect or assume does not always match what the techniques do. We also saw that some participants had a particularly one-sided view of feature importance. These two kinds of misunderstanding may lead to misinterpretations of feature importance scores, which in turn may lead to misinterpretations of the model’s decisions. It is important to create more awareness of the correct interpretation of feature importance explanations, as model decisions can have a high impact on people’s lives, especially in high-stakes domains.


This work is the culmination of five years of PhD work on the topic of explainable AI. The insights I gained into the workings of feature importance techniques led me to realize that some users of these techniques, both in industry and in academia, inadvertently misinterpreted what the techniques really do. So we set out to investigate interpretation in more detail. I hope this work will help both practitioners and researchers to make more informed decisions on explaining their models :).

Dennis Collaris
Technische Universiteit Eindhoven

Read the Original

This page is a summary of: Characterizing Data Scientists’ Mental Models of Local Feature Importance, October 2022, ACM (Association for Computing Machinery), DOI: 10.1145/3546155.3546670.