What is it about?
When evaluating explanations of artificial intelligence systems, saliency maps highlight which parts of the input data were most important to a decision. These heat maps are then compared against a known ground truth to measure their overlap: the greater the overlap, the better the explanation. We propose instead asking users to select the areas they find most important, producing graduated maps of where humans pay the most attention. Our benchmark validates that these graduated maps capture different information than the ground-truth baseline.
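To make the comparison step concrete: one common way to score the overlap between a saliency heat map and a ground-truth mask is intersection-over-union (IoU). The paper does not specify its exact metric in this summary, so the sketch below is an illustrative assumption, not the benchmark's actual scoring code.

```python
import numpy as np

def saliency_overlap(saliency, ground_truth, threshold=0.5):
    """Overlap between a thresholded saliency map and a binary
    ground-truth mask, measured as intersection-over-union.

    NOTE: IoU is one common overlap measure; the benchmark's actual
    metric may differ -- this function is only an illustration.
    """
    pred = np.asarray(saliency) >= threshold      # binarize the heat map
    truth = np.asarray(ground_truth).astype(bool)  # ground-truth mask
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return intersection / union if union else 0.0

# Hypothetical 2x2 example: the heat map marks two pixels as salient,
# the ground truth marks one of them -> overlap 1 / 2 = 0.5
heat = np.array([[0.9, 0.2],
                 [0.7, 0.1]])
gt = np.array([[1, 0],
               [0, 0]])
score = saliency_overlap(heat, gt)  # 0.5
```

The same idea extends to graduated human-attention maps by comparing weighted rather than binary regions, which is part of what makes them carry different information than a pixel-wise mask.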
Why is it important?
We demonstrate how to capture crowdsourced attention maps in both the text and image domains. We confirm that these maps contain information distinct from typical pixel-wise ground-truth baselines, and we show how they can be used to extract and examine human biases in a dataset.
Read the Original
This page is a summary of: Quantitative Evaluation of Machine Learning Explanations: A Human-Grounded Benchmark, April 2021, ACM (Association for Computing Machinery), DOI: 10.1145/3397481.3450689.
You can read the full text:
Recorded Conference Presentation
This was the recording that was played at the virtual IUI conference in April 2021.
Benchmark's Github Repository
In this paper, we asked Amazon Mechanical Turk workers to annotate images and text. The collection of annotations is available in our open-source GitHub repository.
A thumbnail and summary of this project made for the virtual IUI'21 Conference