What is it about?

Machine learning is being deployed in complex, real-world scenarios where errors can lead to fatal accidents, such as autonomous vehicles striking pedestrians. Unfortunately, the data used to train models in these scenarios can be rife with errors, which can lead to downstream safety risks in trained models. To address these issues, we propose a new abstraction, learned observation assertions, and implement it in a system called Fixy. Fixy learns feature distributions that specify likely and unlikely values (e.g., that a speed of 30mph is likely but 300mph is unlikely) and uses these feature distributions to score labels for potential errors. We show that Fixy can automatically rank potential errors in real datasets with up to 2x higher precision compared to recent work on model assertions and standard techniques such as uncertainty sampling. Furthermore, Fixy can uncover labeling errors in 70% of scenes in a popular autonomous vehicle dataset.
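The idea of scoring labels against learned feature distributions can be sketched as follows. This is a minimal illustration, not Fixy's implementation: the data, the Gaussian fit, and the `error_score` function are all hypothetical stand-ins for the richer feature distributions the system learns.

```python
import statistics

# Made-up training data: vehicle speeds (mph) observed in labeled scenes.
observed_speeds = [28, 31, 35, 29, 33, 30, 27, 34, 32, 30]

# Fit a simple Gaussian to the feature (a stand-in for a learned distribution).
mu = statistics.mean(observed_speeds)
sigma = statistics.stdev(observed_speeds)

def error_score(speed: float) -> float:
    """Higher score = more surprising value = more likely a labeling error."""
    return abs(speed - mu) / sigma

# Rank candidate labels so the most suspicious surface first:
# a 300 mph label is flagged well ahead of a 30 mph one.
candidates = [30.0, 300.0]
ranked = sorted(candidates, key=error_score, reverse=True)
```

In this sketch, a reviewer would inspect the highest-scoring labels first, which is how ranking potential errors by distributional unlikelihood saves annotation effort.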

Why is it important?

Errors in ML pipelines have already caused fatal accidents. We propose a system that helps find these errors, improving the reliability of deployed ML.

Read the Original

This page is a summary of: Finding Label and Model Errors in Perception Data With Learned Observation Assertions, June 2022, ACM (Association for Computing Machinery),
DOI: 10.1145/3514221.3517907.