What is it about?

Shannon entropy, introduced by Claude Shannon more than 70 years ago, is a longstanding concept widely used in areas ranging from cryptography and linguistics to informatics and telecommunications. It measures the information content of digital signals in many applications. This paper introduces a way of deploying this concept in automated learning from data contaminated with outliers, for a broad class of methods from machine learning and artificial intelligence, and it leads to a simple analytical solution to the problem of outlier and anomaly learning. This solution significantly reduces computational cost and improves the quality of the learning model's predictions.
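As a rough illustration of the entropic weighting idea (a minimal sketch, not the paper's exact formulation): attach a weight to every data point, penalize the weighted fitting error, and regularize the weights with their Shannon entropy; the resulting problem has a closed-form "softmin" solution, so the weights can be computed in one pass without iterative optimization. The function name, the parameter eps, and the specific objective below are illustrative assumptions.

import numpy as np

def entropic_instance_weights(errors, eps=1.0):
    """Closed-form instance weights from entropy-regularized weighting.

    Minimizes  sum_i w_i * errors[i] + eps * sum_i w_i * log(w_i)
    subject to w_i >= 0 and sum_i w_i = 1, which has the analytic solution
    w_i proportional to exp(-errors[i] / eps) (a "softmin" over the errors).
    Instances with large errors (likely outliers) receive weights near zero.
    """
    errors = np.asarray(errors, dtype=float)
    logits = -(errors - errors.min()) / eps   # shift by the minimum for numerical stability
    w = np.exp(logits)
    return w / w.sum()

# Toy usage: one grossly mismeasured point gets a near-zero weight,
# so a weighted model fit would largely ignore it.
errors = np.array([0.10, 0.20, 0.15, 8.00, 0.12])
print(entropic_instance_weights(errors, eps=0.5))

Because the weights are given by an explicit formula rather than an inner optimization loop, this kind of down-weighting adds essentially no computational overhead to the underlying learning method.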


Why is it important?

For example, when dealing with the human organism and diseases, the number of variables and patient characteristics, known and unknown, can easily outnumber the available data to analyze. Moreover, especially in biomedical applications, available data is frequently contaminated with anomalies, outliers, mismeasurements, and mislabeling. This makes learning from such data particularly challenging.

Perspectives

The computational strategy proposed in the paper was shown to improve learning from data and the accuracy of predictions in the presence of data anomalies and outliers, by exploiting the potential of novel mathematics-driven learning methods. A field with huge potential for adopting this sort of strategy is biomedicine and healthcare. For instance, such methods hold large, as yet untapped, potential for improving the diagnostics of cardiovascular diseases (CVDs): according to the World Health Organization, CVDs are responsible for approximately one-third of global mortality, accounting for around 18 million deaths every year.

Illia Horenko
Università della Svizzera italiana

Read the Original

This page is a summary of: Cheap robust learning of data anomalies with analytically solvable entropic outlier sparsification, Proceedings of the National Academy of Sciences, February 2022.
DOI: 10.1073/pnas.2119659119.
You can read the full text:


