What is it about?
Modern deep neural networks for image classification have achieved superhuman performance. Yet, the complex internal details of trained networks have forced most practitioners and researchers to regard them as black boxes offering little that can be understood. This paper considers in detail a now-standard training methodology: driving the cross-entropy loss to zero, and continuing long after the classification error is already zero. Applying this methodology to an authoritative collection of standard deepnets and datasets, we observe the emergence of a simple and highly symmetric geometry of the deepnet features and of the deepnet classifier, a phenomenon we call neural collapse (NC), and we document important benefits that this geometry conveys, thereby helping us understand an important component of the modern deep learning training paradigm.
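For concreteness, here is a minimal sketch (in PyTorch, on a toy synthetic problem, not the paper's actual networks or datasets) of the training methodology described above: the optimizer keeps driving the cross-entropy loss toward zero even after the classification error has already reached zero. Every model, dataset, and hyperparameter below is an illustrative placeholder.

```python
# Minimal sketch of "terminal phase of training" (TPT): do NOT stop at
# zero classification error; keep minimizing the cross-entropy loss.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 20)             # toy inputs (stand-in for images)
y = torch.randint(0, 4, (512,))      # toy labels, C = 4 classes

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(2000):
    opt.zero_grad()
    logits = model(X)
    loss = loss_fn(logits, y)
    loss.backward()
    opt.step()
    err = (logits.argmax(1) != y).float().mean().item()
    # Training continues past err == 0: this post-zero-error stretch is
    # the terminal phase during which neural collapse emerges.
    if epoch % 200 == 0:
        print(f"epoch {epoch}: error {err:.3f}, loss {loss.item():.4f}")
```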
Why is it important?
The standard workflow of empirical deep learning can be viewed as a series of arbitrary steps that happened to help win prediction challenge contests and then spread because of their popularity among contest practitioners. Careful analysis, providing a full understanding of the effects and benefits of each workflow component, was never the point. One of these standard practices is training beyond zero error to zero loss, i.e., the terminal phase of training (TPT). In this work, we show that TPT benefits today's standard deep learning training paradigm by leading to the pervasive phenomenon of NC, as illustrated in the sketch below. Moreover, this work puts older results on a different footing, expanding our understanding of their contributions. Finally, because the geometry is mathematically precise, the door is open to new formal insights.
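To make the NC geometry concrete, the sketch below (NumPy, using synthetic stand-in features rather than activations from a real trained network) computes two diagnostics one might track: the ratio of within-class to between-class feature variability, which shrinks toward zero under collapse, and the pairwise cosines of the centered class means, which approach -1/(C-1), the value for a simplex equiangular tight frame. The feature data here are hypothetical placeholders.

```python
# Hedged sketch of neural-collapse diagnostics on stand-in features.
import numpy as np

rng = np.random.default_rng(0)
C, d, n = 4, 16, 100                     # classes, feature dim, samples/class
means = rng.standard_normal((C, d))      # stand-in class means
feats = np.concatenate([m + 0.01 * rng.standard_normal((n, d)) for m in means])
labels = np.repeat(np.arange(C), n)

mu = np.stack([feats[labels == c].mean(0) for c in range(C)])  # class means
mu_centered = mu - mu.mean(0)                                  # remove global mean

# Variability collapse: within-class spread relative to between-class spread.
within = np.mean([np.var(feats[labels == c], axis=0).sum() for c in range(C)])
between = np.var(mu_centered, axis=0).sum()
print("within/between variability:", within / between)

# Simplex-ETF check: for a collapsed network, off-diagonal cosines of the
# centered class means approach -1/(C-1), i.e., -1/3 for C = 4 classes.
unit = mu_centered / np.linalg.norm(mu_centered, axis=1, keepdims=True)
cos = unit @ unit.T
print("off-diagonal cosines:", cos[~np.eye(C, dtype=bool)])
```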
Read the Original
This page is a summary of: Prevalence of neural collapse during the terminal phase of deep learning training, Proceedings of the National Academy of Sciences, September 2020,
DOI: 10.1073/pnas.2015509117.