What is it about?

We study the effect of the TanH, ReLU, and ELU activation functions on neural network (NN) loss landscapes. We observe that the choice of activation function has no effect on the number of local minima, but significantly changes the connectivity between them; ELU yields superior generalisation performance. As a bonus, we show that narrow valleys in the NN loss landscape contain (1) saturated neurons and (2) self-regularised solutions.
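
For readers unfamiliar with the three activation functions compared in the paper, below is a minimal sketch of their standard textbook definitions in Python/NumPy. The ELU scaling parameter alpha (set to 1.0 here, the common default) is an assumption for illustration and is not taken from the paper.

import numpy as np

def tanh(x):
    # Hyperbolic tangent: saturates at -1 and +1 for large |x|.
    return np.tanh(x)

def relu(x):
    # Rectified linear unit: zero for negative inputs, identity otherwise.
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    # Exponential linear unit: smooth exponential branch for negative inputs,
    # identity for positive inputs; alpha=1.0 is the common default.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

# Compare the three activations on a few sample inputs.
xs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("tanh:", tanh(xs))
print("relu:", relu(xs))
print("elu: ", elu(xs))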

Why is it important?

Previous research assumed that solutions that generalise well can be found in flat minima. We challenge this assumption. We also explain why the choice of activation function can have a noticeable effect on the success of NN training.

Perspectives

A version of this article was featured as a chapter in my PhD thesis. I am happy to have it published at last!

Anna Bosman
University of Pretoria

Read the Original

This page is a summary of: Empirical Loss Landscape Analysis of Neural Network Activation Functions, July 2023, ACM (Association for Computing Machinery). DOI: 10.1145/3583133.3596321.