What is it about?
Most strategies for training deep neural networks (DNNs) currently rely on gradient descent. Although popular, gradient-based methods are far from perfect and suffer from well-known drawbacks such as vanishing gradients, poor conditioning, biological implausibility, and limited parallelism. To address these fundamental drawbacks, a new framework is proposed for training deep neural networks based on the alternating direction method of multipliers (ADMM), with a convergence guarantee regardless of how the method is initialized.
Why is it important?
Gradient-free optimization for training DNNs is a young yet promising research area. For example, we focus on alternating-optimization-based methods for training deep neural networks, which first transform the DNN training problem into an equivalent, decomposable one, and then split it into subproblems, one per layer, each of which can be solved separately and often admits an analytical solution.
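To make the layer-wise splitting concrete, here is a toy, self-contained sketch of penalty-based alternating minimization for a two-layer ReLU network. It is only an illustration of the general idea under assumed names (rho, z1_update, the synthetic data), not the paper's dlADMM algorithm, which additionally maintains dual variables and uses backward-forward sweeps with quadratic approximations.

```python
# Toy sketch: layer-wise alternating minimization for a two-layer ReLU network.
# Constraints z1 = X @ W1, a1 = relu(z1), z2 = a1 @ W2 are relaxed into
# quadratic penalties with weight rho, so each block has a simple minimizer.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: n samples, d inputs, h hidden units, k outputs.
n, d, h, k = 200, 10, 16, 1
X = rng.normal(size=(n, d))
y = np.sin(X @ rng.normal(size=(d, k)))

relu = lambda t: np.maximum(t, 0.0)

# Initialize weights and the auxiliary per-layer variables.
W1 = 0.1 * rng.normal(size=(d, h))
W2 = 0.1 * rng.normal(size=(h, k))
z1 = X @ W1
a1 = relu(z1)
z2 = a1 @ W2
rho = 1.0

def z1_update(p, a):
    """Element-wise minimizer of (z - p)^2 + (a - relu(z))^2 via case analysis."""
    z_pos = np.maximum((p + a) / 2.0, 0.0)      # branch z >= 0 (relu(z) = z)
    f_pos = (z_pos - p) ** 2 + (a - z_pos) ** 2
    z_neg = np.minimum(p, 0.0)                  # branch z <= 0 (relu(z) = 0)
    f_neg = (z_neg - p) ** 2 + a ** 2
    return np.where(f_pos <= f_neg, z_pos, z_neg)

for it in range(50):
    # Block-coordinate updates: each step exactly minimizes the penalized
    # objective with respect to one block (least squares or case analysis).
    W1 = np.linalg.lstsq(X, z1, rcond=None)[0]          # fit z1 ~= X @ W1
    z1 = z1_update(X @ W1, a1)                          # couple z1 to a1 = relu(z1)
    a1 = (relu(z1) + z2 @ W2.T) @ np.linalg.inv(np.eye(h) + W2 @ W2.T)
    W2 = np.linalg.lstsq(a1, z2, rcond=None)[0]         # fit z2 ~= a1 @ W2
    z2 = (2 * y + rho * a1 @ W2) / (2 + rho)            # balance loss vs. penalty

pred = relu(X @ W1) @ W2
print("feed-forward MSE after alternating updates:", float(np.mean((pred - y) ** 2)))
```

Each update touches only one layer's variables, which is what makes the subproblems small, parallelizable across layers, and free of backpropagated gradients.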
Read the Original
This page is a summary of: ADMM for Efficient Deep Learning with Global Convergence, July 2019, ACM (Association for Computing Machinery),
DOI: 10.1145/3292500.3330936.