What is it about?
XEngine schedules the operators of a neural network onto heterogeneous systems, such as a CPU-GPU setup, taking into account the memory constraints and computational performance of each device. The core of our approach is a mixed-integer quadratic program (MIQP). Its solution is a schedule that determines on which device each tensor is computed and whether the tensor is kept in memory until it is reused in the backward pass. All other tensors are discarded and recomputed when needed, either directly as an operator input or in order to recompute another tensor.
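To give a feel for the keep-versus-recompute trade-off, here is a minimal toy sketch (not the paper's MIQP): a single device, a linear chain of operators, and a brute-force search over which forward tensors to keep under a memory budget. All costs, sizes, and the budget are made-up illustrative numbers; a discarded tensor is rebuilt from its nearest kept predecessor.

```python
from itertools import combinations

# Toy linear chain (hypothetical numbers, single device).
compute = [4, 2, 6, 3, 5]   # cost to (re)run operator i
size = [8, 1, 9, 2, 7]      # memory needed to keep operator i's output
budget = 12                 # memory available for kept forward tensors

def recompute_cost(i, kept):
    """Cost to rebuild tensor i for the backward pass by re-running the
    chain from the nearest kept predecessor (the input is always available)."""
    j = max((k for k in kept if k < i), default=-1)
    return sum(compute[j + 1 : i + 1])

best = None
n = len(compute)
for r in range(n + 1):
    for kept in combinations(range(n), r):
        if sum(size[k] for k in kept) > budget:
            continue  # violates the memory constraint
        # Extra cost incurred in the backward pass for discarded tensors.
        cost = sum(0 if i in kept else recompute_cost(i, set(kept))
                   for i in range(n))
        if best is None or cost < best[0]:
            best = (cost, kept)

print("kept tensors:", best[1], "extra recompute cost:", best[0])
```

Even this toy version enumerates 2^n keep-sets per device assignment, which is why the paper formulates the full problem, with multiple devices and general recomputation paths, as an MIQP rather than an exhaustive search.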
Why is it important?
Deciding which forward tensors to keep in memory, and for how long, is non-trivial even when training a linear neural network on a single device. For more complex networks with skip connections and multiple devices, the problem becomes much harder: the number of possible decisions quickly explodes, since every tensor can be recomputed from any earlier tensor along the computational chain of the network architecture, and every tensor could be stored on any device.
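A rough, back-of-the-envelope illustration of this explosion: if each of n forward tensors can be placed on any of d devices and independently kept or discarded, there are already at least (2d)^n candidate schedules, before even counting the choice of recomputation paths. The function name below is our own, for illustration only.

```python
# Rough lower bound on the decision space: device choice (d options)
# times keep/discard (2 options) per tensor, ignoring recomputation
# paths, which multiply the count further.
def schedule_lower_bound(n_tensors: int, n_devices: int) -> int:
    return (2 * n_devices) ** n_tensors

# A 50-operator network on a CPU-GPU pair already yields 4^50 schedules.
for n in (10, 50):
    print(n, schedule_lower_bound(n, n_devices=2))
```

This is why exhaustive enumeration is hopeless for realistic networks and an optimization formulation such as the MIQP is needed.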
Read the Original
This page is a summary of: XEngine: Optimal Tensor Rematerialization for Neural Networks in Heterogeneous Environments, ACM Transactions on Architecture and Code Optimization, December 2022, ACM (Association for Computing Machinery), DOI: 10.1145/3568956.