What is it about?

Soft errors, or transient bit flips, can cause unexpected behavior in software programs. Traditional methods against soft errors incur large runtime or memory overheads, making them impractical for neural network applications with strict resource constraints. This paper introduces an efficient protection technique that can correct any single fault in a convolutional neural network (CNN), regardless of the fault's location and timing.


Why is it important?

As neural networks are deployed even in safety-critical applications, malfunctions in the networks can lead to catastrophic consequences. We therefore apply an algorithm-based fault tolerance (ABFT) method, combined with the idea of Hamming codes, to correct faults in the weights and biases within each layer of the network. We also add a carefully crafted duplication-based roll-back recovery for faults in the intermediate inputs and outputs between layers. This allows us to achieve near-perfect fault coverage with 27% lower runtime overhead and minimal memory overhead compared to traditional TMR (triple modular redundancy)-based methods.
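To illustrate the general ABFT idea behind checksum-based correction of weights, here is a minimal sketch: row and column checksums over a weight matrix detect a mismatch, the intersection of the mismatched row and column locates the corrupted element, and the checksum restores its value. This is a generic ABFT example for intuition only, not the paper's exact Hamming-code-based scheme; all function names are hypothetical.

```python
# Generic ABFT-style single-fault correction on a 2-D weight matrix.
# Not the paper's actual algorithm; a textbook checksum sketch.

def make_checksums(w):
    """Compute per-row and per-column sums of a 2-D weight matrix."""
    rows = [sum(r) for r in w]
    cols = [sum(c) for c in zip(*w)]
    return rows, cols

def correct_single_fault(w, rows, cols, tol=1e-6):
    """Locate and fix at most one corrupted element in place.

    Returns (i, j) of the corrected element, or None if no fault is found.
    """
    cur_rows, cur_cols = make_checksums(w)
    bad_r = [i for i, (a, b) in enumerate(zip(cur_rows, rows)) if abs(a - b) > tol]
    bad_c = [j for j, (a, b) in enumerate(zip(cur_cols, cols)) if abs(a - b) > tol]
    if not bad_r or not bad_c:
        return None                       # no correctable data fault detected
    i, j = bad_r[0], bad_c[0]             # fault sits at the intersection
    w[i][j] -= cur_rows[i] - rows[i]      # subtract the fault's delta
    return (i, j)

# Usage: protect a small weight matrix, inject a fault, then recover.
W = [[0.5, -1.0, 2.0],
     [1.5,  0.25, -0.75]]
rsum, csum = make_checksums(W)
W[1][2] += 8.0                            # simulated bit-flip corruption
loc = correct_single_fault(W, rsum, csum)
```

The same principle scales to convolution and fully connected layers, where checksums can be folded into the layer computation itself so that detection adds little extra work.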

Read the Original

This page is a summary of: Maintaining Sanity: Algorithm-based Comprehensive Fault Tolerance for CNNs, June 2024, ACM (Association for Computing Machinery),
DOI: 10.1145/3649329.3657355.
