What is it about?

Reducing the supply voltage can save energy, but it may also trigger timing errors. This paper proposes an adaptive hardware/software error management policy that optimistically scales the supply voltage beyond the edge of safe operation to save more energy. The resulting errors are managed with a low-complexity hardware scheme borrowed from data synchronization management.

Why is it important?

The proposed technique is simple and generally applicable. It can handle systems experiencing massive errors (which may be triggered by scaling beyond a "critical operating point") as well as systems experiencing intermittent timing errors. For intermittent errors, it allows the voltage to be lowered beyond the point of first failure for better energy savings. Experiments on an embedded platform show that our technique achieves a 57% energy improvement compared to using voltage guardbands, and an extra 21-24% improvement over existing state-of-the-art error tolerance solutions, at a nominal area and time overhead.

Perspectives

This work offered me an opportunity to re-evaluate Hardware Transactional Memory (HTM) beyond its traditional use of managing data synchronization conflicts. We had previously explored techniques to adapt HTM for embedded systems (where both performance and energy efficiency must be carefully balanced), but repurposing HTM to manage timing errors was an exciting new venture.

Iris Bahar
Brown University

Read the Original

This page is a summary of: Edge-TM, ACM Transactions on Embedded Computing Systems, October 2017, ACM (Association for Computing Machinery),
DOI: 10.1145/3126556.
