What is it about?

Dvé is a memory system architecture that improves the reliability and performance of DRAM main memory. It improves reliability by replicating memory in hardware across two sockets of a cache-coherent NUMA system. It also improves performance by modifying the NUMA cache coherence protocol so that memory can be read from the nearest replica.

Why is it important?

What does it provide?
◉ Higher memory reliability than commercially available high-reliability memory products such as Chipkill ECC, Intel Partial Address Space Mirroring, and IBM RAIM
◉ Better memory performance in a multi-socket NUMA architecture (lower memory access latency and higher bandwidth)
◉ On-demand replication for programs at runtime, using the idle memory present in systems
◉ Strict recovery semantics and strongly consistent replication

How does it achieve it?
▣ Replicates memory on two different sockets of a multi-socket NUMA system
▣ Uses cache coherence protocols (allow-based and deny-based) to provide "Coherent Replication"
▣ Builds on existing reliability mechanisms and coherence protocols for graceful degradation and fallback on failure
▣ Maps data in DRAM between replicas in a thermal-risk-aware manner to reduce failures
▣ Sets out the application interface and OS mechanisms required to replicate memory at runtime
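The read and write paths described above can be illustrated with a small software model. This is only a toy sketch, not the hardware coherence protocol from the paper: the names `Replica`, `ReplicatedMemory`, and `UncorrectableError` are hypothetical, fault injection through a `failed` set is an assumption made for illustration, and the step that repairs the local copy after a fallback read is likewise an assumption of this model. It shows writes reaching both replicas, reads served from the requester's own socket, and fallback to the other socket when the local copy is bad.

```python
from dataclasses import dataclass, field


class UncorrectableError(Exception):
    """Raised when a replica cannot return valid data for an address."""


@dataclass
class Replica:
    """Toy stand-in for the DRAM attached to one socket (hypothetical)."""
    socket: int
    cells: dict = field(default_factory=dict)
    failed: set = field(default_factory=set)  # addresses with injected faults

    def read(self, addr):
        if addr in self.failed:
            raise UncorrectableError(f"socket {self.socket}, addr {addr:#x}")
        return self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value
        # Assumption of this toy model: rewriting a cell clears its fault.
        self.failed.discard(addr)


class ReplicatedMemory:
    """Two replicas, one per socket, kept identical on every write."""

    def __init__(self):
        self.replicas = {0: Replica(0), 1: Replica(1)}

    def write(self, addr, value):
        # Strongly consistent replication: update both copies.
        for replica in self.replicas.values():
            replica.write(addr, value)

    def read(self, addr, from_socket):
        """Return (value, socket that served the read)."""
        local = self.replicas[from_socket]
        remote = self.replicas[1 - from_socket]
        try:
            # Common case: the nearest replica serves the read.
            return local.read(addr), local.socket
        except UncorrectableError:
            # Fallback: fetch from the other socket, then repair locally.
            value = remote.read(addr)
            local.write(addr, value)
            return value, remote.socket
```

In this model, a read from either socket is normally served locally, which is the source of the latency benefit. A fault on one replica is masked as long as the other copy is intact, and if both copies are faulty the error propagates to the caller.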

Perspectives

Dvé is inspired by distributed system deployments, where replication is frequently employed for fault tolerance and performance. We bring this insight into shared memory architecture. Dvé's philosophy is rooted in a holistic design approach that leverages the time-tested "end-to-end argument". We advocate exposing faults in DRAM memory to the highest-level end point of memory (i.e., the memory controller level), thereby subsuming all other types of errors. The project's name, Dvé, is derived from the Sanskrit word द्वे, which means "the two", referring here to the dual benefits of replication. Dvé can be summarized simply as: Replicate DRAM + Reduce Memory Errors + Improve Performance.

Adarsh Patil
University of Edinburgh

Read the Original

This page is a summary of: Dvé: Improving DRAM Reliability and Performance On-Demand via Coherent Replication, June 2021, Institute of Electrical & Electronics Engineers (IEEE),
DOI: 10.1109/isca52012.2021.00048.