What is it about?

We present BAASH, Blockchain-As-A-Service for HPC, deployable in a plug-n-play fashion. BAASH bridges the HPC-blockchain gap with two key components: (i) lightweight consensus protocols designed for HPC's shared-storage architecture, and (ii) a new fault-tolerance mechanism that compensates for MPI to guarantee distributed resiliency. We have implemented a prototype system and evaluated it with more than two million transactions on a 500-core HPC cluster. Results show that the prototype significantly outperforms vanilla blockchain systems and exhibits strong reliability with MPI.
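
To make the shared-storage idea concrete, here is a minimal sketch of how a block might be committed when compute nodes are diskless and persistence lives on remote shared storage. It is an illustration under stated assumptions, not BAASH's actual protocol: the simple majority vote, the in-memory ledger, the Python list standing in for the shared file system, and every name (`ComputeNode`, `commit_block`, `recover`) are hypothetical.

```python
from dataclasses import dataclass, field
import hashlib
import json


def block_hash(prev_hash: str, transactions: list) -> str:
    """Hash a block's transactions together with its predecessor's hash."""
    payload = json.dumps({"prev": prev_hash, "txs": transactions}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class ComputeNode:
    """Hypothetical diskless node: its ledger copy lives only in memory."""
    node_id: int
    alive: bool = True
    ledger: list = field(default_factory=list)

    def vote(self, prev_hash: str, transactions: list, digest: str) -> bool:
        # A node accepts a block only if it is up and the digest checks out.
        return self.alive and block_hash(prev_hash, transactions) == digest


def commit_block(nodes: list, shared_storage: list, transactions: list) -> bool:
    """Append a block only if a strict majority of nodes accept it."""
    prev_hash = shared_storage[-1]["hash"] if shared_storage else "genesis"
    digest = block_hash(prev_hash, transactions)
    voters = [n for n in nodes if n.vote(prev_hash, transactions, digest)]
    if len(voters) * 2 <= len(nodes):
        return False  # no majority: nothing is written anywhere
    block = {"prev": prev_hash, "txs": transactions, "hash": digest}
    for node in voters:
        node.ledger.append(block)  # in-memory replica on each accepting node
    shared_storage.append(block)   # stand-in for the remote shared file system
    return True


def recover(node: ComputeNode, shared_storage: list) -> None:
    """Rebuild a restarted node's in-memory ledger from shared storage."""
    node.ledger = list(shared_storage)
    node.alive = True
```

In this toy model, a node that was down during a commit holds no copy of the block until `recover` reloads the chain from the shared store, which is the role the single persistent copy plays when compute nodes have no local disks.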

Why is it important?

Distributed resiliency is becoming paramount for alleviating the growing costs of data movement and I/O while preserving data accuracy in HPC systems. This paper proposes adopting blockchain-like decentralized protocols to achieve such resiliency. The key challenge lies in the mismatch between the systems that blockchains target (e.g., shared-nothing, loosely coupled, TCP/IP stack) and HPC's unique design of storage subsystems, resource allocation, and programming models.

Perspectives

This paper proposes two key techniques to enable a blockchain service in HPC systems. First, a much-needed set of lightweight, scalable consensus protocols is designed to account for both the diskless compute nodes and the remote shared storage in HPC. Second, a must-have property of blockchain in an HPC environment, namely reliability in the presence of MPI, is guaranteed by multiple layers of subsystems, from background daemon services to callable routines to applications. Together, the HPC-aware consensus protocols and MPI-compatible reliability, under the framework we call BAASH, enable a blockchain service for HPC systems. Our future research will (i) incorporate a parallel data integration mechanism for cross-blockchain or heterogeneous data sources, and (ii) support eventual consistency so that readers can see data from remote writers in exascale systems. Our hope is that BAASH can serve as a starting point for a new line of systems research on decentralized services crafted for HPC systems.

Abdullah Al Mamun
University of Nevada Reno

Read the Original

This page is a summary of: BAASH, November 2021, ACM (Association for Computing Machinery),
DOI: 10.1145/3458817.3476155.