Improving the Performance of Deduplication-Based Backup Systems via Container Utilization Based Hot Fingerprint Entry Distilling

Datong Zhang; Yuhui Deng; Yi Zhou; Yifeng Zhu; Xiao Qin

doi:10.1145/3459626

What is it about?

Data deduplication techniques build an index of fingerprint entries to identify and eliminate duplicate copies of repeated data. Two challenging issues in data deduplication are the bottleneck of disk-based index lookup and the data fragmentation caused by eliminating duplicate chunks. To alleviate both, deduplication-based backup systems generally use containers that store contiguous chunks together with their fingerprints, which preserves data locality. This measure alone is still inadequate.
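To make the terms above concrete, here is a minimal sketch of a fingerprint index with container-based storage. It is an illustration under stated assumptions, not the system described in the paper: the function name `deduplicate`, the use of SHA-1 as the fingerprint, the in-memory dictionary standing in for the on-disk index, and the container capacity of four chunks are all hypothetical choices.

```python
import hashlib

# Hypothetical capacity: number of chunks one container may hold.
CONTAINER_CAPACITY = 4


def fingerprint(chunk: bytes) -> str:
    """Return a fingerprint for a chunk (SHA-1 is an assumption here)."""
    return hashlib.sha1(chunk).hexdigest()


def deduplicate(chunks, index=None, containers=None):
    """Store only chunks whose fingerprint is not yet in the index.

    index      maps fingerprint -> container id (an in-memory stand-in
               for the disk-based index).
    containers is a list of containers; each container is a list of
               (fingerprint, chunk) pairs written in arrival order, so
               contiguous chunks of a backup stream stay together.
    Returns the backup recipe (the ordered fingerprints of the stream)
    together with the updated index and containers.
    """
    index = {} if index is None else index
    containers = [] if containers is None else containers
    recipe = []
    for chunk in chunks:
        fp = fingerprint(chunk)
        if fp not in index:
            # New chunk: open a fresh container when the current one is full.
            if not containers or len(containers[-1]) >= CONTAINER_CAPACITY:
                containers.append([])
            containers[-1].append((fp, chunk))
            index[fp] = len(containers) - 1
        # Duplicate chunks are not stored again; only the recipe refers to them.
        recipe.append(fp)
    return recipe, index, containers
```

Because a later backup's recipe can point at chunks scattered over many old containers, restoring it may touch containers that hold only a few useful chunks. That is the fragmentation issue named above.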

Why is it important?

We propose a container utilization based hot fingerprint entry distilling strategy to improve the performance of deduplication-based backup systems. We divide the index into three parts: hot fingerprint entries, fragmented fingerprint entries, and useless fingerprint entries. A container whose utilization is smaller than a given threshold is called a sparse container. Fingerprint entries that point to non-sparse containers are hot fingerprint entries. Each remaining fingerprint entry is classified as a fragmented fingerprint entry if it matches the fingerprint of any forthcoming backup chunk, and as a useless fingerprint entry otherwise.
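The three-way split described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `FingerprintEntry`, `EntryClass`, and `classify_entries` are hypothetical, container utilization is taken as a precomputed value between 0 and 1 (this summary does not say how it is measured), and the default threshold of 0.5 is an arbitrary placeholder.

```python
from dataclasses import dataclass
from enum import Enum


class EntryClass(Enum):
    HOT = "hot"
    FRAGMENTED = "fragmented"
    USELESS = "useless"


@dataclass(frozen=True)
class FingerprintEntry:
    fingerprint: str   # fingerprint of the chunk
    container_id: int  # container the entry points to


def classify_entries(entries, utilization, upcoming_fingerprints, threshold=0.5):
    """Label each index entry as hot, fragmented, or useless.

    utilization           maps container id -> utilization (assumed given).
    upcoming_fingerprints is the set of fingerprints of forthcoming
                          backup chunks.
    threshold             is a hypothetical default, not a value from the paper.
    """
    # A container with utilization below the threshold is sparse.
    sparse = {cid for cid, u in utilization.items() if u < threshold}
    labels = {}
    for entry in entries:
        if entry.container_id not in sparse:
            # Entries pointing to non-sparse containers are hot.
            labels[entry.fingerprint] = EntryClass.HOT
        elif entry.fingerprint in upcoming_fingerprints:
            # Points to a sparse container but will be referenced again.
            labels[entry.fingerprint] = EntryClass.FRAGMENTED
        else:
            labels[entry.fingerprint] = EntryClass.USELESS
    return labels
```

For example, with containers `{1: 0.9, 2: 0.2}` and the default threshold, every entry pointing to container 1 is hot, while an entry pointing to container 2 is fragmented only if its fingerprint appears among the forthcoming backup chunks.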

This page is a summary of: Improving the Performance of Deduplication-Based Backup Systems via Container Utilization Based Hot Fingerprint Entry Distilling, ACM Transactions on Storage, November 2021, ACM (Association for Computing Machinery),
DOI: 10.1145/3459626.

Contributors

The following have contributed to this page

Prof. Yuhui Deng
Jinan University
