
D$^{3}$: A Dynamic Dual-Phase Deduplication Framework for Distributed Primary Storage

Authors :
Ying Li
Yan Tang
Jianwei Yin
Albert Y. Zomaya
Shuiguang Deng
Source :
IEEE Transactions on Computers. 67:193-207
Publication Year :
2018
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2018.

Abstract

Deploying deduplication for distributed primary storage is a sophisticated and challenging task, considering that the demands of low read/write latency, stable read/write performance, and efficient space saving are all of paramount importance. Unfortunately, existing schemes cannot satisfy all of these requirements simultaneously. In this article, we propose D$^{3}$, a dynamic dual-phase deduplication framework for distributed primary storage. Several major innovations are established in D$^{3}$. First, we formulate a deduplication-oriented taxonomy called Dedup-Type, to group data with similar deduplication-related characteristics into larger categories. It serves as a coarse-grained filter and as one of the prioritizing references in D$^{3}$. Second, D$^{3}$ is a dual-phase framework: inline-phase and offline-phase deduplication processes work in concert with each other. Third, D$^{3}$ operates in a dynamic manner. We design two critical mechanisms: context-aware threshold adjustment (CTA) for local inline-phase deduplication, and deferred priority-based enforcement (DPE) for global offline-phase deduplication. The CTA mechanism enables selective deduplication under a periodically updated threshold. Data skipped during the inline phase is regarded as a candidate for the offline phase, and is handled in a prioritized order under the governance of the DPE mechanism. Evaluation results demonstrate that, compared with conventional inline and offline deduplication schemes, D$^{3}$ achieves more efficient and more stable read/write performance with competitive space saving.
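The dual-phase flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the CTA threshold is computed or how DPE assigns priorities, so the class `DualPhaseStore`, the `dup_score` and `priority` arguments, the fixed `threshold`, and the `budget` parameter are all hypothetical stand-ins. The sketch only shows the general shape: selective inline deduplication under a threshold, with skipped data deferred to a prioritized offline pass.

```python
import hashlib
import heapq
from dataclasses import dataclass, field


@dataclass
class DualPhaseStore:
    # Hypothetical fixed threshold; the abstract says CTA updates it periodically,
    # but gives no update rule, so none is modeled here.
    threshold: float = 0.5
    index: dict = field(default_factory=dict)     # fingerprint -> unique chunk
    deferred: list = field(default_factory=list)  # heap of (-priority, seq, chunk)
    _seq: int = 0                                 # tie-breaker for equal priorities

    def write(self, chunk: bytes, dup_score: float, priority: int) -> str:
        """Inline phase: deduplicate only when the (assumed) score meets the threshold."""
        if dup_score >= self.threshold:
            fp = hashlib.sha256(chunk).hexdigest()
            if fp in self.index:
                return "deduplicated"
            self.index[fp] = chunk
            return "stored-unique"
        # Skipped inline: becomes a candidate for the offline phase.
        heapq.heappush(self.deferred, (-priority, self._seq, chunk))
        self._seq += 1
        return "deferred"

    def run_offline(self, budget: int) -> int:
        """Offline phase: handle up to `budget` deferred chunks, highest priority first.

        Returns the number of chunks found to be duplicates.
        """
        duplicates = 0
        for _ in range(min(budget, len(self.deferred))):
            _, _, chunk = heapq.heappop(self.deferred)
            fp = hashlib.sha256(chunk).hexdigest()
            if fp in self.index:
                duplicates += 1
            else:
                self.index[fp] = chunk
        return duplicates
```

In this sketch, a chunk written with a low `dup_score` bypasses fingerprint lookup on the write path, and `run_offline` later processes the deferred chunks in priority order within a bounded budget. Latency stays low on the write path while space savings are recovered later.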

Details

ISSN :
00189340
Volume :
67
Database :
OpenAIRE
Journal :
IEEE Transactions on Computers
Accession number :
edsair.doi...........6e5f6c0888e9fd81d6f53f344e6618b6