Reducing Fragmentation for In-line Deduplication Backup Storage via Exploiting Backup History and Cache Knowledge.

Authors :
Fu, Min
Feng, Dan
Hua, Yu
He, Xubin
Chen, Zuoning
Liu, Jingning
Xia, Wen
Huang, Fangting
Liu, Qing
Source :
IEEE Transactions on Parallel & Distributed Systems; Mar 2016, Vol. 27 Issue 3, p855-868, 14p
Publication Year :
2016

Abstract

In backup systems, the chunks of each backup are physically scattered after deduplication, which causes a challenging fragmentation problem. We observe that the fragmentation takes the form of sparse and out-of-order containers. Sparse containers decrease restore performance and garbage collection efficiency, while out-of-order containers decrease restore performance if the restore cache is small. To reduce the fragmentation, we propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter (CAF). HAR exploits historical information in backup systems to accurately identify and reduce sparse containers, and CAF exploits restore cache knowledge to identify the out-of-order containers that hurt restore performance. CAF efficiently complements HAR in datasets where out-of-order containers are dominant. To reduce the metadata overhead of garbage collection, we further propose a Container-Marker Algorithm (CMA) that identifies valid containers instead of valid chunks. Our extensive experimental results from real-world datasets show that HAR significantly improves restore performance by 2.84-175.36× at a cost of rewriting only 0.5-2.03 percent of the data. [ABSTRACT FROM PUBLISHER]

Details

Language :
English
ISSN :
1045-9219
Volume :
27
Issue :
3
Database :
Complementary Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
113070563
Full Text :
https://doi.org/10.1109/TPDS.2015.2410781