Using multi-threads to hide deduplication I/O latency with low synchronization overhead.

Authors :
Zhu, Rui
Qin, Lei-hua
Zhou, Jing-li
Zheng, Huan
Source :
Journal of Central South University; Jun2013, Vol. 20 Issue 6, p1582-1591, 10p
Publication Year :
2013

Abstract

Data deduplication, as a compression method, has been widely used in backup systems to improve bandwidth and space efficiency. As the volume of data to be backed up has exploded, two main challenges in data deduplication are the CPU-intensive chunking and hashing work and the I/O-intensive disk-index access latency. Because the CPU-intensive work has been largely parallelized and accelerated by multi-core and many-core processors, I/O latency is likely becoming the bottleneck in data deduplication. To alleviate I/O latency in multi-core systems, a multi-threaded deduplication (Multi-Dedup) architecture is proposed. The main idea of Multi-Dedup is to use parallel deduplication threads to hide the I/O latency. A prefix-based concurrent index is designed to maintain the internal consistency of the deduplication index with low synchronization overhead. In addition, a collisionless cache array is designed to preserve locality and similarity within the parallel threads. In experiments on various real-world datasets, Multi-Dedup achieves a 3-5 times performance improvement when combined with the locality-based ChunkStash and local-similarity-based SiLo methods. Multi-Dedup also dramatically decreases the synchronization overhead, achieving a 1.5-2 times performance improvement compared with traditional lock-based synchronization methods. [ABSTRACT FROM AUTHOR]
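
The abstract does not describe how the prefix-based concurrent index is built, so the following Python sketch is only an illustration of the general idea of partitioning a fingerprint index by prefix so that parallel deduplication threads rarely contend with each other. Everything in it is an assumption made for illustration: the names `PrefixShardedIndex` and `dedup`, the use of SHA-1 fingerprints, the 256 shards keyed by the first digest byte, and the per-shard locks (the paper's own index may well avoid locks entirely, since it is compared against lock-based methods).

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


class PrefixShardedIndex:
    """Hypothetical fingerprint index split into 256 shards by first digest byte."""

    def __init__(self):
        self._shards = [dict() for _ in range(256)]
        # Assumption: one lock per shard, so threads touching different
        # prefixes never wait on each other.
        self._locks = [threading.Lock() for _ in range(256)]

    def insert_if_absent(self, fingerprint: bytes, location: int) -> bool:
        """Return True if the fingerprint was new, False if it is a duplicate."""
        prefix = fingerprint[0]
        with self._locks[prefix]:
            shard = self._shards[prefix]
            if fingerprint in shard:
                return False
            shard[fingerprint] = location
            return True

    def __len__(self):
        return sum(len(shard) for shard in self._shards)


def dedup(chunks, workers=4):
    """Deduplicate chunks with parallel threads; return (unique, duplicates, index)."""
    index = PrefixShardedIndex()

    def process(item):
        position, chunk = item
        # Assumption: SHA-1 as the chunk fingerprint function.
        fingerprint = hashlib.sha1(chunk).digest()
        return index.insert_if_absent(fingerprint, position)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process, enumerate(chunks)))

    unique = sum(results)
    return unique, len(chunks) - unique, index
```

Calling `dedup(list_of_byte_chunks, workers=8)` returns the number of unique chunks, the number of duplicates, and the populated index. Because the check and the insert happen under the same shard lock, two threads that see the same fingerprint at the same time cannot both record it as new, which is the consistency property the abstract attributes to its index.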

Details

Language :
English
ISSN :
20952899
Volume :
20
Issue :
6
Database :
Complementary Index
Journal :
Journal of Central South University
Publication Type :
Academic Journal
Accession number :
87989350
Full Text :
https://doi.org/10.1007/s11771-013-1650-4