A smart hybrid content-defined chunking algorithm for data deduplication in cloud storage.

Authors :
Ellappan, Manogar
Murugappan, Abirami
Source :
Soft Computing - A Fusion of Foundations, Methodologies & Applications. Aug2024, Vol. 28 Issue 15/16, p9037-9052. 16p.
Publication Year :
2024

Abstract

The enormous growth of data requires a large amount of storage space on cloud servers, much of which is occupied by redundant data. Deduplication eliminates this redundancy so that cloud storage is used effectively. Chunking is the process of breaking data into chunks and identifying duplicates among them. Many chunking algorithms exist, but improving deduplication efficiency, reducing chunk variance, and lowering computational overhead remain challenging. To address these challenges, we propose a smart chunker (SC) algorithm, which performs hybrid chunking based on file size, combining file-level chunking and content-defined chunking (CDC). File-level chunking is applied only to files smaller than 2 KB; larger files are handled by CDC. Within CDC, we apply optimus prime chunking (OPC), which breaks chunks using prime numbers and involves a minimum number of comparisons, with fewer conditions and low computational overhead. This work reduces processing time by using neither a hash nor a window for computation. It thus provides a constant average chunk size, keeping chunk variance even across cloud storage, with a 17% reduction in chunking time. OPC attains a throughput more than 3.3× that of other CDC algorithms. [ABSTRACT FROM AUTHOR]
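
The size-based dispatch described in the abstract can be illustrated with a minimal Python sketch. Only the 2 KB threshold and the file-level/CDC split come from the abstract. The record does not describe how OPC selects boundaries, so `toy_cdc` below is a hypothetical stand-in (it cuts after a marker byte, subject to assumed minimum and maximum chunk sizes) and is not the authors' algorithm. The names `smart_chunk`, `toy_cdc`, and `deduplicate`, and the use of SHA-256 fingerprints to detect duplicate chunks, are likewise assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict, List

FILE_LEVEL_LIMIT = 2 * 1024  # 2 KB threshold stated in the abstract


def toy_cdc(data: bytes, min_size: int = 64, max_size: int = 1024,
            marker: int = 0) -> List[bytes]:
    """Hypothetical stand-in for a content-defined chunker (NOT OPC).

    Cuts after any byte equal to `marker` once the current chunk holds at
    least `min_size` bytes, and forces a cut at `max_size` bytes.
    """
    chunks: List[bytes] = []
    start = 0
    for i, byte in enumerate(data):
        length = i - start + 1
        if (byte == marker and length >= min_size) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes form the last chunk
    return chunks


def smart_chunk(data: bytes,
                cdc: Callable[[bytes], List[bytes]] = toy_cdc) -> List[bytes]:
    """Hybrid dispatch: whole-file chunk below 2 KB, CDC otherwise."""
    if len(data) < FILE_LEVEL_LIMIT:
        return [data]
    return cdc(data)


def deduplicate(chunks: List[bytes]) -> Dict[str, bytes]:
    """Keep one copy per distinct chunk, keyed by SHA-256 (an assumption)."""
    store: Dict[str, bytes] = {}
    for chunk in chunks:
        store.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)
    return store
```

Calling `smart_chunk` on a file's bytes and passing the result to `deduplicate` yields the set of unique chunks that would need to be stored. Any other boundary-selection function with the same signature can be passed as `cdc` in place of the toy chunker.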

Details

Language :
English
ISSN :
14327643
Volume :
28
Issue :
15/16
Database :
Academic Search Index
Journal :
Soft Computing - A Fusion of Foundations, Methodologies & Applications
Publication Type :
Academic Journal
Accession number :
179325710
Full Text :
https://doi.org/10.1007/s00500-023-09290-7