R&D: Design of Fast Delta Encoding for Delta Compression Based Storage Systems

Evaluation results driven by seven real-world datasets suggest that Gdelta achieves encoding/decoding speedups of 3.5X∼25X over classic Xdelta and Zdelta approaches while increasing compression ratio by about 10%∼240%

ACM Transactions on Storage has published an article written by Haoliang Tan and Wen Xia (Harbin Institute of Technology, Shenzhen, China, and Peng Cheng Laboratory, Shenzhen, China); Xiangyu Zou and Cai Deng (Harbin Institute of Technology, Shenzhen, China); and Qing Liao and Zhaoquan Gu (Harbin Institute of Technology, Shenzhen, China, and Peng Cheng Laboratory, Shenzhen, China).

Abstract: Delta encoding is a data reduction technique capable of calculating the differences (i.e., delta) among very similar files and chunks. It is widely used for various applications, such as synchronization replication, backup/archival storage, cache compression, and so on. However, delta encoding is computationally costly due to its time-consuming word-matching operations for delta calculation. Existing delta encoding approaches either run at a slow encoding speed, such as Xdelta and Zdelta, or at a low compression ratio, such as Ddelta and Edelta. In this article, we propose Gdelta, a fast delta encoding approach with a high compression ratio. The key idea behind Gdelta is the combined use of five techniques: (1) employing an improved Gear-based rolling hash to replace the Adler32 hash for fast scanning of overlapping words of similar chunks, (2) adopting quick array-based indexing for word-matching, (3) applying a sampling indexing scheme to reduce the cost of building full indexes for base chunks' words, as is traditionally done, (4) skipping unmatched words to accelerate delta encoding through non-redundant areas, and (5) last but not least, after word-matching, further batch compressing the remainder to improve the compression ratio. Our evaluation results driven by seven real-world datasets suggest that Gdelta achieves encoding/decoding speedups of 3.5X∼25X over the classic Xdelta and Zdelta approaches while increasing the compression ratio by about 10%∼240%.
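
To make the abstract's ideas more concrete, here is a minimal Python sketch of a delta encoder in the same spirit. It is an illustration under stated assumptions, not the authors' Gdelta implementation. The word size, shift width, index size, sampling step, instruction format and the choice of zlib for the remainder are all hypothetical. The sketch combines a Gear-style rolling hash, an array index over sampled base words, and batch compression of unmatched bytes. It leaves out the skipping of unmatched words.

```python
import random
import zlib

WORD = 16                  # bytes per hashed word (assumed)
SHIFT = 64 // WORD         # a byte leaves the 64-bit hash after WORD steps
MASK64 = (1 << 64) - 1
INDEX_BITS = 16            # array index with 2**16 slots (assumed)
SAMPLE = 4                 # index only every 4th base word (assumed)

_rng = random.Random(2024)
GEAR = [_rng.getrandbits(64) for _ in range(256)]  # random per-byte table


def roll(h, byte):
    """Gear-style update: shift the old hash, then add the table entry."""
    return ((h << SHIFT) + GEAR[byte]) & MASK64


def slot(h):
    """Use the top bits, which depend on every byte of the word."""
    return h >> (64 - INDEX_BITS)


def build_index(base):
    """Array index: slot -> start offset of a sampled base word, or -1."""
    index = [-1] * (1 << INDEX_BITS)
    h = 0
    for i, b in enumerate(base):
        h = roll(h, b)
        start = i - WORD + 1
        if start >= 0 and start % SAMPLE == 0:
            index[slot(h)] = start
    return index


def encode(base, target):
    """Return a list of ("copy", offset, length) and ("add", bytes) ops."""
    index = build_index(base)
    ops = []
    lit_start = 0          # first target byte not yet covered by an op
    h, filled, i = 0, 0, 0
    while i < len(target):
        h = roll(h, target[i])
        i += 1
        filled += 1
        if filled < WORD:
            continue
        start = i - WORD
        cand = index[slot(h)]
        # Verify bytes, since different words can share a slot.
        if cand >= 0 and base[cand:cand + WORD] == target[start:i]:
            length = WORD
            while (cand + length < len(base) and start + length < len(target)
                   and base[cand + length] == target[start + length]):
                length += 1  # extend the match forward
            if start > lit_start:
                ops.append(("add", bytes(target[lit_start:start])))
            ops.append(("copy", cand, length))
            lit_start = i = start + length
            h, filled = 0, 0  # restart the window after the match
    if lit_start < len(target):
        ops.append(("add", bytes(target[lit_start:])))
    return ops


def decode(base, ops):
    """Rebuild the target from the base chunk and the instruction list."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)


def compress_literals(ops):
    """Batch-compress all unmatched bytes together (zlib is an assumption)."""
    return zlib.compress(b"".join(op[1] for op in ops if op[0] == "add"))
```

A caller would run `ops = encode(base, target)` on two similar chunks and recover the target with `decode(base, ops)`. Sampling shrinks the index-building cost on the base side. The target is still hashed at every byte offset, so a sampled base word can be found at any alignment.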
