CDC: Why Decompression Is Worth the Complexity

(wael.nasreddine.com)

3 points | by kalbasit a month ago

1 comment

  • kalbasit a month ago

    While building a Nix cache server, I faced a classic system design dilemma: chunk the compressed data (fast/simple) or decompress first (slow/complex)?

    I tested 60k+ NAR files to find out.

    Compressed: 6.4% dedup hit rate
    Uncompressed: 47.8% dedup hit rate

    Decompression wins, saving 18% in total storage.
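
    A minimal sketch of why the gap can be that large, under stated assumptions: this is not the benchmark code and not the kalbasit/fastcdc API. It uses a simplified gear-hash chunker (not full FastCDC), gzip from the Go standard library as a stand-in for whatever compression the cache uses, and synthetic text instead of real NAR files. All names (chunk, dedupRatio, sample, gzipBytes) are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"math/rand"
)

const (
	minSize        = 64         // never cut a chunk shorter than this
	maxSize        = 1024       // always cut once a chunk reaches this
	mask    uint64 = 0xFF << 56 // cut when the top 8 hash bits are zero
)

// gearTable maps each byte value to a pseudo-random 64-bit value.
// The seed is fixed so chunk boundaries are reproducible.
var gearTable = func() [256]uint64 {
	var t [256]uint64
	r := rand.New(rand.NewSource(1))
	for i := range t {
		t[i] = r.Uint64()
	}
	return t
}()

// chunk splits data at content-defined boundaries using a gear rolling hash.
// Once a chunk is 64 bytes long, the hash depends only on the last 64 bytes,
// so boundaries can realign after bytes are inserted earlier in the stream.
func chunk(data []byte) [][]byte {
	var chunks [][]byte
	var h uint64
	start := 0
	for i, b := range data {
		h = (h << 1) + gearTable[b]
		n := i - start + 1
		if (n >= minSize && h&mask == 0) || n >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start, h = i+1, 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

// dedupRatio returns the fraction of bytes in b that sit in chunks
// whose SHA-256 digest also occurs among the chunks of a.
func dedupRatio(a, b []byte) float64 {
	seen := make(map[[32]byte]bool)
	for _, c := range chunk(a) {
		seen[sha256.Sum256(c)] = true
	}
	dup := 0
	for _, c := range chunk(b) {
		if seen[sha256.Sum256(c)] {
			dup += len(c)
		}
	}
	return float64(dup) / float64(len(b))
}

// sample returns n bytes of compressible pseudo-text (synthetic, not a NAR).
func sample(seed int64, n int) []byte {
	words := []string{"nix", "store", "derivation", "closure", "narinfo",
		"glibc", "openssl", "python", "/nix/store/", "lib", "bin", "share", "\n"}
	r := rand.New(rand.NewSource(seed))
	out := make([]byte, 0, n+16)
	for len(out) < n {
		out = append(out, words[r.Intn(len(words))]...)
		out = append(out, ' ')
	}
	return out[:n]
}

// gzipBytes compresses data in memory; writes to a bytes.Buffer cannot fail,
// so the returned errors are ignored in this sketch.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	w.Write(data)
	w.Close()
	return buf.Bytes()
}

func main() {
	v1 := sample(42, 1<<18)
	v2 := append([]byte("new header line\n"), v1...) // same content, shifted by a small prefix
	fmt.Printf("uncompressed: %.1f%% of bytes deduplicated\n", 100*dedupRatio(v1, v2))
	fmt.Printf("compressed:   %.1f%% of bytes deduplicated\n", 100*dedupRatio(gzipBytes(v1), gzipBytes(v2)))
}
```

    The two inputs differ only by a short prefix. On the raw bytes the chunker should realign shortly after the change, so most chunks match. A compressor's output after the change generally shares little with the earlier output, so the compressed streams are expected to deduplicate far less. The exact percentages depend on the data and the compressor.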

    (P.S. To handle the pipeline throughput, I also built the fastest FastCDC implementation in Go: https://github.com/kalbasit/fastcdc)