I've found that Asymmetric Numeral Systems (you mentioned it briefly) is the optimal practical method for pure entropy encoding. I just posted this https://news.ycombinator.com/item?id=47806122
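For anyone curious what the ANS state update actually looks like, here is a minimal sketch of the range variant (rANS) in Python. It is a toy under stated assumptions, not how production coders do it: it leans on Python's arbitrary-precision integers instead of renormalizing into a byte stream, it uses raw symbol counts as the frequency table, and the helper names (`build_tables`, `rans_encode`, `rans_decode`) are made up for illustration.

```python
from collections import Counter


def build_tables(data: bytes):
    """Return (freq, cum, total): symbol counts, cumulative starts, and their sum."""
    freq = Counter(data)
    cum = {}
    total = 0
    for sym in sorted(freq):
        cum[sym] = total          # first slot owned by this symbol
        total += freq[sym]
    return freq, cum, total


def rans_encode(data: bytes, freq, cum, total) -> int:
    """Fold the whole message into one big integer state."""
    x = 1  # arbitrary non-zero starting state
    # Encode back to front so that decoding emits symbols front to back.
    for s in reversed(data):
        f = freq[s]
        x = (x // f) * total + cum[s] + (x % f)
    return x


def rans_decode(x: int, n: int, freq, cum, total) -> bytes:
    """Recover n symbols from state x by inverting the encode step."""
    # Map every slot in [0, total) to the symbol that owns it.
    slot_to_sym = [0] * total
    for s, start in cum.items():
        for i in range(start, start + freq[s]):
            slot_to_sym[i] = s
    out = bytearray()
    for _ in range(n):
        slot = x % total
        s = slot_to_sym[slot]
        x = freq[s] * (x // total) + slot - cum[s]
        out.append(s)
    return bytes(out)


if __name__ == "__main__":
    msg = b"abracadabra" * 20
    freq, cum, total = build_tables(msg)
    state = rans_encode(msg, freq, cum, total)
    assert rans_decode(state, len(msg), freq, cum, total) == msg
    print(f"{len(msg) * 8} raw bits -> {state.bit_length()} bits of state")
```

Each encode step grows the state by roughly log2(total / freq[s]) bits, which is where the near-entropy size comes from. Real implementations keep the state in a machine word and spill low bits to the output as they go.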
For anyone who already has at least a surface-level understanding of compression and wants to take a deeper dive, check out Charles Bloom's blog:
http://cbloomrants.blogspot.com
Unfortunately, it has been dormant for some time, but there are years' worth of useful information there, and he is an uncommonly good presenter of technical knowledge through the written word.
My compression algo explorations are like font explorations. I spend a lot of time doing research and testing, but I (almost) always end up coming back to gzip / Arial.
One notable exception: for very large files (e.g. 10GB+ mbox archives), we found 7z compressed to 39% of the original size and gzip to 65%. 7z was about 10% faster as well.
zstd beats gzip on both speed and size, for every compression level.
If you need compatibility, then gzip (pigz), zip (7z), or bz2 (pbzip2) are the least bad options, but for Pareto-optimal speed and size you want zstd.
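If you want to check claims like these against your own data, a rough sketch using only the Python standard library is below. Assumptions to keep in mind: `zlib` stands in for gzip (both use DEFLATE), `lzma` stands in for 7z (which uses LZMA/LZMA2 by default), the levels and presets are arbitrary picks, and the mbox-like sample is synthetic, so the numbers it prints say nothing about the figures quoted in this thread. zstd is left out because it needs a third-party package on most Python versions.

```python
import bz2
import lzma
import time
import zlib


def compare(data: bytes) -> dict:
    """Return {codec: (compressed_size / original_size, seconds)} for each codec."""
    codecs = {
        "zlib": lambda d: zlib.compress(d, 6),          # DEFLATE, as used by gzip
        "bz2": lambda d: bz2.compress(d, 9),
        "lzma": lambda d: lzma.compress(d, preset=6),   # xz container, LZMA2
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (len(packed) / len(data), elapsed)
    return results


def synthetic_mbox(n_messages: int) -> bytes:
    """Build a made-up, highly repetitive mbox-like blob for demonstration only."""
    parts = []
    for i in range(n_messages):
        parts.append(
            f"From user{i % 50}@example.com\n"
            f"Subject: status report {i}\n\n"
            f"Nothing new to report for ticket {i * 7 % 1000}.\n\n"
        )
    return "".join(parts).encode("utf-8")


if __name__ == "__main__":
    sample = synthetic_mbox(20_000)
    for name, (ratio, seconds) in compare(sample).items():
        print(f"{name:5s} {ratio:6.1%} of original in {seconds:.3f}s")
```

To test real files, swap `synthetic_mbox(...)` for the bytes of whatever you actually need to compress. Results vary a lot with the input, which is why single numbers from someone else's corpus are hard to generalize.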
[Deleted]
It has DEFLATE code, Snappy code, LZ4 code, ZSTD exploration, and describes many involved sub-algorithms, with diagrams - what more were you wanting?