Sorted string tables (SST) from first principles

(bitsxpages.com)

62 points | by apurvamehta 4 days ago

10 comments

  • craftkiller 13 hours ago

    The diagrams on this page are stunning! My only complaint is that leaving the close/maximize/minimize buttons in the top left was unnecessary, but this is the kind of clarity I always strive for (and fail to achieve) every time I make diagrams.

    Did you use a tool to create them, and if so, what is that tool?

    • agavra 7 hours ago

      Thanks! I use https://monodraw.helftone.com/ which is my favorite one-time-purchase software of all time. I definitely agree the buttons on the top left are unnecessary but ... it's cute and it makes me happy so I can't help it. Maybe I'll come up with a different style for the next blog

  • epistasis 11 hours ago

    > There are several ways to organize immutable data durably that meet these requirements, the simplest of which is an append-only log.

    This is also a fairly good way to handle large amounts of data with maximum performance on spinning rust, and at the heart of systems like Kafka.

    I had assumed that the story would be very different with SSDs, so it's surprising to see append only logs show up again.

    • agavra 7 hours ago

      We haven't even started to discuss Object Storage, but it ends up looking very, very similar if you're building data systems on top of that instead of raw filesystems (not so much for physics reasons, but because object storage requires immutable objects and penalizes you for making many API calls).

  • agavra 7 hours ago

    So cool to see this make the front page of hacker news! I'm the author, I'll be online here throughout the weekend to answer any questions you might have :) excited for the next post which is in the works about LSM trees.

  • mac3n 10 hours ago

    see https://gitlab.com/mac3n/ksip - binary search on mmap'd sorted text files, no index needed

    • agavra 7 hours ago

      this is pretty different but reminds me of https://en.wikipedia.org/wiki/Bitcask - if you're storing it all in memory why not just use a hash index?

      • mac3n 5 hours ago

        The trick is, it's not all in memory - it's a memory-mapped file. If you look at the cache (with `fincore` or similar) you'll see that the binary search only loads the pages it examines, roughly logarithmic in the file size.

        And a text file is the most useful general format - easy to write, easy to process with standard tools.

        I've used this in the past on data sets of hundreds of millions of lines, maybe billions.

        It's also true that you could use a memory-mapped indexed file for faster searches - I've used sqlite for this.
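A minimal sketch of the idea described above - binary search over a memory-mapped, sorted, newline-delimited text file, so only the pages the search touches get faulted in. This is a hypothetical illustration, not the ksip code:

```python
import mmap

def mmap_bsearch(path, key):
    """Binary-search a sorted, newline-delimited file of byte-string keys.

    Each probe reads only the page(s) containing one line, so the number
    of pages touched is roughly logarithmic in the file size.
    """
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        lo, hi = 0, len(mm)
        while lo < hi:
            mid = (lo + hi) // 2
            # Back up from `mid` to the start of the line containing it.
            nl = mm.rfind(b"\n", lo, mid)
            start = nl + 1 if nl != -1 else lo
            end = mm.find(b"\n", mid)
            if end == -1:
                end = len(mm)
            line = mm[start:end]
            if line == key:
                return line.decode()
            if line < key:
                lo = end + 1   # search the lines after this one
            else:
                hi = start     # search the lines before this one
        return None
```

Each iteration lands on an arbitrary byte offset and scans backward/forward to line boundaries, which is why no index is needed - the sorted order of the lines is the index.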

    • mac3n 10 hours ago

      what this could really use is a compression format that compresses variable amount of text into fixed-size blocks. with that, it could binary-search compressed text

      • agavra 6 hours ago

        RocksDB actually does something somewhat similar with its prefix compression. It prefix-compresses keys and then "resets" the prefix compression every N records, storing a mapping of reset point -> offset so you can skip across compressed records. It's pretty neat
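A toy sketch of that scheme - prefix-compress a sorted key list, store the full key at every Nth entry, and record those restart indices so a reader can binary-search the restarts and decode at most one interval. This illustrates the idea only, not RocksDB's actual block format:

```python
def prefix_compress(keys, restart_interval=4):
    """Encode sorted byte-string keys as (shared_prefix_len, suffix) pairs.

    Every `restart_interval` entries the shared length is reset to 0 (the
    full key is stored) and the entry index is recorded as a restart point.
    """
    entries, restarts = [], []
    prev = b""
    for i, key in enumerate(keys):
        if i % restart_interval == 0:
            restarts.append(i)  # full key stored here; seekable boundary
            shared = 0
        else:
            shared = 0
            for a, b in zip(prev, key):
                if a != b:
                    break
                shared += 1
        entries.append((shared, key[shared:]))
        prev = key
    return entries, restarts

def decompress(entries):
    """Rebuild the original keys by reapplying each shared prefix."""
    keys, prev = [], b""
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys
```

Because decoding an entry needs the previous key, a reader can only start at a restart point - which is exactly why the restart index makes the compressed block searchable instead of scan-only.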