The CRDT Dictionary: A Field Guide to Conflict-Free Replicated Data Types

(iankduncan.com)

103 points | by birdculture 8 hours ago

7 comments

btown 2 hours ago

One of the most interesting things to me about CRDTs, and something that a skim of the article (with its focus on low-level CRDTs) might give the wrong impression about, is that things like https://automerge.org/ are not just "libraries" that "throw together" low-level CRDTs. They are themselves full CRDTs, with strong proofs about their characteristics under stress.

Per the Automerge website:

> We are driven to build high performance, reliable software you can bet your project on. We develop rigorous academic proofs of our designs using theorem proving tools like Isabelle, and implement them using cutting edge performance techniques adopted from the database world. Our standard is to be both fast and correct.

While the time and storage-space performance of these new-generation CRDTs may not be ideal for all projects, their convergence characteristics are formalized, proven, and predictable.

If you're building a SaaS that benefits from team members editing structured and unstructured data, and seeing each other's changes in real time (as one would expect of Notion or Figma), you can reach for CRDTs that give you actionable "collaborative deep data structures" today, without understanding the entire history of the space that the article walks through. All you need for the backend is key-value storage with range/prefix queries; all you need for the frontend is a library and a dream.
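To make the "key-value storage with range/prefix queries" point concrete, here is a minimal sketch of how a backend might persist opaque CRDT change blobs under a per-document key prefix and load them back with a single prefix scan. Everything in it is an assumption for illustration: `PrefixKV` is a toy in-memory store, the `doc/<doc_id>/<actor>/<seq>` key layout and the helper names are hypothetical, and none of this is Automerge's actual storage format or API.

```python
import bisect


class PrefixKV:
    """Toy in-memory ordered key-value store with prefix scans (hypothetical)."""

    def __init__(self):
        self._keys = []  # keys kept in sorted order
        self._data = {}  # key -> value

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def scan_prefix(self, prefix):
        # Keys that share a prefix are contiguous in sorted order,
        # so start at the first candidate and stop at the first mismatch.
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._data[key]


def change_key(doc_id, actor, seq):
    # Hypothetical layout. Zero-padding makes lexicographic order
    # match numeric order for the sequence number.
    return f"doc/{doc_id}/{actor}/{seq:010d}"


def append_change(store, doc_id, actor, seq, change_bytes):
    # Changes are treated as opaque blobs produced by the CRDT library.
    store.put(change_key(doc_id, actor, seq), change_bytes)


def load_changes(store, doc_id):
    # One prefix scan returns every stored change for the document.
    return [value for _, value in store.scan_prefix(f"doc/{doc_id}/")]
```

In a design like this the server never needs to interpret the changes; it only stores and returns them, and the client library does the merging.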

michelpp 6 minutes ago

Automerge is an excellent library, with a great API, not just in Rust, but also JavaScript and C.

> All you need for the backend is key-value storage with range/prefix queries;

This is true. I was able to quickly put together a Redis Automerge library that supports the full API, including pub/sub of changes to subscribers for a full persistent sync server [0]. I was surprised how quickly it came together. Using some LLM assistance (I'm not a frontend specialist), I was also able to build a usable web demo of synchronized documents across multiple browsers using the Webdis [1] websocket support over pub/sub channels.

[0] https://github.com/michelp/redis-automerge

[1] https://webd.is/

tbrownaw 25 minutes ago

What this calls OR-Set looks equivalent to what Monotone uses (used? It's kinda mostly dead now) for merging scalar values (e.g. names, content hashes) since 2005.

The best current page I can find is https://tonyg.github.io/revctrl.org/MarkMerge.html . Boo link rot.
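For readers who have not seen one, here is a minimal sketch of an OR-Set (observed-remove set) as it is commonly described: each add gets a unique tag, and a remove only tombstones the tags the removing replica has already seen. This is an illustrative assumption about the general technique, not the article's code and not Monotone's mark-merge algorithm; the `ORSet` class and its method names are made up for the example.

```python
import uuid


class ORSet:
    """Minimal state-based OR-Set sketch (illustrative, unoptimized)."""

    def __init__(self):
        self.adds = {}        # element -> set of unique add tags
        self.removed = set()  # tombstoned tags

    def add(self, element):
        # Every add is tagged uniquely so later removes can target it.
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element):
        # Only tags observed by this replica are tombstoned.
        self.removed |= self.adds.get(element, set())

    def __contains__(self, element):
        # Present if at least one add tag has not been tombstoned.
        return bool(self.adds.get(element, set()) - self.removed)

    def elements(self):
        return {e for e, tags in self.adds.items() if tags - self.removed}

    def merge(self, other):
        # Union of add tags and union of tombstones; returns a new replica.
        merged = ORSet()
        for src in (self, other):
            for element, tags in src.adds.items():
                merged.adds.setdefault(element, set()).update(tags)
            merged.removed |= src.removed
        return merged
```

Because a concurrent add carries a tag the remover never observed, the element survives the merge, which is the "add wins" behavior usually associated with OR-Sets.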

rdtsc 3 hours ago

That's a great summary of CRDTs, starting from the basics and working up to the more advanced ones.

Speaking of Riak, it's still around, in the form of https://github.com/OpenRiak!

fellowniusmonk 2 hours ago

CRDTs are something you still have to write by hand. I finished creating a custom sequence-based CRDT engine about 2 months ago (inspired by diamond types) and it was hilarious to ask AI for assistance.

It's interesting when you are working on something that:

1. Is essentially a logic problem.

2. That LLMs aren't trained on.

3. That can have dense character sequences when testing.

4. To see how completely useless an LLM is outside of pre-trained areas.

There needs to be some black-box test based on pure but niche logic to see if an LLM is capable of understanding, or even noticing, exposure to new logics.

canadiantim an hour ago

What about just using something like Loro?

fellowniusmonk an hour ago

I love Loro and it's probably my favorite open source project (you can see me refer to it as such in my comment history), but I have a very specific multi-CRDT and search indexing architecture that precluded me from using it.