Building Document-Centric, CRDT-Native Editors

(blocksuite.io)

69 points | by rapnie a day ago

17 comments

  • blixt 2 hours ago

    I'm completely into the idea that your data should be document-centric, this I've seen working at a very large scale (i.e. 100+ MB JSON blobs being updated at 60 FPS with a large team of people editing at the same time) with the structure we built at Framer.

    What I'm still not sure about is how superior CRDTs are if you're building a centralized service. CRDTs can be great for offline mode or syncing data across long time periods, but not necessarily for an online and realtime multiplayer experience. I know they work just fine in such a situation, but they do come with additional cost in logical complexity, memory, and compute.

    Does anyone have experience with doing first "old school" sequential data patching (which would need some additional work to support undo and synchronization with clients that are trying to make simultaneous changes) and then switching to CRDT? Last time I tried it was too costly in every way, so I'm curious.
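
    For reference, a minimal sketch of what "old school" sequential patching on a central server could look like. All names here (`Patch`, `PatchServer`, `submit`) are hypothetical and this is not Framer's actual structure; it only assumes that the server applies patches in arrival order and rejects a patch whose path overlaps something accepted after the client's base revision.

```typescript
// Hypothetical sketch of server-ordered patching; not any product's real implementation.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface Patch {
  baseRevision: number;    // revision the client had seen when it made the edit
  path: string[];          // keys from the document root to the target
  value: Json | undefined; // undefined means "delete this key"
}

// Two paths overlap when one is a prefix of the other.
function overlaps(a: string[], b: string[]): boolean {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) if (a[i] !== b[i]) return false;
  return true;
}

function applyPatch(doc: { [key: string]: Json }, patch: Patch): void {
  let node = doc;
  for (const key of patch.path.slice(0, -1)) {
    const child = node[key];
    if (typeof child !== "object" || child === null || Array.isArray(child)) {
      node[key] = {}; // create (or overwrite) intermediate objects
    }
    node = node[key] as { [key: string]: Json };
  }
  const last = patch.path[patch.path.length - 1] as string;
  if (patch.value === undefined) delete node[last];
  else node[last] = patch.value;
}

class PatchServer {
  private doc: { [key: string]: Json } = {};
  private log: Patch[] = []; // accepted patches; log.length is the current revision

  get revision(): number { return this.log.length; }
  snapshot(): Json { return JSON.parse(JSON.stringify(this.doc)); }

  // Accept patches strictly in arrival order; reject stale, overlapping ones.
  submit(patch: Patch): { ok: boolean; revision: number } {
    if (patch.path.length === 0) throw new Error("path must not be empty");
    const conflicting = this.log
      .slice(patch.baseRevision)
      .some(p => overlaps(p.path, patch.path));
    if (conflicting) return { ok: false, revision: this.revision };
    applyPatch(this.doc, patch);
    this.log.push(patch);
    return { ok: true, revision: this.revision };
  }

  // Patches a client at `fromRevision` must replay to catch up.
  since(fromRevision: number): Patch[] { return this.log.slice(fromRevision); }
}

const server = new PatchServer();
const r1 = server.submit({ baseRevision: 0, path: ["title"], value: "Draft" });
// Three edits all made against revision 1:
const r2 = server.submit({ baseRevision: 1, path: ["body", "text"], value: "Hello" });
const r3 = server.submit({ baseRevision: 1, path: ["title"], value: "Final" });
const r4 = server.submit({ baseRevision: 1, path: ["body"], value: undefined });
console.log(r4.ok, server.revision, JSON.stringify(server.snapshot()));
```

    In this sketch the last submission is rejected because it would delete `body` without having seen the concurrent edit to `body.text`; the client would replay `since(1)` and retry. Undo and smarter array handling are exactly the "additional work" left out here.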

    • dartos 2 hours ago

      Specifically, Yjs provides a great experience for building CRDT editors.

      I think it’s a big selling point.

      I recently built a centralized, multiplayer editor as a POC.

      I did one iteration with hand-rolled operational transforms (what Google Docs uses) and one with Yjs.

      Yjs just kind of works, whereas OTs require me to handle different kinds of data in bespoke ways, and resolving conflicts is much more difficult due to the manual, one-off nature of OT-based systems.

      Yjs’s CRDT approach is much simpler to get going with and, imo, CRDTs are easier to reason about than OTs.
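
      To illustrate the "easier to reason about" point, here is a minimal sketch of a state-based CRDT: a last-writer-wins map. This is an assumption-laden toy (the names `Entry`, `LwwMap`, `setKey`, and `merge` are made up) and is far simpler than what Yjs actually does for text, but it shows the core property: replicas merge in any order and still converge, with no per-type transform functions.

```typescript
// Toy last-writer-wins map; hypothetical names, not the Yjs data model.
interface Entry {
  value: string;
  clock: number;   // per-key logical clock of the write
  replica: string; // id of the replica that wrote it (tie-breaker)
}
type LwwMap = Record<string, Entry>;

// Total order on writes: higher clock wins, replica id breaks ties.
function newer(a: Entry, b: Entry): boolean {
  return a.clock !== b.clock ? a.clock > b.clock : a.replica > b.replica;
}

// Local write: bump the key's clock past whatever this replica has seen.
function setKey(state: LwwMap, replica: string, key: string, value: string): LwwMap {
  const prev = state[key];
  const clock = (prev ? prev.clock : 0) + 1;
  return { ...state, [key]: { value, clock, replica } };
}

// Merge keeps the newer entry per key. It is commutative, associative, and idempotent.
function merge(a: LwwMap, b: LwwMap): LwwMap {
  const out: LwwMap = { ...a };
  for (const [key, entry] of Object.entries(b)) {
    const mine = out[key];
    if (!mine || newer(entry, mine)) out[key] = entry;
  }
  return out;
}

// Both replicas start from the same state, then edit concurrently.
const base = setKey({}, "alice", "title", "Draft");
const alice = setKey(base, "alice", "title", "Alice's title");
const bob = setKey(setKey(base, "bob", "title", "Bob's title"), "bob", "color", "blue");

const ab = merge(alice, bob);
const ba = merge(bob, alice);
console.log(ab["title"].value === ba["title"].value, ab["color"].value);
```

      The conflict on `title` is resolved the same way regardless of merge order, which is the whole trick; the cost is that every value carries extra metadata.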

      Yjs’s implementation of CRDTs is quite a bit more space efficient than other implementations I’ve seen.

      I haven’t dealt with either in a production scenario yet, but I’m going to be moving forward with Yjs CRDTs. There are native bindings as well.

      • blixt 2 hours ago

        Yeah I think CRDT or OT can work well for multiplayer rich text editing for example, but then I would scope the domain to just that in such a way that you still keep the bulk of your document "dumb", e.g. as JSON patches on a big JSON object, maybe with some special logic for nicer array insertion/deletion.

        > OTs require me to handle different kinds of data in bespoke ways

        I'll admit I haven't tried to implement a general OT solution, but CRDTs are also extremely complex functions on the base building blocks of your data. I have tried Yjs and it does indeed hide away a lot of this complexity for you, but I would be surprised if there isn't an equivalent OT library (a quick Google turns up https://www.npmjs.com/package/@otjs/state-machine). Furthermore, CRDTs use quite a lot of memory as your project scales up, so I'd keep an eye on that.

        I'd be curious to see more of your project even if it's still in progress, is there already something live?

  • PaulRobinson 4 hours ago

    Document-centric workflows were once the great promise of the future, and inspired Microsoft's OLE, Apple's "Publish and Subscribe", and then OpenDoc.

    I remember reading a computing magazine in the early 1990s that promised a future where we would decompose applications and the OS would only really worry about a file, and you would bring functionality to the file.

    You would in essence be able to build your own perfect word processing environment (for example), by bringing Company X's editing tools, Company Y's grammar checking and spelling tools, perhaps some embedded spreadsheet tables from Company Z if you were writing business reports, and so on.

    We kind of have this a little today with browser extensions, in that we can extend functionality onto a webpage we're viewing, but our environments are still very application-centric and not workflow or content-centric at all.

    This article shows an application that _might_ be interesting (and the CRDT is a mandatory requirement in today's environment), but while the OSes we use require us to do this sort of work in a windowed application, it won't quite appeal to me as having the full potential.

    I often think back to that article as it made me quite excited about the future of user interfaces and how operating systems could support workflows tailored to the individual and the task they wanted to achieve. This was all in a time when we had moderately novel ideas in OSes popping up (Windows NT, OS/2 Warp, NeXT, etc.), and just before the web was starting to get popular.

    • hprotagonist 31 minutes ago

      > You would in essence be able to build your own perfect word processing environment (for example), by bringing Company X's editing tools, Company Y's grammar checking and spelling tools, perhaps some embedded spreadsheet tables from Company Z if you were writing business reports, and so on.

      Tree-sitter, LSP, various linters -- we're getting there!

    • jwells89 2 hours ago

      In my mind, QuickLook on macOS is probably the closest thing we have to document-centrism in a modern mainstream OS, with how the OS discovers QuickLook extensions in app bundles and uses them to make QuickLook capable of understanding more document formats.

      The only problem is that it’s read-only. If QuickLook extensions could provide write capabilities too, you’d have 90% of a document centric setup.

    • dangom 43 minutes ago

      I think the content-centric model you are describing has been alive and thriving since at least the late 70s in Emacs.

    • api 3 hours ago

      The main reason we don't have a future like this isn't technical. It's that there's no business model here. Developers want apps that jail everything inside to force people to buy or subscribe to the app. Interoperability means no moat, and decomposing apps entirely means no products at all, just a toolbox that nobody can really control. So far nobody has ever found a way to reliably monetize this kind of software landscape.

      No business model means no long-term maintenance, no polish, no marketing, no attention to detail, no usability iterations, and so on. Quality software that is easy to use (especially for non-technical users) is extremely expensive. Developers are expensive. So instead you get "WIMPs" (Weakly Interacting Massive Programs) that win in the marketplace because they are more polished, more maintained, and more supported. (Because people pay for them.) Lately most of these WIMPs are in the cloud.

      The more I've matured as a developer the more I've realized that business models are the tail that wags the dog and that a lot of the landscape of software is defined by what people will pay for and/or what is structured so as to make people pay for it.

      Business models are also why things are increasingly centralized and in the cloud. It's not because it's inherently better, though it does make certain things easier to implement. In many cases it's a lot worse: higher latency, not available offline, much more limited and slower UI, etc. It's because the cloud is DRM and makes it easy to force a subscription. I mean look at Figma... there is no fundamental reason that had to be in the cloud except that it provides a ready-made subscription model that users can't evade.

      • enugu 3 hours ago

        This is something I think about, and there are some underlying technical issues as well, in coordinating all the plugins with common interfaces. We do not have workflow-centric software even in the open source world, which has managed to build a large family of apps in the usual mode.

        Or maybe we have done it in the past (interactive software in Smalltalk), but have forgotten about it.

        Also, the business reasons are not prohibitive: if a lot of users use the workflow model, there can be a store where they can request, or raise funds for, plugins with a specific functionality. Developers won't ignore it, even if the moat is weak, as there is potential revenue. It would be like consultants providing solutions rather than selling a product.

        • api 2 hours ago

          In the 1980s and 1990s there was a ton of very interesting work done on deeply thought out user-centric software designed to augment human intelligence and give people maximum control over what they were doing while still being approachable. The Smalltalk stuff was some of it, but there was some pretty spectacular stuff back in the old Windows 3.x, classic Mac OS, and even MS-DOS days where apps would interact richly and you had document-centric customizable workflows. You even had things like (gasp) composability of applications in GUIs.

          I mean look at this stuff you could do on a machine with 256KiB of RAM and an 8086: https://www.youtube.com/watch?v=KMUT9TEoe4Y&pp=ygUYbXMtZG9zI...

          All of this was completely abandoned and forgotten because there's no money in it. Make software like that and there's no moat, and make it local and people will pirate it. Lock it down in the cloud and lock down the data and people will pay you.

          That which gets funded gets built. We get shit because we pay for shit. People won't pay for good software because the flexibility and user-centrism of good software allows them not to.

          • enugu 11 minutes ago

            This is a very cool video. I am interested in making a dynamic notes app with user-defined types and customizable flows. This is partly done already in Roam, AnyType... if you include the plugin system. But it seems like there is a lot of history.

            Nevertheless, I am more optimistic about such a product. The key issue is building composable interfaces to which plugins can glue in.

            Piracy is not such a big problem as one can give the product free of charge to users and charge for hosting or charge commercial businesses for whom piracy is not a risk worth taking.

            If anything, the business implication of a ubiquitous workflow interaction is its threat to the ads model (users don't visit social media sites but read in data via APIs). The ad model might need to be replaced by something like micropayments to content creators, but that is a distant issue.

  • Rygian 6 hours ago

    I was very excited about the "Document-Centric" mention in the title, but the article does not really deliver in that regard (other than indicating that separating content from editor logic is important).

    What I expected instead: a way to consider the document as the first-class citizen in people's workflows (especially local-first flows, since CRDT is mentioned), which is a big challenge considering that all filesystems are still file-oriented (i.e., if you want two versions of a document, you end up with two files, which then become two documents since there's nothing binding them).

  • tlarkworthy 4 hours ago

    I think we already standardized on the file as the unit of document exchange. You can make a CRDT interface to that file if you want, but the CRDT should not be the authority. There are too many competing implementations, what is best is too context-specific, and we already have the file as the standard everywhere, so it's never gonna catch on.

  • zknill 6 hours ago

    By putting all the state into a single CRDT then yes, you're going to get conflict resolution across the entire document. But you might also want to think about how big that CRDT state is going to get.

    There's no splitting or isolation available here, you need to load the entire Y.Doc to get any of the content of the page. The content of the Y.Doc is opaque bytes, and if you dump it to some other representation (like JSON) then you lose the internal Y.Doc counters that make conflict resolution work.
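
    The "lose the internal counters" point can be shown with a much simpler CRDT than a Y.Doc. This is only an analogy under stated assumptions: a grow-only counter with hypothetical names (`GCounter`, `increment`, `mergeCounters`, `total`), not Yjs's actual encoding. Once the state is flattened to its plain value, there is no correct way left to merge two copies.

```typescript
// Toy grow-only counter; an analogy for CRDT metadata, not the Y.Doc format.
type GCounter = Record<string, number>; // replica id -> increments made by that replica

function increment(c: GCounter, replica: string): GCounter {
  return { ...c, [replica]: (c[replica] ?? 0) + 1 };
}

// Merge takes the per-replica maximum, so shared history is not counted twice.
function mergeCounters(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const [replica, n] of Object.entries(b)) {
    out[replica] = Math.max(out[replica] ?? 0, n);
  }
  return out;
}

// The "plain JSON" view: just the number, with per-replica counts dropped.
function total(c: GCounter): number {
  return Object.values(c).reduce((sum, n) => sum + n, 0);
}

// Shared history: one increment each from replicas "a" and "b".
const shared = increment(increment({}, "a"), "b");
// Then the copies diverge.
const copyA = increment(shared, "a");                 // a: 2, b: 1
const copyB = increment(increment(shared, "b"), "b"); // a: 1, b: 3

const merged = total(mergeCounters(copyA, copyB));    // uses the metadata
const plainA = total(copyA);
const plainB = total(copyB);
const guessMax = Math.max(plainA, plainB);            // undercounts
const guessSum = plainA + plainB;                     // double-counts shared history
console.log({ merged, guessMax, guessSum });
```

    With the per-replica counts the merge yields 5; from the flattened values 3 and 4 alone, neither taking the maximum (4) nor adding them (7) recovers it.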

    • jitl 5 hours ago

      Yjs has a “subdocument” facility for dividing up content for lazy loading, although I’m not exactly sure if or how cross-doc transactions would work.

      https://docs.yjs.dev/api/subdocuments

      If you compare a Y.Doc for a very large page with many items to a web API response, maybe it seems like it could be a bit big, but traditional document files in an editor like Pages are frequently several megabytes. It really doesn’t seem spooky to me in a truly local-first app where you will delta sync.

  • canadiantim 39 minutes ago

    very impressive, will be working with this going forward for sure. Big kudos to affine and the team behind this!

  • gerardnico 3 hours ago

    Slate is already document-centric, man. There is also already a CRDT plugin. https://docs.slatejs.org/walkthroughs/07-enabling-collaborat...

    But yeah …