Cool to see a write up on this! Discovered the same method years ago and also wondered why it doesn’t show up in academic literature.
I implemented this in a decentralized context and as a CRDT though, so that the usual CRDT properties hold (the merge is commutative, idempotent, and associative).
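For anyone unfamiliar with those three properties, here is a minimal sketch using a grow-only set, which is about the simplest CRDT there is. It is only an illustration of the properties, not the implementation described above; `merge` and the `op1`/`op2`/`op3` values are made-up names.

```python
def merge(a: frozenset, b: frozenset) -> frozenset:
    """Merge two replicas of a grow-only set by taking their union."""
    return a | b


# Three replicas that have each seen a different (hypothetical) operation.
x = frozenset({"op1"})
y = frozenset({"op2"})
z = frozenset({"op3"})

# Order of merging does not matter.
commutative = merge(x, y) == merge(y, x)
# Merging the same state twice changes nothing.
idempotent = merge(x, x) == x
# Grouping of merges does not matter.
associative = merge(merge(x, y), z) == merge(x, merge(y, z))

print(commutative, idempotent, associative)
```

Because set union has all three properties, replicas can exchange state in any order, any number of times, and still converge.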
Surprised to see no discussion of other data structures like dicts/maps, or arrays of arbitrary type. Hopefully they'd be a straightforward extension. IME, apps need collaborative data structures more often than they need pure collaborative text editing.
The motivating examples (update validation, partial loading, higher-level operations) are interesting, but I don't see a strong case that the reason Yjs etc. lack these features is the underlying CRDT implementation, as opposed to these features being intrinsically difficult to build.
Ok, so the main point that makes it different from CRDTs seems to be: if you have a central server, let the server do the synchronization (fixing an order among concurrent events), and not the data structure itself via an a-priori order.
Because all communication is between client and server, and never between clients, when the client connects to the server, the server can make sure that it first processes all of the client's local operations before sending it new remote updates.
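A rough sketch of that handshake, assuming the server keeps a single append-only log and each client remembers how many log entries it has already seen. The `Server` class, `sync` method, and `last_seen` parameter are hypothetical names for illustration and are not taken from the article.

```python
class Server:
    """Toy server: the order of the log is the order of the document history."""

    def __init__(self):
        self.log = []

    def sync(self, client_ops, last_seen):
        # First process everything the client did locally...
        self.log.extend(client_ops)
        # ...then reply with every log entry the client has not seen yet.
        # The reply includes the client's own operations, now in server order.
        return self.log[last_seen:]


server = Server()
reply_a = server.sync(["a1"], last_seen=0)
reply_b = server.sync(["b1"], last_seen=0)
reply_a2 = server.sync(["a2"], last_seen=1)
print(reply_a, reply_b, reply_a2)
```

The operations here are just strings; a real server would apply them to the document as it appends them.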
Is the take-home message of the post that the full complexity of CRDTs/OT is necessary only in the absence of a central server?
I'm not an expert on this, but the main difference with a CRDT like Automerge seems to be the server reconciliation. See for example this article [1]. Automerge handles concurrent insertions by using a sequence number and relying on an agreed ordering of agent ids when insertions are concurrent, while this scheme relies on the server to handle them in the order they come in.
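To make the contrast concrete, here is a toy comparison of the two tie-breaking strategies for inserts that target the same position. This is not Automerge's actual code, and the sort direction is an arbitrary choice for the example; the `(counter, agent, char)` tuples and both function names are assumptions.

```python
def order_by_id(inserts):
    """CRDT-style: sort concurrent inserts by (counter, agent id).

    The result is the same no matter which order the inserts arrive in.
    """
    return sorted(inserts, key=lambda op: (op[0], op[1]))


def order_by_arrival(inserts):
    """Server-style: each insert is placed right after the shared target,

    so the most recent arrival ends up first.
    """
    result = []
    for op in inserts:
        result.insert(0, op)
    return result


a = (5, "alice", "A")
b = (5, "bob", "B")

# Same result for either arrival order.
print(order_by_id([a, b]) == order_by_id([b, a]))
# Result depends on which insert reached the server first.
print(order_by_arrival([a, b]) == order_by_arrival([b, a]))
```

In the first case every replica can compute the order on its own; in the second, only the server's arrival order decides it.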
The article mentions this:
> This contrasts with text-editing CRDTs, in which IDs are ordered for you by a fancy algorithm. That ordering algorithm is what differs between the numerous text-editing CRDTs, and it’s the complicated part of any CRDT paper; we get to avoid it entirely.
I can buy the idea that many apps have a central server anyway, so you can avoid the "fancy algorithm", though the server reconciliation requires undoing and replaying of local edits, so it's not 100% clear to me if that's much simpler.
so, an unoptimized crdt? just set max set size to 1 and yolo?
Use of server reconciliation makes me think client-side reconciliation would be tricky… how do you preserve smooth editor UX while applying server updates as they arrive?
For example, if your client-sent request to insert a character fails, do you just retry the request? What if an update arrived in the intervening time? (Edit: they acknowledge this case in the “Client-Side” section, the proposal is to rewind and replay, and a simpler proposal to block until the pending queue is flushed)
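My reading of rewind-and-replay, as a minimal sketch. It assumes the server echoes every operation (including the client's own) to the client exactly once and in server order; `apply_op`, `Client`, and the tuple layout of an operation are names I made up for the example.

```python
def apply_op(state, op):
    """Apply ('insert', after_id, new_id, char) to a list of (id, char) pairs."""
    _, after_id, new_id, char = op
    index = 0
    if after_id is not None:
        index = [cid for cid, _ in state].index(after_id) + 1
    state.insert(index, (new_id, char))


class Client:
    def __init__(self):
        self.confirmed = []  # state as of the last server operation
        self.pending = []    # local operations the server has not echoed yet
        self.view = []       # what the user sees: confirmed + pending

    def local_edit(self, op):
        self.pending.append(op)
        apply_op(self.view, op)

    def on_server_op(self, op):
        if self.pending and op == self.pending[0]:
            self.pending.pop(0)  # the server echoed our oldest pending op
        apply_op(self.confirmed, op)
        # Rewind to the confirmed state, then replay what is still pending.
        self.view = list(self.confirmed)
        for p in self.pending:
            apply_op(self.view, p)

    def text(self):
        return "".join(char for _, char in self.view)


client = Client()
client.on_server_op(("insert", None, "a1", "a"))
local = ("insert", "a1", "b1", "b")
client.local_edit(local)                          # typed locally, not yet confirmed
client.on_server_op(("insert", "a1", "x1", "x"))  # remote edit arrives first
text_before_ack = client.text()
client.on_server_op(local)                        # server echoes our edit
text_after_ack = client.text()
print(text_before_ack, text_after_ack)
```

A failed request would just leave the operation in `pending`, so retrying it is safe as long as the server ignores an ID it has already seen (that deduplication is not shown here).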
From a frontend vantage I feel like there may be a long tail of underspecified UI/UX edge cases, such that a CRDT would be simpler overall. And how does the editor feel to use while riding the NYC subway where coverage is spotty?
Someone should try to use a local LLM (maybe a 4b) to merge the diffs in case of conflict beyond straightforward cases...
Not energy efficient but should work surprisingly well without CRDT, OT, or anything else.
Is collaborative text editing with offline sync a nerd snipe [0]? I work for a big tech company and write a lot, and usually the worst case is that someone else edits at the same time and the server can figure it out. Yes, it needs some kind of algorithm, but most concurrent edits are on different parts of a huge doc.
Compare this to Git workflows. Git already handles merging most changes seamlessly.
wonder if there would be a perf gain with UUIDv7
Does this finally solve collaborative text editing and its friends?
Such an awesome approach.
This is technically a CRDT. It's just that the "order of operations" to apply over the doc is now resolved by a central server. For context, this is exactly how Google Docs and Zoho Writer work currently, except that they use OT with central-server-based reconciliation, while the proposal uses a CRDT-like approach.
I agree this is more practical if your service runs on centralised servers anyway (aka the cloud).
That is very neat. The algorithm:
- Label each text character with a globally unique ID (e.g., a UUID), so that we can refer to it in a consistent way across time - instead of using an array index that changes constantly.
- Clients send the server “insert after” operations that reference an existing ID. The server looks up the target ID and inserts the new characters immediately after it.
- Deletion hides a character for display purposes, but it is still kept for "insert after" position purposes.
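The three steps above can be sketched in a few lines. This is a toy version under my own assumptions: a plain Python list with linear ID lookup, one character per insert, and `None` standing for "start of document". `Char`, `Document`, and `new_id` are hypothetical names, not the article's.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Char:
    id: str
    value: str
    deleted: bool = False  # tombstone flag: hidden, but still a valid anchor


def new_id() -> str:
    """Globally unique ID for a character (a random UUID here)."""
    return str(uuid.uuid4())


class Document:
    def __init__(self):
        self.chars = []  # list of Char, in document order

    def _index_of(self, char_id: str) -> int:
        for i, c in enumerate(self.chars):
            if c.id == char_id:
                return i
        raise KeyError(char_id)

    def insert_after(self, target_id: Optional[str], char_id: str, value: str):
        # None means "insert at the very start of the document".
        index = 0 if target_id is None else self._index_of(target_id) + 1
        self.chars.insert(index, Char(char_id, value))

    def delete(self, char_id: str):
        # Keep the character so later "insert after" operations can find it.
        self.chars[self._index_of(char_id)].deleted = True

    def text(self) -> str:
        return "".join(c.value for c in self.chars if not c.deleted)


doc = Document()
a, b, c, x = new_id(), new_id(), new_id(), new_id()
doc.insert_after(None, a, "a")
doc.insert_after(a, c, "c")
doc.insert_after(a, b, "b")   # lands between "a" and "c"
doc.delete(a)                 # "a" is hidden but stays in the list
doc.insert_after(a, x, "x")   # still works: anchored on the tombstone
print(doc.text())
```

The last insert is the interesting one: it references a deleted character, which only works because deletion keeps the tombstone around.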
This might have potential outside text editing. Game world synchronization, maybe.