Real-Time Collaboration in the Browser Using CRDTs

Team 6 min read

#webdev

#crdt

#real-time

#collaboration

#tutorial

Introduction

Real-time collaboration in the browser is now practical for apps ranging from document editors to shared whiteboards. CRDTs, or Conflict-free Replicated Data Types, enable multiple clients to edit a shared state and converge without heavy, centralized conflict resolution. This post walks through the basics, architectural patterns, and a small, client-side example to illustrate how CRDTs power live collaboration.

CRDTs 101

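CRDTs are data structures designed for distributed systems to merge divergent states deterministically. There are two broad families: state-based (convergent) CRDTs and operation-based (op-based) CRDTs. State-based CRDTs exchange full state, or deltas of it, and combine replicas with a merge function that is commutative, associative, and idempotent, so updates can arrive in any order and more than once. Op-based CRDTs propagate individual operations instead; concurrent operations must commute, and the transport must deliver every operation to every replica (typically exactly once, in causal order). Common building blocks include counters, registers, and sets; more advanced CRDTs enable ordered sequences and text editing. The key promise is strong eventual consistency: replicas that have received the same set of updates are in the same state, with no single point of coordination required.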
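
To make the state-based family concrete, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. The class and variable names are illustrative, not taken from any library. Each replica increments only its own slot, and merging takes the per-replica maximum.

```javascript
// State-based grow-only counter: one slot per replica, merge = element-wise max.
class GCounter {
  constructor(replicaId) {
    this.replicaId = replicaId;
    this.counts = new Map(); // replicaId -> increments seen from that replica
  }

  increment(n = 1) {
    const current = this.counts.get(this.replicaId) || 0;
    this.counts.set(this.replicaId, current + n);
  }

  value() {
    let total = 0;
    for (const n of this.counts.values()) total += n;
    return total;
  }

  // Taking the max per slot is commutative, associative, and idempotent,
  // so replicas can merge in any order and any number of times.
  merge(other) {
    for (const [id, n] of other.counts) {
      this.counts.set(id, Math.max(this.counts.get(id) || 0, n));
    }
  }
}

const alice = new GCounter('alice');
const bob = new GCounter('bob');
alice.increment(2);
bob.increment(3);

alice.merge(bob);
bob.merge(alice);
alice.merge(bob); // merging the same state again changes nothing

console.log(alice.value(), bob.value()); // 5 5
```

Because the merge is idempotent, a duplicated or replayed sync message is harmless, which is what makes state-based CRDTs tolerant of unreliable transports.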

Real-time collaboration in the browser: challenges

  • Latency and bandwidth: updates must propagate quickly, but users on slow networks expect a responsive UI.
  • Offline editing: users should continue working when disconnected and have changes merge cleanly on reconnect.
  • Convergence and ordering: ensuring all clients arrive at the same end state after many interleaved edits.
  • Security and access control: sharing a CRDT state raises concerns about who can read or write data.

CRDTs address convergence and offline merging by design. Latency, bandwidth, and access control are not solved by the data structure itself, so practical implementations must choose architectures and libraries that fit the app’s latency, storage, and security requirements.

Architectural patterns for browser-based collaboration

  • Central server with a CRDT store: Clients send deltas (or state) to a server, which merges and propagates updates to other clients. This is common for apps with authenticated users and centralized access control.
  • Peer-to-peer or edge-backed: Clients exchange updates directly over WebRTC, or through a lightweight relay (for example, a WebSocket endpoint on a serverless or edge platform) that forwards updates without owning the document. Useful for reducing central points of failure and latency.
  • Client-side CRDT libraries: Libraries like Yjs and Automerge provide CRDT implementations and helpers for syncing between clients. They abstract away many low-level details and integrate with frameworks and editors.

Common synchronization patterns:

  • Delta-based syncing: send small changes (deltas) rather than entire state to save bandwidth.
  • State merges: when two clients diverge, merge their CRDT states deterministically to achieve convergence.
  • Presence and metadata: track who is online and cursors, separate from the document state, to avoid conflating user metadata with content edits.
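
One way to picture delta-based syncing is with a version vector: each replica remembers the highest sequence number it has seen from every client, and a peer sends only the operations beyond that. The sketch below is a simplified illustration under that assumption; `Replica`, `localOp`, and `deltaFor` are made-up names, and it assumes each client's operations arrive in sequence order.

```javascript
// Each replica keeps a log of operations tagged with (clientId, seq).
class Replica {
  constructor(clientId) {
    this.clientId = clientId;
    this.seq = 0;
    this.log = [];           // every op this replica knows about
    this.vector = new Map(); // clientId -> highest seq seen from that client
  }

  // Record a local operation and return it so it can be sent to peers.
  localOp(payload) {
    const op = { clientId: this.clientId, seq: ++this.seq, payload };
    this.apply(op);
    return op;
  }

  // Apply an op once; already-seen ops are ignored.
  // Assumption: ops from a given client arrive in seq order.
  apply(op) {
    const seen = this.vector.get(op.clientId) || 0;
    if (op.seq <= seen) return false;
    this.log.push(op);
    this.vector.set(op.clientId, op.seq);
    return true;
  }

  // Delta: only the ops the remote vector has not seen yet.
  deltaFor(remoteVector) {
    return this.log.filter(
      (op) => op.seq > (remoteVector.get(op.clientId) || 0)
    );
  }
}

const laptop = new Replica('laptop');
const phone = new Replica('phone');
laptop.localOp({ insert: 'H', at: 0 });
laptop.localOp({ insert: 'i', at: 1 });
phone.localOp({ insert: '!', at: 0 });

// The phone shares its vector; the laptop answers with what is missing.
const delta = laptop.deltaFor(phone.vector);
delta.forEach((op) => phone.apply(op));

console.log(delta.length, laptop.deltaFor(phone.vector).length); // 2 0
```

Production libraries encode this far more compactly, but the exchange has the same shape: share what you have seen, receive only what you lack.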

A simple CRDT library example (client-side)

Below is a tiny, self-contained OR-Set (Observed-Remove Set) CRDT in JavaScript. Every add carries a unique tag and every remove names the tag it cancels, so merging two replicas yields the same view regardless of merge order.

// A minimal OR-Set (client-side)
class ORSet {
  constructor() {
    this.adds = new Map();    // element -> Set<id>
    this.removes = new Map(); // element -> Set<id>
  }

  // Add an element with a unique id (e.g., client + counter)
  add(element, id) {
    if (!this.adds.has(element)) this.adds.set(element, new Set());
    this.adds.get(element).add(id);
  }

  // Remove a specific add-tag for an element
  remove(element, id) {
    if (!this.removes.has(element)) this.removes.set(element, new Set());
    this.removes.get(element).add(id);
  }

  // Compute current value (elements with at least one add-id not removed)
  value() {
    const result = [];
    for (const [element, addIds] of this.adds.entries()) {
      const remIds = this.removes.get(element) || new Set();
      for (const id of addIds) {
        if (!remIds.has(id)) {
          result.push(element);
          break;
        }
      }
    }
    return result;
  }

  // Merge another ORSet into this one
  merge(other) {
    // merge adds
    for (const [element, addIds] of other.adds.entries()) {
      if (!this.adds.has(element)) this.adds.set(element, new Set());
      const mySet = this.adds.get(element);
      for (const id of addIds) mySet.add(id);
    }
    // merge removes
    for (const [element, remIds] of other.removes.entries()) {
      if (!this.removes.has(element)) this.removes.set(element, new Set());
      const myRem = this.removes.get(element);
      for (const id of remIds) myRem.add(id);
    }
  }
}

// Example usage
const a = new ORSet();
a.add('paragraph-1', 'clientA-1');
a.add('paragraph-2', 'clientA-2');

const b = new ORSet();
b.merge(a); // b must observe a's tags before it can remove one of them
b.add('paragraph-1', 'clientB-3');
b.remove('paragraph-2', 'clientA-2');

a.merge(b);
console.log(a.value()); // ['paragraph-1'] (paragraph-2's only tag, clientA-2, was removed)

This tiny example demonstrates the core idea: adds are tagged, removes target specific tags, and merges reconcile divergent histories deterministically. Real-world apps typically use more feature-rich CRDTs (text edits, arrays, presence) provided by libraries like Yjs or Automerge, which handle more complex data types and optimizations.
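
The class above asks the caller to pass tags explicitly. A more conventional OR-Set generates tags itself and has remove cancel every tag the replica has observed so far, which gives "add-wins" behavior when an add and a remove happen concurrently. The following is a compact sketch of that variant; `ObservedRemoveSet` and its method names are illustrative.

```javascript
// OR-Set variant: remove() takes no tag and tombstones every tag seen locally.
class ObservedRemoveSet {
  constructor(replicaId) {
    this.replicaId = replicaId;
    this.counter = 0;
    this.adds = new Map();       // element -> Set<tag>
    this.tombstones = new Set(); // tags that have been removed
  }

  add(element) {
    const tag = `${this.replicaId}-${++this.counter}`; // unique per replica
    if (!this.adds.has(element)) this.adds.set(element, new Set());
    this.adds.get(element).add(tag);
  }

  // Only tags this replica has observed are cancelled.
  remove(element) {
    for (const tag of this.adds.get(element) || []) this.tombstones.add(tag);
  }

  has(element) {
    for (const tag of this.adds.get(element) || []) {
      if (!this.tombstones.has(tag)) return true;
    }
    return false;
  }

  merge(other) {
    for (const [element, tags] of other.adds) {
      if (!this.adds.has(element)) this.adds.set(element, new Set());
      for (const tag of tags) this.adds.get(element).add(tag);
    }
    for (const tag of other.tombstones) this.tombstones.add(tag);
  }
}

const x = new ObservedRemoveSet('x');
const y = new ObservedRemoveSet('y');
x.add('todo');
y.merge(x);       // y observes tag x-1

x.remove('todo'); // x cancels the only tag it has seen (x-1)
y.add('todo');    // concurrently, y re-adds with a fresh tag (y-1)

x.merge(y);
y.merge(x);
console.log(x.has('todo'), y.has('todo')); // true true
```

The remove on `x` never saw tag `y-1`, so it cannot cancel it. Both replicas agree that the element is present after merging.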

Integrating with a real-world editor

For practical apps, pairing a CRDT engine with a text or rich-content editor is common. Libraries such as Yjs expose CRDT-backed data types (Y.Text, Y.Array) and connect with frameworks via bindings. A typical setup:

  • Client maintains a CRDT document in memory.
  • User edits update the CRDT locally (offline first).
  • Deltas are sent to peers or a server, which merges them into the shared document.
  • Other clients receive updates and apply them to their local CRDT, resulting in real-time convergence.
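
The flow in the list above can be sketched without any library. The example below uses a hypothetical in-memory `Hub` in place of a WebSocket server, and a toy last-writer-wins map instead of a real text CRDT such as Y.Text, so treat it as an illustration of the message flow rather than of an editor binding.

```javascript
// Hypothetical in-memory hub standing in for a WebSocket server or WebRTC mesh.
class Hub {
  constructor() {
    this.clients = new Set();
  }
  join(client) {
    this.clients.add(client);
  }
  broadcast(sender, update) {
    for (const client of this.clients) {
      if (client !== sender) client.receive(update);
    }
  }
}

// Toy document: a last-writer-wins map keyed by field name.
class Client {
  constructor(id, hub) {
    this.id = id;
    this.hub = hub;
    this.clock = 0;          // Lamport clock
    this.fields = new Map(); // key -> { key, value, clock, id }
    hub.join(this);
  }

  // Apply the edit locally first, then send the delta to everyone else.
  edit(key, value) {
    const update = { key, value, clock: ++this.clock, id: this.id };
    this.receive(update);
    this.hub.broadcast(this, update);
  }

  // Merge an update: higher clock wins, ties are broken by client id.
  receive(update) {
    this.clock = Math.max(this.clock, update.clock);
    const current = this.fields.get(update.key);
    const wins =
      !current ||
      update.clock > current.clock ||
      (update.clock === current.clock && update.id > current.id);
    if (wins) this.fields.set(update.key, update);
  }

  get(key) {
    return this.fields.get(key)?.value;
  }
}

const hub = new Hub();
const ann = new Client('ann', hub);
const ben = new Client('ben', hub);

ann.edit('title', 'Draft');
ben.edit('title', 'Final');

console.log(ann.get('title'), ben.get('title')); // Final Final
```

With a library such as Yjs, the map would be replaced by the library's shared types and the hub by one of its network providers; the local-first ordering of steps stays the same.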

Handling offline usage and conflict resolution

  • Offline-first: keep a local CRDT copy and queue outgoing updates until connectivity returns.
  • Deterministic merges: CRDTs ensure that independent edits converge the same way across clients, removing the need for complex server-side conflict resolution.
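  • Reconciliation: on reconnect, each side sends the updates the other is missing and merges what it receives. Because merges are commutative and idempotent, local edits are preserved and remote changes integrate regardless of arrival order or duplicates.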
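
A minimal sketch of the outgoing queue might look like the following. `OfflineQueue` and the `send` callback are hypothetical; `send` is assumed to throw when transmission fails.

```javascript
// Buffers outgoing updates while offline and flushes them in order on reconnect.
class OfflineQueue {
  constructor(send) {
    this.send = send;   // function(update): assumed to throw on failure
    this.online = false;
    this.pending = [];
  }

  push(update) {
    this.pending.push(update);
    if (this.online) this.flush();
  }

  setOnline(online) {
    this.online = online;
    if (online) this.flush();
  }

  flush() {
    while (this.pending.length > 0) {
      try {
        this.send(this.pending[0]);
      } catch {
        this.online = false; // keep the update queued and retry later
        return;
      }
      this.pending.shift(); // drop an update only after it was sent
    }
  }
}

const sent = [];
const queue = new OfflineQueue((update) => sent.push(update));

queue.push({ op: 'add', element: 'paragraph-3' }); // still offline
queue.push({ op: 'add', element: 'paragraph-4' });
console.log(sent.length); // 0

queue.setOnline(true);
console.log(sent.length, queue.pending.length); // 2 0
```

In a browser, `setOnline` could be wired to the window's `online` and `offline` events or to the socket's open and close handlers. A retry may resend an update that already arrived, which is safe as long as the receiving merge is idempotent.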

Observability and debugging

  • Log operation histories: track adds/removes and merges to understand how the state evolves.
  • Use synthetic partitions: simulate delays and disconnections during testing to observe convergence behavior.
  • Visualize state: render a summary of the current CRDT state and the derived visible content to verify correctness.
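
Synthetic partitions can be approximated in a unit test by delivering the same updates in many random orders, with duplicates, and checking that every run ends in the same state. The sketch below does this for a toy grow-only set; the helper names are illustrative, and the seeded generator is there only so that a failing run can be reproduced.

```javascript
// Small seeded PRNG (mulberry32) returning numbers in [0, 1).
function mulberry32(seed) {
  return function () {
    let t = (seed += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy of the input.
function shuffle(items, rand) {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Toy state-based CRDT: a grow-only set whose merge is set union.
const mergeSets = (left, right) => new Set([...left, ...right]);

// Returns true if every randomized delivery order ends in the same state.
function converges(states, rounds, seed) {
  const rand = mulberry32(seed);
  const outcomes = new Set();
  for (let r = 0; r < rounds; r++) {
    // Deliver every state twice, in random order, to mimic reordering
    // and duplicate delivery after a partition heals.
    const deliveries = shuffle([...states, ...states], rand);
    const final = deliveries.reduce(mergeSets, new Set());
    outcomes.add([...final].sort().join(','));
  }
  return outcomes.size === 1;
}

const replicaStates = [
  new Set(['a']),
  new Set(['b', 'c']),
  new Set(['a', 'd']),
];
console.log(converges(replicaStates, 50, 42)); // true
```

Swapping `mergeSets` for the merge function of the CRDT under test turns this into a quick convergence check.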

Performance considerations

  • Payload size: delta-based syncing reduces bandwidth; avoid sending whole state on every update.
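  • Memory usage: CRDT metadata grows with every unique tag, even for content that has since been deleted, so long-lived documents can outgrow their visible content.
  • Garbage collection: tombstones (removed tags) can only be discarded safely once every replica has seen the removal; until then, use compact representations where possible.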
  • Library choice: battle-tested libraries like Yjs provide optimizations, bindings, and tooling for production use.

Security and privacy considerations

  • Encrypt data in transit: use TLS for all sync traffic (wss:// for WebSockets; WebRTC data channels are encrypted with DTLS).
  • Access control: enforce read and write permissions on the server or relay, not only in the client. Client-side checks can be bypassed, and a CRDT merges any well-formed update it receives.
  • Minimize exposure: only share the CRDT state necessary for collaboration; avoid leaking unrelated metadata.
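
As a rough illustration of server-side enforcement, the sketch below checks a permission table before an update is merged. The `acl` structure, role names, and `handleUpdate` function are all hypothetical; a real app would take the user identity from an authenticated session.

```javascript
// Hypothetical access-control list: docId -> (userId -> role).
const acl = new Map([
  ['doc-1', new Map([['ann', 'editor'], ['ben', 'viewer']])],
]);

function canRead(userId, docId) {
  const role = acl.get(docId)?.get(userId);
  return role === 'editor' || role === 'viewer';
}

function canWrite(userId, docId) {
  return acl.get(docId)?.get(userId) === 'editor';
}

// Reject the update before it reaches the CRDT; merging cannot be undone cheaply.
function handleUpdate(userId, docId, update, applyUpdate) {
  if (!canWrite(userId, docId)) {
    return { ok: false, reason: 'forbidden' };
  }
  applyUpdate(update);
  return { ok: true };
}

const applied = [];
const apply = (update) => applied.push(update);

console.log(handleUpdate('ann', 'doc-1', { op: 'add' }, apply).ok); // true
console.log(handleUpdate('ben', 'doc-1', { op: 'add' }, apply).ok); // false
console.log(applied.length); // 1
```

The check runs before the merge because a merged update is propagated to other replicas and is hard to retract afterwards.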

Conclusion

Real-time browser collaboration is achievable with CRDTs, combining deterministic convergence with flexible architectures. By choosing the right CRDT type for your data, leveraging library-backed synchronization, and designing for offline-first usage, you can deliver responsive, conflict-free collaborative experiences without fragile central locking or complex conflict resolution logic.