How to Implement Offline-First Sync Systems Like Notion

Team 5 min read

#offline-first

#webdev

#tutorial

#collaboration

Overview

Offline-first synchronization lets users continue working even when the network is unavailable, then seamlessly merge changes when connectivity returns. In a Notion-like editor, this means local edits, block-level updates, and distributed changes must converge without overwriting user intent. This post outlines a practical approach to implementing offline-first sync with strong conflict handling, robust data modeling, and a smooth user experience.

Core concepts

Local-first data store: All edits are captured locally before being synced, ensuring fast feedback and uninterrupted work.
Conflict-free data types: CRDTs (Conflict-Free Replicated Data Types) let multiple concurrent edits converge deterministically.
Append-only operation logs: Changes are recorded as ops or deltas with causal metadata to facilitate reconciliation.
Block-based document model: Notion-style editors organize content as blocks; syncing operates at the block or sub-block level to preserve structure.
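To make the convergence idea concrete, here is a minimal sketch of a state-based CRDT: a last-writer-wins map keyed by block ID. The `BlockValue` shape, the function names, and the tie-break rule (higher Lamport clock, then actor ID) are assumptions made for illustration and are not taken from any particular library.

```typescript
// Hypothetical shapes for illustration; not taken from any library.
interface BlockValue {
  text: string;
  clock: number; // Lamport clock of the write
  actor: string; // replica that made the write; breaks clock ties
}

type BlockMap = Map<string, BlockValue>; // block ID -> latest value

// True when `a` should replace `b`: higher clock first, then actor ID as tie-breaker.
function wins(a: BlockValue, b: BlockValue): boolean {
  return a.clock !== b.clock ? a.clock > b.clock : a.actor > b.actor;
}

// State-based merge: keep the winning value per block. Inputs are not mutated.
function mergeBlocks(left: BlockMap, right: BlockMap): BlockMap {
  const merged: BlockMap = new Map(left);
  right.forEach((value, id) => {
    const current = merged.get(id);
    if (current === undefined || wins(value, current)) merged.set(id, value);
  });
  return merged;
}

// Canonical form for comparing replicas: entries sorted by block ID.
function snapshot(blocks: BlockMap): string {
  const ids = Array.from(blocks.keys()).sort();
  return JSON.stringify(ids.map((id) => [id, blocks.get(id)]));
}

// Two replicas edit block "b1" concurrently while offline.
const replicaA = new Map<string, BlockValue>([
  ["b1", { text: "Hello from A", clock: 2, actor: "A" }],
]);
const replicaB = new Map<string, BlockValue>([
  ["b1", { text: "Hello from B", clock: 2, actor: "B" }],
  ["b2", { text: "New block", clock: 1, actor: "B" }],
]);

// Merging in either order yields the same state.
const ab = mergeBlocks(replicaA, replicaB);
const ba = mergeBlocks(replicaB, replicaA);
console.log(snapshot(ab) === snapshot(ba)); // true
console.log(ab.get("b1")?.text); // Hello from B
```

Note that last-writer-wins converges by discarding the losing edit for a block. That trade-off is why text-heavy editors usually rely on a sequence CRDT that merges within a block instead.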

Architecture patterns

Local database and change log: Use a local store (e.g., IndexedDB) to persist documents, blocks, and a history of operations.
Sync engine: A background service or worker handles outbound/inbound changes, retries, and backoff.
Server-side merge: A central service applies incoming changes, resolves conflicts using CRDTs or OT-like strategies, and broadcasts merged updates.
Data model separation: Distinguish document state (blocks, metadata) from synchronization state (sequence numbers, actor IDs).
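The retry and backoff behavior of the sync engine can be sketched as two small pure functions. This is one possible shape under stated assumptions: the names `backoffDelayMs` and `flushQueue`, the 500 ms base, and the 30 s cap are all illustrative, and the transport is simulated with a plain callback rather than a real network call.

```typescript
// Assumed options for illustration; tune the values for your own traffic.
interface BackoffOptions {
  baseMs: number; // delay ceiling for the first retry
  maxMs: number; // upper bound on any delay
  random?: () => number; // injectable for tests; defaults to Math.random
}

// "Full jitter" backoff: a random delay below min(maxMs, baseMs * 2^attempt).
function backoffDelayMs(attempt: number, opts: BackoffOptions): number {
  const random = opts.random ?? Math.random;
  const ceiling = Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Sends queued ops in order and stops at the first failure so ordering is preserved.
// Returns the ops that are still pending.
function flushQueue<T>(pending: readonly T[], send: (op: T) => boolean): T[] {
  let sent = 0;
  for (const op of pending) {
    if (!send(op)) break;
    sent += 1;
  }
  return pending.slice(sent);
}

// Simulated flaky transport: only its second call fails.
let calls = 0;
const flakySend = (_op: string): boolean => {
  calls += 1;
  return calls !== 2;
};

let outbox: string[] = ["op-1", "op-2", "op-3"];
let attempt = 0;
const waits: number[] = [];
while (outbox.length > 0) {
  const before = outbox.length;
  outbox = flushQueue(outbox, flakySend);
  if (outbox.length === 0) break;
  // Reset the backoff after progress, otherwise grow it.
  attempt = outbox.length < before ? 0 : attempt + 1;
  waits.push(backoffDelayMs(attempt, { baseMs: 500, maxMs: 30000, random: () => 0.5 }));
  // A real engine would wait for that delay (or an "online" event) before retrying.
}
console.log(waits); // [ 250 ]
```

Keeping the delay calculation separate from the timer makes the policy easy to unit test, and the jitter helps avoid many clients retrying at the same moment after an outage.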

Data model decisions

Block-based documents: Represent documents as a tree of blocks (paragraphs, headings, lists, media). Each block has a stable ID, parent linkage, and version vector.
Identifiers: Generate globally unique IDs for blocks and documents to avoid clashes during offline edits.
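Versioning: Track causal metadata (actor ID, logical clock, timestamp) to order edits and detect conflicts. Order edits by the logical clock; device clocks drift, so wall-clock timestamps are best kept for display.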
Rich metadata: Attach provenance data (who edited what and when) to simplify UX for conflict resolution.
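A minimal sketch of these decisions follows. The `Block` shape, the `Actor` class, and the `actorId:counter` ID scheme are assumptions for illustration; the sketch stamps each block with a single Lamport clock rather than a full version vector to stay short, and it assumes actor IDs are themselves unique per device.

```typescript
// Hypothetical block shape for illustration.
type BlockType = "paragraph" | "heading" | "list" | "media";

interface Block {
  id: string; // globally unique, stable for the block's lifetime
  parentId: string | null; // null for top-level blocks
  type: BlockType;
  text: string;
  editedBy: string; // provenance: last actor to edit
  clock: number; // Lamport clock of the last edit
}

// Per-device ID and clock source. IDs combine the actor ID with a local counter,
// so two devices can create blocks offline without clashing.
class Actor {
  readonly actorId: string;
  private counter = 0;
  private clock = 0;

  constructor(actorId: string) {
    this.actorId = actorId;
  }

  nextId(): string {
    this.counter += 1;
    return `${this.actorId}:${this.counter}`;
  }

  // Lamport rule: move past everything seen so far before stamping a local edit.
  tick(seen: number = 0): number {
    this.clock = Math.max(this.clock, seen) + 1;
    return this.clock;
  }

  createBlock(type: BlockType, text: string, parentId: string | null = null): Block {
    return { id: this.nextId(), parentId, type, text, editedBy: this.actorId, clock: this.tick() };
  }
}

// Look up the direct children of a block (or the top level when parentId is null).
function childrenOf(blocks: readonly Block[], parentId: string | null): Block[] {
  return blocks.filter((b) => b.parentId === parentId);
}

const laptop = new Actor("laptop");
const phone = new Actor("phone");
const heading = laptop.createBlock("heading", "Roadmap");
const item1 = laptop.createBlock("list", "Ship offline mode", heading.id);
const item2 = phone.createBlock("list", "Write tests", heading.id); // created on another device
const doc = [heading, item1, item2];
console.log(childrenOf(doc, heading.id).map((b) => b.id)); // [ 'laptop:2', 'phone:1' ]
```

Random UUIDs would work equally well for IDs; the counter-based scheme is used here only because it is deterministic and easy to read in logs.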

Conflict resolution strategies

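CRDT-based convergence: Use a CRDT library to automatically merge concurrent edits at the block level. Prefer operations whose merge is commutative, associative, and idempotent, so replicas converge regardless of delivery order or duplicate delivery and little manual conflict handling is needed.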
Intent-preserving merges: When automatic merging is not possible, surface conflicts in the UI with clear options (keep local, accept remote, or merge manually).
Graceful degradation: If a conflict cannot be resolved automatically, keep both versions of the block in an explicit conflict state until the user picks one, so neither edit is silently lost.
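When a merge cannot be automatic, the client first has to detect that two edits were truly concurrent. A minimal sketch using version vectors is shown below; the `BlockVersion` and `Resolution` shapes and the function names are hypothetical, and the "keep local / accept remote" choice mirrors the UI options described above.

```typescript
type VersionVector = Record<string, number>; // actor ID -> edits seen from that actor
type Ordering = "equal" | "before" | "after" | "concurrent";

function actorsOf(a: VersionVector, b: VersionVector): string[] {
  return Array.from(new Set([...Object.keys(a), ...Object.keys(b)]));
}

// "before" means `a` is causally older than `b`; "concurrent" means neither has seen the other.
function compareVectors(a: VersionVector, b: VersionVector): Ordering {
  let aAhead = false;
  let bAhead = false;
  for (const id of actorsOf(a, b)) {
    const av = a[id] ?? 0;
    const bv = b[id] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  return bAhead ? "before" : "equal";
}

interface BlockVersion {
  text: string;
  version: VersionVector;
}

type Resolution =
  | { kind: "resolved"; value: BlockVersion }
  | { kind: "conflict"; local: BlockVersion; remote: BlockVersion };

function reconcile(local: BlockVersion, remote: BlockVersion): Resolution {
  const order = compareVectors(local.version, remote.version);
  if (order === "concurrent") return { kind: "conflict", local, remote };
  // The side that has already seen the other's edits wins outright.
  return { kind: "resolved", value: order === "before" ? remote : local };
}

// Apply the user's choice. The result dominates both inputs, so it syncs as a normal edit.
function resolveConflict(
  local: BlockVersion,
  remote: BlockVersion,
  choice: "keep-local" | "accept-remote",
  actor: string,
): BlockVersion {
  const version: VersionVector = {};
  for (const id of actorsOf(local.version, remote.version)) {
    version[id] = Math.max(local.version[id] ?? 0, remote.version[id] ?? 0);
  }
  version[actor] = (version[actor] ?? 0) + 1;
  return { text: choice === "keep-local" ? local.text : remote.text, version };
}

const localEdit: BlockVersion = { text: "Ship on Friday", version: { laptop: 2, phone: 1 } };
const remoteEdit: BlockVersion = { text: "Ship on Monday", version: { laptop: 1, phone: 2 } };
const outcome = reconcile(localEdit, remoteEdit);
console.log(outcome.kind); // conflict
const chosen = resolveConflict(localEdit, remoteEdit, "accept-remote", "laptop");
console.log(compareVectors(chosen.version, remoteEdit.version)); // after
```

Because the resolved version is causally after both inputs, other replicas can accept it without seeing the conflict again.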

Synchronization protocol

Local change capture: Record edits as operations in a local log with metadata (author, timestamp, parent IDs).
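Outbound sync: Push changes to the server when online; batch and compress ops, and attach a compact summary of what the client already has (for example a state vector or the last acknowledged sequence number) rather than a full snapshot of the document.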
Inbound sync: Receive merged changes from the server, apply them locally, and resolve any conflicts between devices using the same CRDT rules.
Causal delivery: Ensure changes are applied in a causally consistent order so block relationships stay intact; for example, an edit to a block must not be applied before the op that created it.
Connectivity handling: Implement exponential backoff, offline queueing, and reliable retries to tolerate intermittent networks.
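Causal delivery is the step that is easiest to get subtly wrong, so here is a minimal sketch of a buffer that holds an op back until everything it depends on has been applied. The `CausalOp` shape, with an explicit `deps` list of op IDs, is an assumption for illustration; real systems often derive dependencies from version vectors instead.

```typescript
// Hypothetical op shape: each op lists the op IDs it depends on.
interface CausalOp {
  id: string; // unique op ID
  deps: string[]; // ops that must be applied first
  describe: string; // stand-in for the real payload
}

class CausalBuffer {
  private applied = new Set<string>();
  private waiting: CausalOp[] = [];
  private onApply: (op: CausalOp) => void;

  constructor(onApply: (op: CausalOp) => void) {
    this.onApply = onApply;
  }

  // Accepts ops in any order; duplicates are ignored.
  receive(op: CausalOp): void {
    if (this.applied.has(op.id) || this.waiting.some((w) => w.id === op.id)) return;
    this.waiting.push(op);
    this.drain();
  }

  pendingCount(): number {
    return this.waiting.length;
  }

  // Repeatedly apply any waiting op whose dependencies are all satisfied.
  private drain(): void {
    for (;;) {
      const ready = this.waiting.find((op) => op.deps.every((d) => this.applied.has(d)));
      if (ready === undefined) return;
      this.waiting = this.waiting.filter((op) => op !== ready);
      this.applied.add(ready.id);
      this.onApply(ready);
    }
  }
}

const appliedOrder: string[] = [];
const buffer = new CausalBuffer((op) => {
  appliedOrder.push(op.id);
});

// Ops arrive in reverse order over a flaky network.
buffer.receive({ id: "op3", deps: ["op2"], describe: "edit child block" });
buffer.receive({ id: "op2", deps: ["op1"], describe: "create child block" });
console.log(buffer.pendingCount()); // 2
buffer.receive({ id: "op1", deps: [], describe: "create parent block" });
console.log(appliedOrder); // [ 'op1', 'op2', 'op3' ]
```

The linear scan keeps the sketch short. For large backlogs, an index from missing dependency to waiting ops would avoid rescanning the whole list.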

Implementation stack (practical options)

Client: Any UI framework, such as React or Astro, with a UI that reflects real-time edits and conflict status.
Local storage: IndexedDB (for complex objects) or SQLite via a WASM layer if you need richer queries.
Sync transport: WebSocket for real-time updates or HTTP long-polling for server-triggered pushes; consider WebRTC for decentralized setups.
Synchronization core: CRDT libraries such as Yjs or Automerge for deterministic convergence.
Server: A merge service that applies incoming CRDT updates, stores document state, and broadcasts merged changes to all clients.
Security: Encrypt data at rest and in transit; manage keys to protect sensitive blocks or documents.

Example integration pattern

Use Yjs for CRDT merging and IndexedDB for local persistence.
Initialize a Y.Doc in each client, attach a WebSocket to exchange updates, and persist each incremental update to IndexedDB rather than rewriting the whole document on every change.

Pseudo-workflow:

User edits a block -> Yjs applies the local change and emits an update.
Update is serialized and saved to the local oplog in IndexedDB.
Update is sent to the server via WebSocket, or stays queued while the client is offline.
Server merges updates into a canonical state (using CRDT semantics) and broadcasts the update to the other clients.
Clients apply inbound updates to their local Y.Doc and persist them to IndexedDB. Yjs updates are commutative and idempotent, so they can be applied in any order and more than once.

Code sketch (conceptual):

Initialize a Yjs document (Y.Doc) and bind it to the UI.
Persist each incremental update to IndexedDB as it is emitted.
Listen for remote updates and apply them to the Y.Doc.

Note: This is a high-level sketch. When implementing, tailor the APIs to your tech stack and ensure correct handling of block relationships and nested structures.
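The workflow can be exercised end to end without any library. The sketch below is a dependency-free simulation, not Yjs code: a simple last-writer-wins rule stands in for the CRDT merge, an in-memory array stands in for the IndexedDB oplog, and direct method calls stand in for the WebSocket. All class and field names (`RelayServer`, `SimClient`, `EditOp`) are hypothetical.

```typescript
// Hypothetical wire format: "set the text of a block".
interface EditOp {
  opId: string; // `${actor}:${seq}`, unique per op
  blockId: string;
  text: string;
  clock: number; // Lamport clock, used to pick a winner
  actor: string;
}

class RelayServer {
  private log: EditOp[] = []; // canonical append-only log
  private seen = new Set<string>();
  private clients: SimClient[] = [];

  connect(client: SimClient): void {
    this.clients.push(client);
  }

  // Store unseen ops and forward each one to the other clients that are online.
  submit(ops: EditOp[], from: SimClient): void {
    for (const op of ops) {
      if (this.seen.has(op.opId)) continue;
      this.seen.add(op.opId);
      this.log.push(op);
      for (const client of this.clients) {
        if (client !== from && client.online) client.receive(op);
      }
    }
  }

  // Resend the whole log on reconnect; a real server would send only the missing part.
  catchUp(client: SimClient): void {
    for (const op of this.log) client.receive(op);
  }
}

class SimClient {
  online = true;
  readonly oplog: EditOp[] = []; // stand-in for the IndexedDB oplog
  private clock = 0;
  private seq = 0;
  private unsent: EditOp[] = [];
  private applied = new Set<string>();
  private blocks = new Map<string, EditOp>(); // block ID -> winning op
  private actor: string;
  private server: RelayServer;

  constructor(actor: string, server: RelayServer) {
    this.actor = actor;
    this.server = server;
    server.connect(this);
  }

  edit(blockId: string, text: string): void {
    this.clock += 1;
    this.seq += 1;
    const op: EditOp = { opId: `${this.actor}:${this.seq}`, blockId, text, clock: this.clock, actor: this.actor };
    this.receive(op); // apply locally first for instant feedback
    this.unsent.push(op);
    this.flush();
  }

  // Idempotent apply: duplicates are ignored, the higher (clock, actor) wins per block.
  receive(op: EditOp): void {
    if (this.applied.has(op.opId)) return;
    this.applied.add(op.opId);
    this.oplog.push(op);
    this.clock = Math.max(this.clock, op.clock);
    const cur = this.blocks.get(op.blockId);
    if (!cur || op.clock > cur.clock || (op.clock === cur.clock && op.actor > cur.actor)) {
      this.blocks.set(op.blockId, op);
    }
  }

  setOnline(online: boolean): void {
    this.online = online;
    if (online) {
      this.flush();
      this.server.catchUp(this);
    }
  }

  text(blockId: string): string | undefined {
    return this.blocks.get(blockId)?.text;
  }

  private flush(): void {
    if (!this.online) return;
    const batch = this.unsent;
    this.unsent = [];
    this.server.submit(batch, this);
  }
}

const server = new RelayServer();
const laptop = new SimClient("laptop", server);
const phone = new SimClient("phone", server);

laptop.edit("b1", "Meeting notes"); // relayed to the phone immediately
phone.setOnline(false);
phone.edit("b1", "Meeting notes (phone)"); // queued while offline
laptop.edit("b2", "Agenda"); // the phone misses this for now
phone.setOnline(true); // flush queued ops, then catch up

console.log(laptop.text("b1"), "|", phone.text("b1")); // Meeting notes (phone) | Meeting notes (phone)
console.log(laptop.text("b2"), "|", phone.text("b2")); // Agenda | Agenda
```

Swapping in a real CRDT library would replace the `receive` logic and the `EditOp` format, while the queue-while-offline and catch-up-on-reconnect structure stays much the same.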

UI and UX considerations

Conflict indicators: Clearly flag blocks with unresolved conflicts and offer inline resolution actions.
Offline indicators: Show connectivity status and the pending change count.
Previews and snapshots: Let users preview how the merged result will look after sync, before applying it.
Performance feedback: Debounce updates to the UI during large edits to avoid jank.

Testing and observability

Network partition testing: Simulate long offline periods and flaky networks to verify reconciliation correctness.
Deterministic replays: Reproduce issues by replaying a known sequence of edits and verifying the final state matches expectations.
Metrics: Track time-to-merge, conflict rate, and divergence between clients.
Logging: Emit structured logs for local edits, outbound/inbound changes, and merge decisions to facilitate debugging.
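Deterministic replay is straightforward to sketch: replay a recorded op sequence in many shuffled orders, driven by a seeded random number generator, and report the first seed whose result differs. Everything here is an assumption for illustration, including the `ReplayOp` shape, the helper names, and the last-writer-wins replay function that stands in for the real merge logic.

```typescript
interface ReplayOp {
  blockId: string;
  text: string;
  clock: number;
  actor: string;
}

// Small seeded PRNG so a failing order can be reproduced from its seed alone.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy, using the injected random source.
function shuffled<T>(items: readonly T[], random: () => number): T[] {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = out[i];
    out[i] = out[j];
    out[j] = tmp;
  }
  return out;
}

// Stand-in merge under test: last writer wins per block, ties broken by actor ID.
function replayLww(ops: readonly ReplayOp[]): string {
  const blocks = new Map<string, ReplayOp>();
  for (const op of ops) {
    const cur = blocks.get(op.blockId);
    if (!cur || op.clock > cur.clock || (op.clock === cur.clock && op.actor > cur.actor)) {
      blocks.set(op.blockId, op);
    }
  }
  const ids = Array.from(blocks.keys()).sort();
  return JSON.stringify(ids.map((id) => [id, blocks.get(id)?.text]));
}

// Returns the first seed whose shuffled replay diverges, or null if all runs agree.
function findDivergingSeed(
  ops: readonly ReplayOp[],
  runs: number,
  replay: (ops: readonly ReplayOp[]) => string,
): number | null {
  const expected = replay(ops);
  for (let seed = 1; seed <= runs; seed++) {
    if (replay(shuffled(ops, seededRandom(seed))) !== expected) return seed;
  }
  return null;
}

const recordedOps: ReplayOp[] = [
  { blockId: "b1", text: "Draft", clock: 1, actor: "laptop" },
  { blockId: "b1", text: "Draft, edited on phone", clock: 2, actor: "phone" },
  { blockId: "b1", text: "Draft, edited on laptop", clock: 2, actor: "laptop" },
  { blockId: "b2", text: "Todo", clock: 3, actor: "laptop" },
];
console.log(replayLww(recordedOps)); // [["b1","Draft, edited on phone"],["b2","Todo"]]
console.log(findDivergingSeed(recordedOps, 200, replayLww)); // null
```

A merge function that depends on arrival order should surface a diverging seed here, and that seed can be attached to a bug report to reproduce the exact ordering.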

Security and privacy considerations

Data at rest: Encrypt sensitive document blocks locally.
Data in transit: Use TLS for all sync traffic; it provides both encryption and integrity checking.
Access control: Enforce per-document permissions and block-level access rules.
Auditability: Preserve a tamper-evident log of edits to support traceability and rollback if needed.

Putting it together: a practical blueprint

Model documents as block trees with stable IDs and causal metadata.
Implement a local CRDT-based core to merge concurrent edits deterministically.
Persist state and change logs in a local store (IndexedDB) with a clean API surface.
Establish a reliable, authenticated sync channel to a central server that performs CRDT-based merges.
Build UX around clear conflict resolution, offline status, and seamless re-sync.
Add testing harnesses that simulate offline work, concurrent edits, and network failures.
Prioritize security with encryption, access controls, and auditing.

Conclusion

Offline-first sync systems enable powerful, responsive collaboration akin to Notion by combining a block-based data model, CRDT-based convergence, and careful UX design. By structuring the architecture around local edits, deterministic merges, and robust synchronization pipelines, you can deliver a seamless experience that remains consistent and conflict-resilient across devices and networks.
