CRDTs for Offline-First Mobile Sync

By Codcompass Team·2026-05-22·7 min read

Architecting Deterministic Sync: A Production Guide to Conflict-Free Data Types

Current Situation Analysis

Building offline-capable mobile applications has historically forced engineering teams into a binary choice: either accept data loss when connectivity drops, or implement complex server-authoritative sync layers that require conflict resolution dialogs. The latter approach introduces significant UX friction, increases backend complexity, and often fails under real-world network partition scenarios. Teams routinely underestimate the engineering overhead of building custom merge logic, only to discover that handling concurrent edits across distributed devices requires solving distributed systems problems they never intended to tackle.

This problem is frequently overlooked because most architecture guides treat offline sync as a caching layer rather than a distributed state problem. Developers assume that CRDTs (Conflict-free Replicated Data Types) are either too memory-intensive for mobile hardware or exclusively designed for collaborative text editing. In reality, modern CRDT implementations have matured into hybrid systems that combine operation-based efficiency with state-based simplicity. They guarantee mathematical convergence regardless of operation order, eliminating the need for manual conflict resolution entirely.

Production telemetry from teams that have migrated to local-first architectures reveals a consistent pattern: once the initial sync protocol is established, the cost of maintaining conflict-free state drops below the cost of debugging server-side merge failures. Benchmarks across standard mobile hardware show that handling 10,000 concurrent operations requires only 2–12 MB of heap memory, with cold load times remaining under 200 ms. Sync payloads for typical document states average 2–4 KB, even when multiple devices operate offline for extended periods. The engineering barrier has shifted from theoretical feasibility to practical library selection and compaction strategy.

WOW Moment: Key Findings

The decisive factor in CRDT adoption is not raw performance, but alignment between data topology and merge granularity. The following comparison isolates the operational characteristics that directly impact mobile deployment:

Library	Heap Memory (10K Ops)	Cold Load Time	Sync Payload Mechanism	GC Pressure
Automerge 2.x	8–12 MB	120–200 ms	Bloom filter diff exchange	Medium
Yjs	3–5 MB	40–80 ms	State vector (`clientId: clock`) pairs	Low
cr-sqlite	2–4 MB	20–50 ms	Row-level version clocks	Minimal

This data matters because it decouples library selection from marketing claims and anchors it to device constraints and data shape. cr-sqlite achieves the lowest memory footprint by delegating page caching to SQLite's native storage engine, keeping the JavaScript/Kotlin heap unburdened. Yjs minimizes GC pressure through its shared type system, making it ideal for high-frequency collaborative editing. Automerge provides the deepest document model for nested structures, at the cost of higher heap allocation and FFI boundary crossings. Understanding these trade-offs allows architects to predict sync behavior under memory-constrained conditions and avoid runtime degradation on low-end devices.

Core Solution

Implementing deterministic sync requires three architectural decisions: causal ordering, data shape mapping, and history management. Each decision directly impacts sync latency, memory consumption, and merge correctness.

Step 1: Establish Causal Ordering with Hybrid Logical Clocks

CRDTs require a deterministic way to order concurrent operations without relying on synchronized physical clocks. Hybrid Logical Clocks (HLC) solve this by combining wall-clock time with a logical counter and a node identifier. When two devices modify the same field concurrently, neither write causally precedes the other, and their timestamps can tie on both wall time and counter. The CRDT merge function then applies a deterministic tiebreaker, typically favoring the highest node identifier, ensuring identical state reconstruction across all replicas.

interface HybridLogicalClock {
  wallTime: number;
  logicalTick: number;
  nodeId: string;
}

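function compareHLC(a: HybridLogicalClock, b: HybridLogicalClock): number {
  if (a.wallTime !== b.wallTime) return a.wallTime - b.wallTime;
  if (a.logicalTick !== b.logicalTick) return a.logicalTick - b.logicalTick;
  // Compare by code unit, not localeCompare: collation can differ between device locales.
  return a.nodeId < b.nodeId ? -1 : a.nodeId > b.nodeId ? 1 : 0;
}

function advanceHLC(
  current: HybridLogicalClock,
  received: HybridLogicalClock,
  physicalNow: number = Date.now()
): HybridLogicalClock {
  const maxWall = Math.max(current.wallTime, received.wallTime, physicalNow);
  let newTick = 0; // physical time is ahead of both clocks: restart the counter
  if (maxWall === current.wallTime && maxWall === received.wallTime) {
    newTick = Math.max(current.logicalTick, received.logicalTick) + 1;
  } else if (maxWall === current.wallTime) {
    newTick = current.logicalTick + 1;
  } else if (maxWall === received.wallTime) {
    // Step past the sender's counter so the result orders after the received event.
    newTick = received.logicalTick + 1;
  }

  return { wallTime: maxWall, logicalTick: newTick, nodeId: current.nodeId };
}

This implementation avoids Lamport clock ambiguity by anchoring logical progression to physical time while preserving causality. The advanceHLC function takes the maximum of the local, received, and physical wall times. The logical counter resets to zero only when physical time has moved past both clocks, which prevents unbounded tick inflation during network partitions. In every other case the counter increments past whichever clock holds the winning wall time, so the merged timestamp always orders after the message that produced it.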
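advanceHLC covers the receive path only. A device also needs a timestamp for each mutation it makes locally. The following is a minimal sketch of that side, assuming a hypothetical `tickHLC` helper that is not part of the original code and a `physicalNow` parameter added so the behavior can be checked without the real clock:

```typescript
interface HybridLogicalClock {
  wallTime: number;
  logicalTick: number;
  nodeId: string;
}

// Hypothetical helper: stamps a local mutation before it is applied or sent.
function tickHLC(
  current: HybridLogicalClock,
  physicalNow: number = Date.now()
): HybridLogicalClock {
  if (physicalNow > current.wallTime) {
    // Physical time moved forward: adopt it and restart the logical counter.
    return { wallTime: physicalNow, logicalTick: 0, nodeId: current.nodeId };
  }
  // Physical time stalled or went backwards: keep wallTime, bump the counter.
  return {
    wallTime: current.wallTime,
    logicalTick: current.logicalTick + 1,
    nodeId: current.nodeId,
  };
}

const start: HybridLogicalClock = { wallTime: 1000, logicalTick: 0, nodeId: "device-a" };
const sameMs = tickHLC(start, 1000);   // clock has not advanced since the last event
const laterMs = tickHLC(sameMs, 1005); // clock advanced by 5 ms
```

Two mutations inside the same millisecond stay ordered through the counter, and the counter drops back to zero as soon as the physical clock moves on.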

Step 2: Map Data Topology to Merge Granularity

Selecting a CRDT library requires matching your application's data structure to the library's merge semantics. Forcing a relational schema into a document CRDT, or nesting deeply within a row-based store, introduces unnecessary serialization overhead and breaks merge guarantees.

Document-heavy schemas (nested maps, arrays, rich text) align with Automerge's per-character and per-field merge model. The Rust core provides strong eventual consistency guarantees for complex object graphs.
Collaborative editing workflows benefit from Yjs's shared type system (Y.Map, Y.Array, Y.Text). Its state vector sync protocol minimizes payload size and GC pressure.
Relational or tabular data should use cr-sqlite, which extends SQLite with CRDT columns. Merge occurs at the row and column level, preserving foreign key integrity and enabling standard SQL queries.
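To make merge granularity concrete, here is a small sketch of column-level merging for a single row. It illustrates the idea only and is not how cr-sqlite is implemented; the `VersionedCell` shape, the per-column `version` counter, and the `mergeRow` function are all assumptions made for this example:

```typescript
// One cell of a row: the value plus the version at which it was last written.
interface VersionedCell {
  value: string | number | null;
  version: number; // per-column counter, assumed to increase on every write
  nodeId: string;  // writer identity, used only to break version ties
}

type VersionedRow = Record<string, VersionedCell>;

// Returns true when `a` should replace `b`.
function cellWins(a: VersionedCell, b: VersionedCell): boolean {
  if (a.version !== b.version) return a.version > b.version;
  return a.nodeId > b.nodeId; // code-unit comparison, identical on every device
}

// Merges two replicas of the same row, one column at a time.
function mergeRow(local: VersionedRow, remote: VersionedRow): VersionedRow {
  const merged: VersionedRow = { ...local };
  for (const [column, remoteCell] of Object.entries(remote)) {
    const localCell = merged[column];
    if (localCell === undefined || cellWins(remoteCell, localCell)) {
      merged[column] = remoteCell;
    }
  }
  return merged;
}

const phone: VersionedRow = {
  title: { value: "Groceries", version: 2, nodeId: "phone" },
  done: { value: 0, version: 1, nodeId: "phone" },
};
const tablet: VersionedRow = {
  title: { value: "Shopping", version: 1, nodeId: "tablet" },
  done: { value: 1, version: 2, nodeId: "tablet" },
};
const mergedOnPhone = mergeRow(phone, tablet);
const mergedOnTablet = mergeRow(tablet, phone);
```

Because each column is decided independently, the phone's newer title and the tablet's newer `done` flag both survive. A row-level or whole-document merge would have discarded one of them.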

Step 3: Implement Sync Protocol & Compaction Strategy

Sync protocols must balance payload size with round-trip efficiency. Yjs exchanges state vectors, allowing peers to calculate missing operations in a single request. Automerge uses Bloom filters to identify divergent change sets, which increases initial payload size but reduces round trips for large offline batches. cr-sqlite ships row- and column-level changes tagged with version clocks, so a peer requests only the changes made after the version it last received.
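The state vector idea can be sketched in a few lines. This is a simplified model and not the Yjs wire format or API; `LoggedOp`, `stateVectorOf`, and `missingFor` are hypothetical names introduced for illustration:

```typescript
// A logged operation, identified by the client that produced it and that
// client's own sequence number (its "clock").
interface LoggedOp {
  clientId: number;
  clock: number;
  payload: string;
}

// clientId -> number of operations already seen from that client.
type StateVector = Map<number, number>;

// Summarizes a log as "how far have I seen each client".
function stateVectorOf(log: LoggedOp[]): StateVector {
  const sv: StateVector = new Map();
  for (const op of log) {
    sv.set(op.clientId, Math.max(sv.get(op.clientId) ?? 0, op.clock + 1));
  }
  return sv;
}

// Everything in `log` that the remote peer has not seen yet.
function missingFor(remote: StateVector, log: LoggedOp[]): LoggedOp[] {
  return log.filter((op) => op.clock >= (remote.get(op.clientId) ?? 0));
}

const localLog: LoggedOp[] = [
  { clientId: 1, clock: 0, payload: "insert 'a'" },
  { clientId: 1, clock: 1, payload: "insert 'b'" },
  { clientId: 2, clock: 0, payload: "insert 'x'" },
];
// The remote peer reports: one op seen from client 1, none from client 2.
const remoteVector: StateVector = new Map([[1, 1]]);
const toSend = missingFor(remoteVector, localLog);
```

The vector grows with the number of clients rather than the number of edits, which is why the request stays small even after a long offline period.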

History accumulation is the primary failure mode in production. CRDTs trade storage for conflict freedom, meaning operation logs grow indefinitely without intervention. Compaction must be scheduled proactively:

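interface CompactionConfig {
  maxHistoryOps: number;
  snapshotIntervalMs: number;
  strategy: 'snapshot' | 'prune' | 'clone';
}

// Hypothetical adapter over the chosen library. These method names are
// illustrative and are not the actual APIs of Yjs, Automerge, or cr-sqlite.
interface CompactableDoc {
  getOperationCount(): Promise<number>;
  encodeStateAsUpdate(): Promise<Uint8Array>;
  pruneVersionsBefore(cutoffMs: number): Promise<void>;
  cloneAndStripHistory(): Promise<Uint8Array>;
}

async function scheduleCompaction(
  config: CompactionConfig,
  doc: CompactableDoc,
  persist: (compacted: Uint8Array) => Promise<void>
): Promise<void> {
  const opCount = await doc.getOperationCount();
  if (opCount <= config.maxHistoryOps) return;

  switch (config.strategy) {
    case 'snapshot':
      // Store the encoded state; encoding without persisting frees nothing.
      await persist(await doc.encodeStateAsUpdate());
      break;
    case 'prune':
      // The config value is an interval, so convert it to an absolute cutoff.
      await doc.pruneVersionsBefore(Date.now() - config.snapshotIntervalMs);
      break;
    case 'clone':
      await persist(await doc.cloneAndStripHistory());
      break;
  }
}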

This pattern decouples compaction from sync events, preventing UI thread blocking during high-frequency edits. The strategy selection depends on library capabilities: Yjs favors snapshot encoding, cr-sqlite uses version pruning, and Automerge relies on history-stripped clones.

Pitfall Guide

1. Unbounded History Accumulation

Explanation: CRDT libraries that keep an operation log retain past operations so that late-arriving changes still merge correctly. Without compaction, memory usage scales linearly with edit frequency, causing OOM crashes on devices with limited heap space. Fix: Implement periodic compaction triggered by operation count or time thresholds. Never rely on manual user actions to clear history.
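A trigger that combines both thresholds might look like the sketch below. The `CompactionThresholds` type and `shouldCompact` function are hypothetical names for this example, and the numbers are placeholders to tune per application:

```typescript
interface CompactionThresholds {
  maxHistoryOps: number; // compact once the log exceeds this many operations
  maxIntervalMs: number; // or once this much time has passed since the last run
}

// Pure decision function: easy to call from a sync hook or a background task.
function shouldCompact(
  opCount: number,
  lastCompactedAtMs: number,
  nowMs: number,
  thresholds: CompactionThresholds
): boolean {
  if (opCount === 0) return false; // nothing to compact
  const tooManyOps = opCount > thresholds.maxHistoryOps;
  const tooLongAgo = nowMs - lastCompactedAtMs >= thresholds.maxIntervalMs;
  return tooManyOps || tooLongAgo;
}

const thresholds: CompactionThresholds = { maxHistoryOps: 10_000, maxIntervalMs: 24 * 60 * 60 * 1000 };
const byCount = shouldCompact(12_000, 0, 1_000, thresholds);                // many ops, recent run
const byAge = shouldCompact(50, 0, 2 * 24 * 60 * 60 * 1000, thresholds);    // few ops, stale run
const notYet = shouldCompact(50, 0, 1_000, thresholds);                     // few ops, recent run
```

Keeping the decision separate from the compaction itself makes the thresholds testable without touching the document store.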

2. Mismatching Data Topology

Explanation: Storing relational data in Automerge or nested documents in cr-sqlite forces expensive serialization/deserialization cycles and breaks merge semantics. Fix: Audit your data model before library selection. Use cr-sqlite for tabular/relational schemas, Automerge for nested documents, and Yjs for collaborative text/state.

3. Ignoring FFI Boundary Costs

Explanation: Automerge's Rust core delivers strong eventual consistency but incurs measurable latency when crossing the FFI boundary on Android or iOS. Cold loads and frequent state queries amplify this overhead. Fix: Batch FFI calls, cache frequently accessed state in native memory, and avoid synchronous state reads on the main thread. Profile FFI transitions during load testing.

4. Naive Sync Loop Design

Explanation: Syncing on every keystroke or UI event floods the network with micro-payloads, increasing battery drain and server load. Fix: Implement debounced sync intervals (200–500 ms) combined with payload coalescing. Queue local operations and flush them in batches when connectivity stabilizes.
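One way to sketch a debounced, coalescing flush queue is shown below. `DebouncedSyncQueue` and `PendingOp` are hypothetical names, and coalescing by key is only safe for register-style writes where the latest value replaces earlier ones. Sequence operations such as text inserts must all be kept.

```typescript
interface PendingOp {
  key: string;    // field or record the write targets
  value: unknown;
}

class DebouncedSyncQueue {
  private pending = new Map<string, PendingOp>();
  private timer: ReturnType<typeof setTimeout> | undefined;
  private send: (batch: PendingOp[]) => void;
  private debounceMs: number;

  constructor(send: (batch: PendingOp[]) => void, debounceMs: number = 300) {
    this.send = send;
    this.debounceMs = debounceMs;
  }

  enqueue(op: PendingOp): void {
    // Coalesce: a later write to the same key replaces the queued one.
    this.pending.set(op.key, op);
    // Restart the debounce window on every new write.
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.debounceMs);
  }

  // Sends everything queued so far as a single batch.
  flush(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = undefined;
    if (this.pending.size === 0) return;
    const batch = Array.from(this.pending.values());
    this.pending.clear();
    this.send(batch);
  }
}

const sent: PendingOp[][] = [];
const queue = new DebouncedSyncQueue((batch) => { sent.push(batch); }, 300);
queue.enqueue({ key: "title", value: "G" });
queue.enqueue({ key: "title", value: "Gr" });
queue.enqueue({ key: "done", value: true });
queue.flush(); // e.g. called when connectivity returns or the app backgrounds
```

Three writes leave the device as one batch of two operations, because the second `title` write replaced the first while it was still queued.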

5. Clock Skew & HLC Drift

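Explanation: HLCs assume approximate wall-clock alignment. They keep causal order even under skew, but a device whose clock runs far ahead stamps its writes in the future: those writes win every timestamp comparison, and each peer that receives them advances its own clock to match. Fix: Validate wall time against a trusted NTP source during sync initialization. Reject or flag a device whose current clock deviates by more than ±5 seconds, and treat any incoming HLC whose wall time is more than 5 seconds ahead of trusted time as suspect. Timestamps from an offline period are legitimately in the past and should not be rejected for that reason.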
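A drift check at sync initialization could be sketched as follows. How the trusted time is obtained (NTP, a server response header, or something else) is left out; `checkDeviceClock` and `isFutureDated` are hypothetical helpers, and the 5000 ms default mirrors the ±5 second figure above:

```typescript
interface ClockCheck {
  skewMs: number; // positive when the device clock runs ahead of trusted time
  ok: boolean;
}

// Compares the device's current clock reading against a trusted reading.
function checkDeviceClock(
  deviceNowMs: number,
  trustedNowMs: number,
  maxClockDriftMs: number = 5000
): ClockCheck {
  const skewMs = deviceNowMs - trustedNowMs;
  return { skewMs, ok: Math.abs(skewMs) <= maxClockDriftMs };
}

// Flags an incoming HLC wall time that lies too far in the future.
// Past timestamps are expected after offline periods and are not flagged.
function isFutureDated(
  hlcWallTimeMs: number,
  trustedNowMs: number,
  maxClockDriftMs: number = 5000
): boolean {
  return hlcWallTimeMs - trustedNowMs > maxClockDriftMs;
}

const trustedNow = 1_700_000_000_000;
const healthy = checkDeviceClock(trustedNow + 1_200, trustedNow);
const skewed = checkDeviceClock(trustedNow + 60_000, trustedNow);
const offlineEdit = isFutureDated(trustedNow - 48 * 60 * 60 * 1000, trustedNow);
const futureEdit = isFutureDated(trustedNow + 60_000, trustedNow);
```

Separating the device check from the per-timestamp check keeps legitimate offline edits flowing while still catching a clock that has jumped ahead.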

6. Overlooking Network Partition Recovery

Explanation: Teams often test sync under stable connectivity but fail to simulate extended offline periods. Bloom filters and state vectors can become stale, causing full state retransmission. Fix: Implement partition-aware sync detection. When reconnection occurs, exchange lightweight digests first, then request only divergent ranges.

7. Assuming Deterministic Tiebreakers Are User-Friendly

Explanation: While highest-nodeId wins guarantees convergence, it may overwrite user intent without feedback. Fix: Log merge decisions to analytics. Provide optional conflict audit trails in developer mode. Never expose raw tiebreaker logic to end users.
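One way to make tiebreaker outcomes observable is to have the merge return a decision record alongside the winner. The sketch below assumes a last-writer-wins register stamped with HLC fields; `Stamped`, `MergeDecision`, and `mergeWithAudit` are hypothetical names, and where the record is sent (analytics, a debug log) is left to the application:

```typescript
interface Stamped<T> {
  value: T;
  wallTime: number;
  logicalTick: number;
  nodeId: string;
}

interface MergeDecision<T> {
  winner: Stamped<T>;
  loser: Stamped<T>;
  // Which component of the timestamp settled the comparison.
  decidedBy: "wallTime" | "logicalTick" | "nodeId" | "identical";
}

function mergeWithAudit<T>(a: Stamped<T>, b: Stamped<T>): MergeDecision<T> {
  if (a.wallTime !== b.wallTime) {
    return a.wallTime > b.wallTime
      ? { winner: a, loser: b, decidedBy: "wallTime" }
      : { winner: b, loser: a, decidedBy: "wallTime" };
  }
  if (a.logicalTick !== b.logicalTick) {
    return a.logicalTick > b.logicalTick
      ? { winner: a, loser: b, decidedBy: "logicalTick" }
      : { winner: b, loser: a, decidedBy: "logicalTick" };
  }
  if (a.nodeId === b.nodeId) {
    return { winner: a, loser: b, decidedBy: "identical" };
  }
  // Pure tiebreak: neither write is newer, so the higher nodeId wins.
  return a.nodeId > b.nodeId
    ? { winner: a, loser: b, decidedBy: "nodeId" }
    : { winner: b, loser: a, decidedBy: "nodeId" };
}

const fromA: Stamped<string> = { value: "Lunch at 12", wallTime: 500, logicalTick: 3, nodeId: "device-a" };
const fromB: Stamped<string> = { value: "Lunch at 1", wallTime: 500, logicalTick: 3, nodeId: "device-b" };
const decision = mergeWithAudit(fromA, fromB);
```

Counting how often `decidedBy` equals `"nodeId"` shows how frequently user edits are being resolved by an arbitrary rule rather than by recency.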

Production Bundle

Action Checklist

Audit data topology: Map each entity to document, collaborative, or relational shape
Select CRDT library based on merge granularity, not feature lists
Implement HLC with wall time validation and logical tick reset logic
Design sync protocol with debounced batching and payload coalescing
Schedule compaction based on operation count thresholds, not time alone
Profile FFI boundary crossings and cache hot state in native memory
Simulate 48-hour network partitions during QA to validate convergence
Instrument merge tiebreakers for analytics and debugging

Decision Matrix

Scenario	Recommended Approach	Why	Cost Impact
Nested user profiles with arrays/maps	Automerge 2.x	Per-field merge preserves object graph integrity	Higher heap, moderate FFI cost
Real-time collaborative editing	Yjs	State vector sync minimizes payload and GC pressure	Low memory, JS-native performance
Tabular data with foreign keys	cr-sqlite	Row-level CRDT columns maintain relational integrity	Minimal heap, SQLite page cache efficiency
Low-end Android devices (<2GB RAM)	cr-sqlite or Yjs	Off-heap storage and low GC pressure prevent OOM	Reduced infrastructure monitoring
High-frequency write workloads	Yjs	Optimized shared types handle rapid mutations efficiently	Lower sync bandwidth consumption

Configuration Template

interface SyncArchitectureConfig {
  crdtLibrary: 'automerge' | 'yjs' | 'crsqlite';
  hlc: {
    ntpSyncIntervalMs: number;
    maxClockDriftMs: number;
  };
  sync: {
    debounceMs: number;
    maxBatchSize: number;
    partitionDetectionMs: number;
  };
  compaction: {
    maxHistoryOps: number;
    strategy: 'snapshot' | 'prune' | 'clone';
    runOnForeground: boolean;
  };
}

const productionConfig: SyncArchitectureConfig = {
  crdtLibrary: 'yjs',
  hlc: {
    ntpSyncIntervalMs: 3600000,
    maxClockDriftMs: 5000
  },
  sync: {
    debounceMs: 300,
    maxBatchSize: 50,
    partitionDetectionMs: 30000
  },
  compaction: {
    maxHistoryOps: 10000,
    strategy: 'snapshot',
    runOnForeground: true
  }
};

Quick Start Guide

Initialize the CRDT instance with your selected library and attach a Hybrid Logical Clock to every mutation. Ensure each device registers a unique, persistent nodeId.
Wire the sync protocol using your library's native diff mechanism. Implement a debounced flush queue that coalesces local operations before network transmission.
Configure compaction to trigger when operation count exceeds your threshold. Schedule it during low-activity periods or foreground transitions to avoid UI jank.
Validate convergence by simulating concurrent edits across two emulators with network isolation. Verify that both devices reach identical state after reconnection without manual intervention.
Instrument observability by logging merge decisions, sync payload sizes, and compaction frequency. Use these metrics to tune debounce intervals and history thresholds before production rollout.
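Before involving emulators, the convergence property can be checked in a plain unit test. The sketch below uses a toy last-writer-wins map rather than any of the libraries discussed; `FieldWrite`, `applyWrite`, and `snapshot` are hypothetical names. It delivers the same writes to two replicas in opposite orders and compares the results:

```typescript
interface FieldWrite {
  field: string;
  value: string;
  wallTime: number;
  nodeId: string;
}

type Replica = Map<string, FieldWrite>;

// Applies one write using last-writer-wins with a nodeId tiebreak.
function applyWrite(replica: Replica, write: FieldWrite): void {
  const existing = replica.get(write.field);
  const wins =
    existing === undefined ||
    write.wallTime > existing.wallTime ||
    (write.wallTime === existing.wallTime && write.nodeId > existing.nodeId);
  if (wins) replica.set(write.field, write);
}

// Canonical form of a replica: sorted by field so insertion order is irrelevant.
function snapshot(replica: Replica): string {
  const entries = Array.from(replica.entries()).sort(([x], [y]) =>
    x < y ? -1 : x > y ? 1 : 0
  );
  return JSON.stringify(entries.map(([field, w]) => [field, w.value]));
}

const writes: FieldWrite[] = [
  { field: "title", value: "Draft", wallTime: 100, nodeId: "a" },
  { field: "title", value: "Final", wallTime: 100, nodeId: "b" },
  { field: "body", value: "Hello", wallTime: 90, nodeId: "b" },
  { field: "body", value: "Hi", wallTime: 95, nodeId: "a" },
];

const deviceA: Replica = new Map();
const deviceB: Replica = new Map();
writes.forEach((w) => applyWrite(deviceA, w));
writes.slice().reverse().forEach((w) => applyWrite(deviceB, w)); // opposite delivery order
const converged = snapshot(deviceA) === snapshot(deviceB);
```

The same pattern extends to randomized delivery orders, which tends to surface merge bugs earlier than two-device manual testing.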
