← System Design Simulator

CRDTs

By Rahul Kumar · Senior Software Engineer · Updated · Category: Kleppmann · Designing Data-Intensive Applications

G-Counter, PN-Counter, OR-Set merging across replicas. Ch 9.

This interactive explanation is built for system design interview prep: step through CRDTs, watch the internal state change, and connect the concept to real distributed-system trade-offs.

Overview

Conflict-free Replicated Data Types (CRDTs) are data structures designed so that replicas can accept concurrent updates without coordination and still converge to the same state once they exchange messages. They solve the nightmare of multi-writer replication: two nodes accept the same logical operation at the same time, and the system must reconcile them without losing either. Traditional last-write-wins loses data; vector clocks detect conflicts but leave resolution to the application. CRDTs sidestep the problem by designing operations that are commutative, associative, and idempotent, so the merge function is mathematically guaranteed to produce the same result regardless of order. The two main families are state-based (CvRDTs), which ship full state and merge with a lattice join, and operation-based (CmRDTs), which ship operations over a reliable broadcast. Riak, Redis, Automerge, and every collaborative editor you have used ship CRDTs under the hood.

CRDTs — Interactive Simulator

Runs fully client-side in your browser; no sign-up. Or open full screen →

Launch the interactive CRDTs widget — step through the algorithm or protocol and observe the internal state updating in real time.

How it works

The core invariant is that merge must be a join over a semi-lattice: commutative (A merge B = B merge A), associative ((A merge B) merge C = A merge (B merge C)), and idempotent (A merge A = A). A G-Counter (grow-only) achieves this by giving each replica its own counter slot; increments only touch the local slot, and merge takes the element-wise maximum. Since max is commutative, associative, and idempotent, the lattice works. PN-Counters combine two G-Counters, one for increments and one for decrements, and the value is their difference. Sets are harder because removal must not be undone by a concurrent add. OR-Sets (Observed-Remove) tag every add with a unique identifier; a remove only removes the specific tagged instances it has observed, so a concurrent add with a fresh tag survives. The cost is tombstones: removed tags must be remembered forever, or at least until all replicas have seen the remove, otherwise a late-arriving concurrent add would "come back from the dead." In practice CRDTs pair well with anti-entropy protocols (Merkle trees, gossip) that periodically synchronize replicas. They don't give you linearizability and they don't give you transactions; what they give you is partition-tolerance plus eventual convergence without conflict callbacks.

Implementation

GCounter: grow-only counter with per-node slots
import java.util.*;

/** Grow-only counter. Increments are local; merge is per-node max. */
public final class GCounter {
    private final String nodeId;
    private final Map<String, Long> counts = new HashMap<>();

    public GCounter(String nodeId) {
        this.nodeId = Objects.requireNonNull(nodeId);
        counts.put(nodeId, 0L);
    }

    /** Increment by a positive amount. Only the local slot is touched. */
    public void increment(long delta) {
        if (delta < 0) throw new IllegalArgumentException("use PNCounter for decrements");
        counts.merge(nodeId, delta, Long::sum);
    }

    /** Value is the sum across all node slots. */
    public long value() {
        long total = 0;
        for (long v : counts.values()) total += v;
        return total;
    }

    /** Merge: element-wise max. Commutative, associative, idempotent. */
    public void merge(GCounter other) {
        for (var e : other.counts.entrySet()) {
            counts.merge(e.getKey(), e.getValue(), Math::max);
        }
    }

    public Map<String, Long> snapshot() { return Map.copyOf(counts); }
}
PNCounter: increments and decrements via two GCounters
/** Positive-Negative counter: two GCounters, one for ++ and one for --. */
public final class PNCounter {
    private final GCounter positive;
    private final GCounter negative;

    public PNCounter(String nodeId) {
        this.positive = new GCounter(nodeId);
        this.negative = new GCounter(nodeId);
    }

    public void increment(long delta) {
        if (delta < 0) { decrement(-delta); return; }
        positive.increment(delta);
    }

    public void decrement(long delta) {
        if (delta < 0) throw new IllegalArgumentException("delta must be non-negative");
        negative.increment(delta);
    }

    public long value() {
        return positive.value() - negative.value();
    }

    /** Merge merges both halves; still a lattice join. */
    public void merge(PNCounter other) {
        positive.merge(other.positive);
        negative.merge(other.negative);
    }
}
ORSet: Observed-Remove Set with unique add tags
import java.util.*;

/** Observed-Remove Set. Adds are tagged with UUIDs; removes carry the tags
 *  they have observed, so a concurrent add with a fresh tag survives. */
public final class ORSet<T> {
    private final Map<T, Set<UUID>> adds = new HashMap<>();    // element -> live add tags
    private final Map<T, Set<UUID>> tombstones = new HashMap<>(); // element -> removed tags

    /** Add returns the tag used so the caller can later target this specific add. */
    public UUID add(T element) {
        UUID tag = UUID.randomUUID();
        adds.computeIfAbsent(element, k -> new HashSet<>()).add(tag);
        return tag;
    }

    /** Remove all add-tags for element we have currently observed. */
    public void remove(T element) {
        Set<UUID> observed = adds.getOrDefault(element, Set.of());
        if (observed.isEmpty()) return;
        tombstones.computeIfAbsent(element, k -> new HashSet<>()).addAll(observed);
    }

    /** Element is present iff it has at least one add tag not covered by a tombstone. */
    public boolean contains(T element) {
        Set<UUID> live = new HashSet<>(adds.getOrDefault(element, Set.of()));
        live.removeAll(tombstones.getOrDefault(element, Set.of()));
        return !live.isEmpty();
    }

    /** Merge unions add-sets and tombstone-sets per element. */
    public void merge(ORSet<T> other) {
        for (var e : other.adds.entrySet()) {
            adds.computeIfAbsent(e.getKey(), k -> new HashSet<>()).addAll(e.getValue());
        }
        for (var e : other.tombstones.entrySet()) {
            tombstones.computeIfAbsent(e.getKey(), k -> new HashSet<>()).addAll(e.getValue());
        }
    }

    public Set<T> elements() {
        Set<T> out = new HashSet<>();
        for (T e : adds.keySet()) if (contains(e)) out.add(e);
        return out;
    }
}

Complexity

Key design decisions & trade-offs

Common pitfalls

Interview follow-ups

Recommended reading

Related