← System Design Simulator

CAP under Partition

By Rahul Kumar · Senior Software Engineer · Updated · Category: System Design Primer · Unique Topics

3-node cluster, partition + heal. CP rejects minority writes; AP diverges.

This interactive explanation is built for system design interview prep: step through CAP under Partition, watch the internal state change, and connect the concept to real distributed-system trade-offs.

Overview

CAP is the theorem that forces every distributed system to pick a side during a network partition. Eric Brewer's 2000 conjecture, formalized by Gilbert and Lynch in 2002, says a system cannot simultaneously guarantee Consistency (every read sees the most recent write), Availability (every request gets a non-error response), and Partition tolerance (the system keeps working when messages between nodes are dropped). Because partitions are a fact of physics, not a design choice, real systems get to pick only between CP (refuse requests on the minority side) and AP (accept writes on both sides, reconcile later). The theorem is often misunderstood as a three-way trade-off; it is really a binary choice made at partition time. When the network is healthy, both CP and AP systems can be consistent and available. CAP only matters once packets start dropping. Understanding CAP is the difference between picking a database that loses data and picking one that refuses to accept it.

CAP under Partition — Interactive Simulator

Runs fully client-side in your browser; no sign-up. Or open full screen →

Launch the interactive CAP under Partition widget — step through the algorithm or protocol and observe the internal state updating in real time.

How it works

Every distributed write needs to reach a quorum of replicas to be considered durable. When the network splits the cluster into two groups that cannot see each other, one of two things must happen to each partition. A CP system (ZooKeeper, etcd, HBase) detects the partition and allows only the majority side to accept writes; the minority side returns errors. This preserves linearizability — every successful write is visible to every subsequent successful read — at the cost of rejecting requests routed to the minority. The detection mechanism is usually a lease or a Raft leader whose heartbeat times out. An AP system (Cassandra, DynamoDB, Riak) allows both sides to accept writes, tagging them with timestamps or vector clocks. When the partition heals, the system reconciles divergent writes using last-write-wins, merge functions, or CRDTs. Every request gets an answer, but two clients reading the same key during the partition may see different values, and reconciliation can drop writes silently if two updates collide. The PACELC extension captures the practical reality: in the absence of a partition (Else), systems also trade Latency for Consistency. A strongly consistent read across three regions pays three cross-region RTTs; an eventually consistent read pays one. Most production systems are tunable — Cassandra lets you pick QUORUM or ONE per request — so the CAP label is really a default, not a prison.

Implementation

CPStrategy: majority-only writes, reject on partition
public class CPStrategy implements KVStore {
    private final List<Replica> replicas;
    private final int quorum;
    private final PartitionDetector detector;

    public CPStrategy(List<Replica> replicas, PartitionDetector detector) {
        this.replicas = replicas;
        this.quorum = replicas.size() / 2 + 1;
        this.detector = detector;
    }

    @Override
    public void put(String key, byte[] value) {
        if (!detector.hasMajorityQuorum()) {
            throw new UnavailableException("Refusing write on minority partition");
        }
        int acks = 0;
        for (Replica r : replicas) {
            try { r.write(key, value); acks++; }
            catch (IOException ignored) {}
            if (acks >= quorum) return;
        }
        throw new UnavailableException("Quorum not reached, acks=" + acks);
    }

    @Override
    public byte[] get(String key) {
        if (!detector.hasMajorityQuorum())
            throw new UnavailableException("Refusing read on minority partition");
        return replicas.get(0).read(key); // leader-only read
    }
}
APStrategy: accept on any reachable replica, reconcile later
public class APStrategy implements KVStore {
    private final List<Replica> replicas;
    private final ConflictResolver resolver; // LWW or CRDT merge

    public APStrategy(List<Replica> replicas, ConflictResolver resolver) {
        this.replicas = replicas; this.resolver = resolver;
    }

    @Override
    public void put(String key, byte[] value) {
        long ts = System.currentTimeMillis();
        for (Replica r : replicas) {
            try { r.writeTimestamped(key, value, ts); return; }
            catch (IOException e) { /* try next */ }
        }
        throw new UnavailableException("No replica reachable");
    }

    @Override
    public byte[] get(String key) {
        List<TimestampedValue> seen = new ArrayList<>();
        for (Replica r : replicas) {
            try { seen.add(r.readTimestamped(key)); }
            catch (IOException ignored) {}
            if (!seen.isEmpty()) break; // serve first reachable
        }
        if (seen.isEmpty()) throw new UnavailableException("All replicas down");
        return resolver.merge(seen).value();
    }
}

Complexity

Key design decisions & trade-offs

Common pitfalls

Interview follow-ups

Recommended reading

Related