← System Design Simulator

Two-Phase Commit

By Rahul Kumar · Senior Software Engineer · Updated · Category: Petrov · Distributed Systems (Part II)

Prepare/commit + coordinator crash blocking. Ch 13.

This interactive explanation is built for system design interview prep: step through Two-Phase Commit, watch the internal state change, and connect the concept to real distributed-system trade-offs.

Overview

Two-phase commit (2PC) is the classical protocol for atomically committing a transaction across multiple independent participants: every participant either commits or every participant aborts, even though they sit behind their own locks and logs. A coordinator drives the transaction through two rounds. In Phase 1 it asks every participant "can you commit?" and each participant either votes YES (and durably promises it will commit if asked) or votes NO. In Phase 2 the coordinator tallies the votes and broadcasts the decision: COMMIT if every vote was YES, ABORT otherwise. The protocol is the default in XA, classical distributed databases, and many microservice saga implementations, and it is the honest answer whenever you need cross-service atomicity. It is also famously the protocol that blocks forever if the coordinator dies at the worst possible moment, which is why it is the gateway drug to three-phase commit, Paxos Commit, and Raft-backed transaction managers.

Two-Phase Commit — Interactive Simulator

Runs fully client-side in your browser; no sign-up. Or open full screen →

Launch the interactive Two-Phase Commit widget — step through the algorithm or protocol and observe the internal state updating in real time.

How it works

A transaction arrives at the coordinator with a set of participants already enlisted (databases, queues, remote services). The coordinator assigns a transaction id, durably logs BEGIN, and enters Phase 1 by sending PREPARE to every participant. A participant that receives PREPARE executes the transaction locally up to the point of commit, acquires all necessary locks, writes an undo/redo record to its WAL, forces it to disk, and only then replies YES; this is the contract that makes its vote binding. A participant that cannot commit (constraint violation, resource exhaustion) writes ABORT locally and replies NO. The coordinator waits for all votes. If any vote is NO (or any participant times out), it logs ABORT and sends ABORT to everyone. If every vote is YES, it logs COMMIT - this log record is the point of no return - and sends COMMIT to everyone. Participants apply the durable outcome, release their locks, and ack. The coordinator clears the transaction log when all acks return. The protocol is blocking because a participant that has voted YES and then lost contact with the coordinator cannot unilaterally decide: it must keep locks held and wait for the coordinator to recover and tell it the fate. In practice, production 2PC pairs the coordinator with heartbeats, a replicated coordinator log, and a participant-to-participant "what did you hear?" recovery path to limit how long a crashed coordinator can freeze the cluster.

Implementation

Coordinator driving PREPARE and COMMIT rounds
public class Coordinator {
    private final List<ParticipantClient> participants;
    private final TransactionLog log;
    private static final long VOTE_TIMEOUT_MS = 2_000;

    public Coordinator(List<ParticipantClient> participants, TransactionLog log) {
        this.participants = participants; this.log = log;
    }

    public boolean commit(String txId) {
        log.append(txId, "BEGIN");
        // Phase 1: PREPARE
        List<CompletableFuture<Boolean>> votes = new ArrayList<>();
        for (ParticipantClient p : participants) {
            votes.add(CompletableFuture.supplyAsync(() -> p.prepare(txId))
                    .orTimeout(VOTE_TIMEOUT_MS, TimeUnit.MILLISECONDS)
                    .exceptionally(e -> false));
        }
        boolean allYes = votes.stream().map(CompletableFuture::join)
                .reduce(true, Boolean::logicalAnd);

        // Phase 2: decide and broadcast
        if (allYes) {
            log.append(txId, "COMMIT"); // point of no return
            for (ParticipantClient p : participants) p.commit(txId);
            log.append(txId, "DONE");
            return true;
        } else {
            log.append(txId, "ABORT");
            for (ParticipantClient p : participants) p.abort(txId);
            log.append(txId, "DONE");
            return false;
        }
    }

    /** Called on coordinator restart to finish any in-flight transactions. */
    public void recover() {
        for (TransactionLog.Record r : log.unfinished()) {
            String decision = r.lastEntry();
            for (ParticipantClient p : participants) {
                if ("COMMIT".equals(decision)) p.commit(r.txId());
                else p.abort(r.txId());
            }
            log.append(r.txId(), "DONE");
        }
    }
}
Participant with onPrepare / onCommit / onAbort
public class Participant {
    public enum State { IDLE, PREPARED, COMMITTED, ABORTED }
    private final ParticipantLog log;
    private final LockManager locks;
    private final TxExecutor executor;
    private final Map<String, State> txStates = new ConcurrentHashMap<>();

    public Participant(ParticipantLog log, LockManager locks, TxExecutor executor) {
        this.log = log; this.locks = locks; this.executor = executor;
    }

    public synchronized boolean onPrepare(String txId, List<Op> ops) {
        if (txStates.containsKey(txId)) return txStates.get(txId) == State.PREPARED;
        try {
            locks.acquireAll(txId, ops);
            executor.stage(txId, ops);      // materialise undo/redo records
            log.forceSync(txId, "PREPARED"); // durable before we vote YES
            txStates.put(txId, State.PREPARED);
            return true;
        } catch (Exception e) {
            locks.releaseAll(txId);
            log.append(txId, "ABORTED");
            txStates.put(txId, State.ABORTED);
            return false;
        }
    }

    public synchronized void onCommit(String txId) {
        if (txStates.get(txId) != State.PREPARED) return; // idempotent reapply
        log.forceSync(txId, "COMMIT");
        executor.applyStaged(txId);
        locks.releaseAll(txId);
        txStates.put(txId, State.COMMITTED);
    }

    public synchronized void onAbort(String txId) {
        log.append(txId, "ABORT");
        executor.discardStaged(txId);
        locks.releaseAll(txId);
        txStates.put(txId, State.ABORTED);
    }

    public State status(String txId) {
        return txStates.getOrDefault(txId, State.IDLE);
    }
}

Complexity

Key design decisions & trade-offs

Common pitfalls

Interview follow-ups

Recommended reading

Related