← System Design Simulator

Stock Exchange System Design Interview Question

By Rahul Kumar · Senior Software Engineer · Updated · 9 components · 4 operations ·Source: Alex Xu, System Design Interview Vol 2, Chapter 13

Problem: Design a stock exchange with a matching engine that processes limit orders, a market-data publisher that broadcasts book updates at 100K ticks/s, and a batched clearing/settlement pipeline.

Overview

A stock exchange is the hardest latency-sensitive system in common system-design interviews. It is not a CRUD app: the matching engine is a single logical process that must accept millions of orders per second, match them against the book in strict price-time priority, emit fills atomically, and publish a fan-out stream of book deltas to every connected client within microseconds. The interesting tension is that matching is inherently sequential — you cannot parallelize 'match the next incoming order against the best bid' without introducing reordering that changes who gets filled — so the system scales by partitioning on symbol rather than by sharding the book itself. Correctness is as brutal as in a payment system, because a lost fill or a double fill creates positions that don't exist. Latency is simultaneously brutal because HFT firms colocate in the same rack and measure competitive advantage in nanoseconds. The design below sketches a single-symbol matching engine, the price-time priority book that feeds it, and the market-data fan-out that takes the output to the world.

Stock Exchange — Interactive Simulator

Runs fully client-side in your browser; no sign-up. Or open full screen →

Launch the interactive walkthrough for Stock Exchange — animated architecture diagram, step-by-step flow with real payloads, component swap, and a discrete-event stress simulator.

Summary

A low-latency exchange built around a SINGLE-THREADED, in-memory matching engine per symbol. The dominant design choice is determinism over raw throughput: each symbol gets one thread owning one in-memory order book, which makes the match order provably reproducible (critical for audit and dispute resolution) and eliminates lock contention on the hot path. Sharding is by symbol — not thread-per-order — so the exchange scales by adding more symbols to more engines. The market-data publisher fans out book updates via UDP multicast (exchange-native) or Kafka (cloud-native) to thousands of subscribers. Clearing and settlement are intentionally BATCHED (T+1 or T+2) so the hot path stays fast while back-office systems handle the heavyweight legal transfer of ownership.

Requirements

Functional

Non-functional

Capacity Assumptions

Back-of-Envelope Estimates

High-level architecture

The exchange is partitioned by symbol — AAPL and TSLA run on different matching engines, each single-threaded on a pinned CPU core. Orders arrive at gateway servers over binary FIX or a proprietary protocol; the gateway authenticates, checks pre-trade risk (exposure, credit, fat-finger limits), and forwards to a Sequencer. The Sequencer is the single serialization point: it assigns a monotonic seq_no to every order and appends it to a replicated log (Aeron, LBM, or a custom multicast log) before returning. The matching engine for a symbol consumes its partition of the sequenced log, applies each order to an in-memory OrderBook, and emits a stream of events: OrderAccepted, Trade, OrderCanceled, BookDelta. A hot-standby replica consumes the same sequenced log and maintains an identical book; if the primary fails, the standby takes over at the last acknowledged seq_no with no replay ambiguity. Market-data publishers consume the event stream and fan it out over multicast to subscribers — top-of-book goes on a Level-1 feed, full depth changes on a Level-2 feed. Persistence is not on the matching-engine critical path: the sequenced log is the durable record, and end-of-day the engine snapshots the book to disk for compliance and restart. Clock discipline matters — all gateways, sequencers, and engines run PTP-synced within microseconds so regulatory timestamps are trustworthy. Capacity is scaled by adding more symbols per sequencer shard, not by splitting a single symbol's book, because price-time priority across a split book is ill-defined.

Architecture Components (9)

Operations Walked Through (4)

Implementation

OrderBook — price-time priority, bid/ask sides
public final class OrderBook {
  // Buy side: highest price first. Sell side: lowest price first.
  private final TreeMap<Long, PriceLevel> bids = new TreeMap<>(Comparator.reverseOrder());
  private final TreeMap<Long, PriceLevel> asks = new TreeMap<>();
  private final Long2ObjectOpenHashMap<Order> byId = new Long2ObjectOpenHashMap<>();

  public void add(Order o) {
    TreeMap<Long, PriceLevel> side = o.isBuy() ? bids : asks;
    PriceLevel lvl = side.computeIfAbsent(o.priceTicks(), PriceLevel::new);
    lvl.append(o); // FIFO within a price level = time priority
    byId.put(o.id(), o);
  }

  public Order cancel(long orderId) {
    Order o = byId.remove(orderId);
    if (o == null) return null;
    TreeMap<Long, PriceLevel> side = o.isBuy() ? bids : asks;
    PriceLevel lvl = side.get(o.priceTicks());
    lvl.remove(o);
    if (lvl.isEmpty()) side.remove(o.priceTicks());
    return o;
  }

  public Long bestBid() { return bids.isEmpty() ? null : bids.firstKey(); }
  public Long bestAsk() { return asks.isEmpty() ? null : asks.firstKey(); }

  TreeMap<Long, PriceLevel> oppositeSide(boolean isBuy) { return isBuy ? asks : bids; }
}
MatchingEngine — match incoming limit orders
public final class MatchingEngine {
  private final OrderBook book;
  private final EventSink out;

  /** Applies one sequenced order. Single-threaded: caller guarantees ordering. */
  public void onOrder(Order incoming) {
    out.accept(new OrderAccepted(incoming.id(), incoming.seqNo()));
    TreeMap<Long, PriceLevel> opp = book.oppositeSide(incoming.isBuy());
    long remaining = incoming.qty();

    Iterator<Map.Entry<Long, PriceLevel>> it = opp.entrySet().iterator();
    while (remaining > 0 && it.hasNext()) {
      Map.Entry<Long, PriceLevel> e = it.next();
      long bestPrice = e.getKey();
      if (!crosses(incoming, bestPrice)) break; // price no longer crosses
      PriceLevel lvl = e.getValue();
      while (remaining > 0 && !lvl.isEmpty()) {
        Order resting = lvl.head();
        long fillQty = Math.min(remaining, resting.qty());
        out.accept(new Trade(incoming.id(), resting.id(), bestPrice, fillQty, System.nanoTime()));
        resting.reduce(fillQty);
        remaining -= fillQty;
        if (resting.qty() == 0) { lvl.removeHead(); book.forget(resting.id()); }
      }
      if (lvl.isEmpty()) it.remove();
    }
    if (remaining > 0) {
      incoming.reduceTo(remaining);
      book.add(incoming); // rest the unmatched remainder
      out.accept(new BookDelta(incoming.isBuy(), incoming.priceTicks(), +remaining));
    }
  }

  private static boolean crosses(Order taker, long restingPrice) {
    return taker.isBuy() ? taker.priceTicks() >= restingPrice
                         : taker.priceTicks() <= restingPrice;
  }
}
MarketDataPublisher — broadcast book deltas
public final class MarketDataPublisher implements EventSink {
  private final MulticastChannel l1; // top-of-book
  private final MulticastChannel l2; // full depth
  private final ByteBuffer scratch = ByteBuffer.allocateDirect(256);

  private long lastBestBid = Long.MIN_VALUE;
  private long lastBestAsk = Long.MAX_VALUE;

  @Override
  public void accept(ExchangeEvent evt) {
    switch (evt) {
      case Trade t -> {
        emit(l1, encodeTrade(t));
        emit(l2, encodeTrade(t));
      }
      case BookDelta d -> {
        emit(l2, encodeDelta(d));
        maybeEmitTopOfBook(d);
      }
      case OrderAccepted ignored -> {} // not market data
      default -> {}
    }
  }

  private void maybeEmitTopOfBook(BookDelta d) {
    long bb = d.bestBidAfter();
    long ba = d.bestAskAfter();
    if (bb != lastBestBid || ba != lastBestAsk) {
      lastBestBid = bb;
      lastBestAsk = ba;
      emit(l1, encodeTopOfBook(bb, ba, d.timestampNs()));
    }
  }

  private void emit(MulticastChannel ch, ByteBuffer frame) {
    try { ch.send(frame, ch.group()); } catch (IOException e) { throw new UncheckedIOException(e); }
  }
}

Key design decisions & trade-offs

Interview follow-ups

Related