How do you design Distributed Unique ID Generator (Snowflake) for a system design interview?

Start from requirements, estimate scale, draw the high-level architecture, explain the main data flow, identify bottlenecks, and compare trade-offs. This walkthrough covers Distributed Unique ID Generator (Snowflake) with components, operations, capacity estimates, and an interactive simulator.

What should I discuss in a Distributed Unique ID Generator (Snowflake) system design answer?

Cover functional requirements, non-functional requirements, APIs, data model, key services, storage choices, caching, queues or streams where relevant, failure handling, observability, and scaling trade-offs.

Distributed Unique ID Generator (Snowflake) System Design Interview Question

Q: Can I practice Distributed Unique ID Generator (Snowflake) interactively?

Yes. The page includes an interactive browser-based simulator for Distributed Unique ID Generator (Snowflake), including an architecture diagram, step-by-step request flow, component swap, and stress simulation.

By Rahul Kumar · Senior Software Engineer · Updated May 2026 · 6 components · 3 operations ·Source: Alex Xu, System Design Interview Vol 1, Chapter 7 (pages 110–118); Twitter Snowflake blog (2010)

Problem: Design a system that generates globally unique, roughly time-sortable 64-bit IDs at very high throughput across many machines, without a central bottleneck on the write path.

Overview

Every persistent write to a distributed database eventually asks the same question: what primary key should this row have? For a single-node app you just use an auto-increment column, but the moment you shard the database or run writes from multiple application replicas, central auto-increment becomes either a bottleneck or a correctness hazard. Twitter's answer in 2010 was Snowflake: a 64-bit ID that packs a millisecond timestamp, a machine identifier, and a per-millisecond sequence counter into a single long, generated entirely in-process on each application host with no network hop per ID. Discord, Instagram, and most modern microservice platforms use a close variant. The design gives you roughly time-sortable, densely packed, index-friendly primary keys at tens of millions of IDs per second cluster-wide, with p99 latency measured in microseconds. The tradeoffs - wall-clock sensitivity, finite machine-ID space, opaque to clients - are well understood, which is why Snowflake is the go-to answer in system-design interviews and real production alike.

Distributed Unique ID Generator (Snowflake) — Interactive Simulator

Runs fully client-side in your browser; no sign-up. Or open full screen →

Launch the interactive walkthrough for Distributed Unique ID Generator (Snowflake) — animated architecture diagram, step-by-step flow with real payloads, component swap, and a discrete-event stress simulator.

Summary

A Twitter Snowflake-style ID generator: each application host embeds an in-process generator that packs {timestamp | datacenter_id | machine_id | sequence} into a 64-bit long. Machine IDs are allocated once at startup from ZooKeeper/etcd and then cached. The dominant design choice is in-process generation (no network hop per ID — sub-microsecond latency) with ZooKeeper only on the cold path; the main tradeoff is that wall-clock skew between hosts can cause duplicate IDs, which the generator must detect and refuse. Sized for >1M IDs/sec per host and effectively unlimited horizontal scale.

Requirements

Functional

Generate globally unique 64-bit numeric IDs across every host in every datacenter
IDs are roughly time-sortable (newer IDs compare greater than older ones)
Support at least 10K IDs/sec per host, with headroom to 1M/sec
Allocate machine IDs automatically at startup via a coordination service
Expose IDs through both an in-process SDK and a REST endpoint for non-Java callers
Detect and refuse to generate on clock-moved-backwards conditions rather than emit duplicates

Non-functional

Sub-millisecond p99 latency on the generation call
No single network hop on the hot path
Horizontally scalable with no central write bottleneck
Tolerate coordination-service outage after startup (machine ID is cached)
Operate for decades on a single 41-bit timestamp without overflow

Capacity Assumptions

Book Ch 7 requirements: IDs unique, numeric, fit in 64 bits, ordered by date, ≥ 10,000 IDs/sec
64-bit ID layout (book Figure 7-5): 1 sign bit | 41-bit ms timestamp | 5-bit datacenter_id | 5-bit machine_id | 12-bit sequence
41-bit ms timestamp → 2^41 - 1 ms ≈ 69 years since custom epoch (Twitter default: 1288834974657 = 2010-11-04)
2^5 = 32 datacenters × 2^5 = 32 machines per dc → 1024 machines total
2^12 = 4096 sequence values per ms per machine → 4.096M IDs/sec/machine ceiling
Target: any single app host generates up to ~1M IDs/sec sustained with p99 < 1ms

Back-of-Envelope Estimates

Aggregate throughput ceiling: 1024 * 4.096M ≈ 4.2B IDs/sec cluster-wide
Per-host steady-state: 100K-1M IDs/sec, almost entirely CPU-bound
ZooKeeper load: 1 read + 1 ephemeral znode write per host startup → ~10 ops/sec even with aggressive rolling deploys
Monitoring: 1 gauge per host (last_ts, sequence_exhaustion_count, clock_skew_ms) scraped every 15s

High-level architecture

The generator runs as a library inside each application server, so the 'request path' for an ID is just a method call. On process startup the host connects to ZooKeeper (or etcd), walks /snowflake/{datacenter} looking for a free machine-ID slot, and claims it via an ephemeral znode. That znode is the host's lease - if the process dies, ZooKeeper deletes it and the slot returns to the pool. The machine ID and datacenter ID are cached in memory and the coordinator is not touched again while the process runs. For each ID request, the generator reads the current wall-clock time in milliseconds, compares it to the last-seen timestamp, and takes one of three branches: if the clock moved forward, reset the sequence counter to zero and emit; if the clock is equal to the last timestamp, atomically increment the 12-bit sequence and spin-wait on overflow until the next millisecond; if the clock moved backwards by more than a small tolerance, throw ClockMovedBackwardsException and let the caller retry on a healthy host. The final long is assembled by left-shifting each field into place: the epoch-relative timestamp into the upper 41 bits, the datacenter into the next 5, the machine into the next 5, and the sequence into the low 12. A thin REST endpoint wraps next_id() for polyglot callers, but most services embed the generator directly to save the network hop and keep the latency budget.

Architecture Components (6)

Client (Service or User Request) (client) — Any upstream caller that needs a unique ID — a mobile app, another microservice, or a batch job.
Application Server (api) — Stateless service that serves business traffic and embeds the Snowflake generator as a library.
Snowflake Generator (in-process) (id-generator) — Pure-CPU generator that packs timestamp, machine ID, and a per-ms sequence into a 64-bit long.
Coordinator (ZooKeeper / etcd) (coordinator) — Allocates unique machine IDs at host startup via ephemeral znodes.
ID Usage DB (downstream consumer) (sql) — Representative downstream store that indexes by the generated ID.
Monitoring (Prometheus + Alerts) (worker) — Scrapes per-host generator metrics and alerts on clock skew or sequence exhaustion.

Operations Walked Through (3)

generate-id — Normal request-time ID generation — zero network hops beyond the app server, because the Snowflake generator runs in-process.
machine-startup — When an app host boots, it must atomically claim an unused machine_id from the coordinator before it can generate any IDs.
clock-skew — The wall clock moves backwards (NTP correction, VM pause, leap second). The generator MUST refuse to generate IDs until the clock catches up.

Implementation

SnowflakeIdGenerator.java

package com.hld.idgen;

/**
 * Twitter Snowflake layout:
 *   sign(1) | timestamp_ms(41) | datacenter(5) | machine(5) | sequence(12)
 */
public final class SnowflakeIdGenerator {
    private static final long EPOCH_MS = 1288834974657L; // 2010-11-04 Twitter custom epoch

    private static final int SEQUENCE_BITS = 12;
    private static final int MACHINE_BITS = 5;
    private static final int DATACENTER_BITS = 5;

    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    private static final long MAX_MACHINE = (1L << MACHINE_BITS) - 1;
    private static final long MAX_DATACENTER = (1L << DATACENTER_BITS) - 1;

    private static final int MACHINE_SHIFT = SEQUENCE_BITS;
    private static final int DATACENTER_SHIFT = SEQUENCE_BITS + MACHINE_BITS;
    private static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS + DATACENTER_BITS;

    private final long datacenterId;
    private final long machineId;

    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long datacenterId, long machineId) {
        if (datacenterId < 0 || datacenterId > MAX_DATACENTER) {
            throw new IllegalArgumentException("datacenterId out of range [0, " + MAX_DATACENTER + "]");
        }
        if (machineId < 0 || machineId > MAX_MACHINE) {
            throw new IllegalArgumentException("machineId out of range [0, " + MAX_MACHINE + "]");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    public synchronized long nextId() {
        long ts = System.currentTimeMillis();
        if (ts < lastTimestamp) {
            long drift = lastTimestamp - ts;
            throw new ClockMovedBackwardsException(drift);
        }
        if (ts == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                ts = spinUntilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = ts;

        return ((ts - EPOCH_MS) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (machineId << MACHINE_SHIFT)
                | sequence;
    }

    private long spinUntilNextMillis(long reference) {
        long ts = System.currentTimeMillis();
        while (ts <= reference) {
            ts = System.currentTimeMillis();
        }
        return ts;
    }

    public static final class ClockMovedBackwardsException extends RuntimeException {
        public ClockMovedBackwardsException(long driftMs) {
            super("Clock moved backwards by " + driftMs + "ms; refusing to generate");
        }
    }
}

MachineIdAllocator.java

package com.hld.idgen;

import java.nio.charset.StandardCharsets;
import java.util.List;
import org.apache.curator.framework.CuratorFramework;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;

/** Claims a free 5-bit machine id under /snowflake/{dc}/ using an ephemeral znode. */
public final class MachineIdAllocator {
    private static final int MAX_MACHINE = 31; // 2^5 - 1

    private final CuratorFramework zk;
    private final int datacenterId;
    private final String hostName;

    public MachineIdAllocator(CuratorFramework zk, int datacenterId, String hostName) {
        this.zk = zk;
        this.datacenterId = datacenterId;
        this.hostName = hostName;
    }

    public int claim() throws Exception {
        String dcPath = "/snowflake/" + datacenterId;
        ensurePath(dcPath);
        List<String> taken = zk.getChildren().forPath(dcPath);
        boolean[] used = new boolean[MAX_MACHINE + 1];
        for (String c : taken) {
            try { used[Integer.parseInt(c)] = true; } catch (NumberFormatException ignored) { }
        }
        for (int candidate = 0; candidate <= MAX_MACHINE; candidate++) {
            if (used[candidate]) continue;
            String path = dcPath + "/" + candidate;
            try {
                zk.create()
                        .withMode(CreateMode.EPHEMERAL)
                        .forPath(path, hostName.getBytes(StandardCharsets.UTF_8));
                return candidate;
            } catch (KeeperException.NodeExistsException race) {
                // another host grabbed it between getChildren and create; keep scanning
            }
        }
        throw new IllegalStateException("No free machine id in dc " + datacenterId);
    }

    private void ensurePath(String path) throws Exception {
        if (zk.checkExists().forPath(path) == null) {
            zk.create().creatingParentsIfNeeded().forPath(path);
        }
    }
}

IdGeneratorController.java

package com.hld.idgen;

import java.util.List;
import java.util.stream.LongStream;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/v1/ids")
public class IdGeneratorController {

    private final SnowflakeIdGenerator generator;

    @Autowired
    public IdGeneratorController(SnowflakeIdGenerator generator) {
        this.generator = generator;
    }

    @PostMapping
    public ResponseEntity<IdResponse> nextId() {
        try {
            long id = generator.nextId();
            // Serialize as string to avoid 53-bit precision loss in JS clients
            return ResponseEntity.ok(new IdResponse(Long.toString(id), id));
        } catch (SnowflakeIdGenerator.ClockMovedBackwardsException e) {
            return ResponseEntity.status(503).header("Retry-After", "1").build();
        }
    }

    @GetMapping("/batch")
    public ResponseEntity<List<String>> batch(@RequestParam(defaultValue = "100") int count) {
        if (count <= 0 || count > 10_000) {
            return ResponseEntity.badRequest().build();
        }
        List<String> ids = LongStream.range(0, count)
                .mapToObj(i -> Long.toString(generator.nextId()))
                .toList();
        return ResponseEntity.ok(ids);
    }

    public record IdResponse(String id, long idNumeric) { }
}

SnowflakeBean.java

package com.hld.idgen;

import java.net.InetAddress;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SnowflakeBean {

    @Bean(destroyMethod = "close")
    public CuratorFramework curator(@Value("${zk.connect}") String connect) {
        CuratorFramework cf = CuratorFrameworkFactory.newClient(connect,
                new ExponentialBackoffRetry(500, 5));
        cf.start();
        return cf;
    }

    @Bean
    public SnowflakeIdGenerator snowflakeGenerator(CuratorFramework zk,
                                                   @Value("${datacenter.id}") int datacenterId) throws Exception {
        String host = InetAddress.getLocalHost().getHostName();
        int machineId = new MachineIdAllocator(zk, datacenterId, host).claim();
        return new SnowflakeIdGenerator(datacenterId, machineId);
    }
}

Key design decisions & trade-offs

Placement of the generator — Chosen: In-process library, not a dedicated service. Removes a network hop worth ~0.5ms per ID; at 1M IDs/sec that hop alone would blow the entire latency budget.
Machine ID assignment — Chosen: ZooKeeper ephemeral znodes on startup. Centralized allocation prevents duplicate IDs across hosts; ephemeral means dead hosts auto-release the slot without operator action.
Clock drift handling — Chosen: Refuse to generate (throw) on backwards clock movement. Emitting duplicate IDs silently corrupts downstream DB uniqueness; failing loudly lets a load balancer eject the bad host.
Bit layout (41/5/5/12) — Chosen: Keep Twitter's default despite being tunable. Gives 69 years of timestamps, 1024 machines, and 4096 IDs/ms/machine - enough for most businesses forever; tuning it breaks any downstream code that parses bits.
Wire format — Chosen: Serialize IDs as strings in JSON. JavaScript's Number.MAX_SAFE_INTEGER is 2^53-1, so 64-bit IDs lose precision on JSON.parse; Twitter hit this exact bug and added id_str.

Interview follow-ups

K-sorted guarantee across datacenters for analytics range scans
Reserved prefix bits per tenant for a multi-tenant SaaS sharding scheme
Fallback to a separate non-monotonic generator when clock drift is detected
Metrics on sequence-exhaustion rate to know when to shard hot hosts
Opaque base62 encoding for public API surfaces so bit layout can evolve