← System Design Simulator

Partitioning Strategies

By Rahul Kumar · Senior Software Engineer · Updated · Category: Kleppmann · Designing Data-Intensive Applications

Hash vs range, hotspots, rebalancing. Ch 6.

This interactive explanation is built for system design interview prep: step through Partitioning Strategies, watch the internal state change, and connect the concept to real distributed-system trade-offs.

Overview

Partitioning is how a distributed database turns one big dataset into many small ones that fit on individual machines, and the choice of partitioning strategy determines whether the system scales gracefully or spends its life fighting hotspots. The two fundamental approaches are hash partitioning, which destroys key locality in exchange for uniform load, and range partitioning, which preserves ordering at the cost of keys clustering onto a few partitions. A third option, consistent hashing, sits between them: it gives hash-style uniformity but minimizes reshuffling when nodes are added or removed. Real systems rarely pick one cleanly — Cassandra ranges hashed tokens; DynamoDB hashes the partition key but range-orders the sort key within it; Postgres lets you declare either. The question the designer must answer before writing any code is: what is the dominant read pattern? Point lookups want hash; range scans want range; time-series want compound partitioning. Pick wrong and you are rebalancing forever.

Partitioning Strategies — Interactive Simulator

Runs fully client-side in your browser; no sign-up. Or open full screen →

Launch the interactive Partitioning Strategies widget — step through the algorithm or protocol and observe the internal state updating in real time.

How it works

Hash partitioning applies a function like MurmurHash or SHA-1 to the partition key, then uses modulo or ring position to assign the key to a partition. The hash spreads keys uniformly regardless of distribution in the input, eliminating the celebrity-user hotspot problem, but it destroys order: a scan over adjacent keys now touches every partition. Range partitioning keeps keys sorted and assigns contiguous ranges to partitions, so scans are cheap and ordered, but any skew in the key distribution (timestamps, alphabetical usernames, auto-increment IDs) produces catastrophic hotspotting on the latest partition. Consistent hashing builds a ring of hash values, places each node at several "virtual" positions on the ring, and assigns each key to the first node clockwise from its hash. Adding a node only moves the keys that fall between the new node's positions and its predecessors', keeping rebalance work proportional to 1/N rather than N/N. Compound strategies layer both: hash the user ID to pick a partition, then range-sort within by timestamp to make "user's recent messages" a single-partition ordered scan. Rebalancing is where partitioning gets hardest — naive mod-N hashing means adding one node moves every key; fixed-partition schemes (10x more partitions than nodes) let you move whole partitions instead of individual keys; dynamic partitioning splits hot ranges on the fly.

Implementation

HashPartitioner: uniform distribution, no locality
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hash a key and mod into a fixed number of partitions. */
public final class HashPartitioner<K> {
    private final int partitionCount;

    public HashPartitioner(int partitionCount) {
        if (partitionCount <= 0) throw new IllegalArgumentException("partitionCount > 0");
        this.partitionCount = partitionCount;
    }

    public int assignPartition(K key) {
        int h = stableHash(key.toString());
        return Math.floorMod(h, partitionCount);
    }

    /** Use a stable cross-JVM hash; Object.hashCode is not stable across versions. */
    private static int stableHash(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] d = md.digest(s.getBytes(StandardCharsets.UTF_8));
            return ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                 | ((d[2] & 0xff) << 8)  |  (d[3] & 0xff);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public int partitionCount() { return partitionCount; }
}
RangePartitioner: ordered, scan-friendly, hotspot-prone
import java.util.*;

/** Assign keys to partitions based on sorted boundary splits. */
public final class RangePartitioner<K extends Comparable<K>> {
    private final List<K> splitPoints;   // sorted; partition i covers [splits[i-1], splits[i])

    public RangePartitioner(List<K> splitPoints) {
        List<K> copy = new ArrayList<>(splitPoints);
        Collections.sort(copy);
        for (int i = 1; i < copy.size(); i++) {
            if (copy.get(i - 1).compareTo(copy.get(i)) == 0)
                throw new IllegalArgumentException("duplicate split point");
        }
        this.splitPoints = List.copyOf(copy);
    }

    /** Partition id in [0, splitPoints.size()]. */
    public int assignPartition(K key) {
        int idx = Collections.binarySearch(splitPoints, key);
        // binarySearch returns -(insertion_point) - 1 when key is not found.
        return idx >= 0 ? idx + 1 : -idx - 1;
    }

    /** Scan across a key range — only the covering partitions need be queried. */
    public List<Integer> partitionsForRange(K fromInclusive, K toExclusive) {
        int lo = assignPartition(fromInclusive);
        int hi = assignPartition(toExclusive);
        List<Integer> out = new ArrayList<>();
        for (int i = lo; i <= hi; i++) out.add(i);
        return out;
    }

    public int partitionCount() { return splitPoints.size() + 1; }
}

Complexity

Key design decisions & trade-offs

Common pitfalls

Interview follow-ups

Recommended reading

Related