Back-of-envelope Estimator

By Rahul Kumar · Senior Software Engineer · Updated May 2026 · Category: System Design Primer · Unique Topics

DAU → QPS / storage / bandwidth with the math shown inline.

This interactive explanation is built for system design interview prep: step through Back-of-envelope Estimator, watch the internal state change, and connect the concept to real distributed-system trade-offs.

Overview

Back-of-envelope (BOE) estimation is the skill that separates a plausible system design from a hand-waved interview answer. Before you draw a single box, you need to know roughly how many requests per second will hit the system, how much storage it will need a year from now, and how much network bandwidth it will push. BOE lets you catch the design-killing numbers early: a photo upload service ingesting 1 TB per day is a very different system from one ingesting 100 GB per day, and you only find out which one you are designing by multiplying a few numbers. The math is deliberately rough. You convert daily active users (DAU) into QPS by multiplying by actions per user per day and dividing by the seconds in a day (exactly 86,400, usually rounded to 100k for speed), you multiply record size by records per day for daily ingest, and you multiply daily ingest by retention days for total storage. The goal is an order-of-magnitude answer, not a forecast.

How it works

Every BOE calculation follows the same three-step pattern.

First, translate user activity into a rate. If you have 100 million DAU and each user performs 10 writes per day, that is 1 billion writes per day. Dividing by roughly 100k seconds per day gives an average of 10,000 writes per second. Peak traffic typically runs 2-3x the average, so assume 30k QPS at peak.

Second, translate that rate into a storage footprint. Multiply request size by requests per day, then multiply by retention days. A 1 KB tweet at 500 million tweets per day is 500 GB per day; kept for 5 years (about 1,825 days), that is about 900 TB over the retention window.

Third, translate rate and size into bandwidth. 30k QPS of 1 KB responses is 30 MB/s (240 Mbps), which a single machine's NIC can carry; 30k QPS of 1 MB responses is 30 GB/s, which takes a fleet.

The useful constants to memorize are: seconds per day is about 100k, a year is about 30 million seconds, L1 cache is about 1 ns, main memory about 100 ns, SSD read about 100 µs, cross-datacenter round-trip about 100 ms. Keep these in your head and you can sanity-check any design in under a minute.
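The three steps above can be checked with a few lines of Java. This is a minimal sketch rather than a definitive implementation: the class and method names (ThreeStepEstimate, averageQps, retainedBytes, bytesPerSecond) are made up for illustration, it uses the rounded 100k seconds-per-day constant, and it assumes decimal units (1 KB = 1,000 bytes) so the results line up with the figures in the prose.

```java
/** Worked version of the three-step pattern, using the rounded constant from the text. */
final class ThreeStepEstimate {
    static final long ROUNDED_SECONDS_PER_DAY = 100_000L; // mental-math stand-in for 86,400
    static final long KB = 1_000L;                        // assumption: decimal units

    /** Step 1: users x actions per user per day -> average requests per second. */
    static long averageQps(long dau, long actionsPerUserPerDay) {
        return dau * actionsPerUserPerDay / ROUNDED_SECONDS_PER_DAY;
    }

    /** Step 2: item size x items per day x retention days -> total bytes. */
    static long retainedBytes(long bytesPerItem, long itemsPerDay, long retentionDays) {
        return bytesPerItem * itemsPerDay * retentionDays;
    }

    /** Step 3: requests per second x response size -> bytes per second. */
    static long bytesPerSecond(long qps, long bytesPerResponse) {
        return qps * bytesPerResponse;
    }

    public static void main(String[] args) {
        long avg = averageQps(100_000_000L, 10); // 100M DAU, 10 writes each
        long peak = avg * 3;                     // upper end of the 2-3x peak rule
        long fiveYears = retainedBytes(1 * KB, 500_000_000L, 5 * 365);
        long peakBandwidth = bytesPerSecond(peak, 1 * KB);

        System.out.println("avg qps = " + avg + ", peak qps = " + peak);
        System.out.println("5y storage = " + fiveYears / 1_000_000_000_000L + " TB");
        System.out.println("peak bandwidth = " + peakBandwidth / 1_000_000L + " MB/s");
    }
}
```

With these inputs the sketch reports 10,000 average QPS, 30,000 peak QPS, 912 TB of retained data, and 30 MB/s at peak, which matches the rounded numbers quoted above.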

Implementation

BOE estimator with QPS, storage, bandwidth helpers

public final class BackOfEnvelope {
    private static final long SECONDS_PER_DAY = 86_400L;
    private static final double PEAK_MULTIPLIER = 3.0; // peak vs average

    private BackOfEnvelope() {}

    /** Average QPS from DAU and actions-per-user-per-day. */
    public static double qps(long dau, long actionsPerDay) {
        if (dau <= 0 || actionsPerDay <= 0) return 0.0;
        return ((double) dau * actionsPerDay) / SECONDS_PER_DAY;
    }

    /** Peak QPS assuming a 3x peak-to-average ratio. */
    public static double peakQps(long dau, long actionsPerDay) {
        return qps(dau, actionsPerDay) * PEAK_MULTIPLIER;
    }

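    /** Total storage in bytes for the retention window; throws ArithmeticException on long overflow. */
    public static long storage(long bytesPerItem, long itemsPerDay, long retentionDays) {
        // multiplyExact fails loudly instead of silently wrapping on very large inputs
        return Math.multiplyExact(Math.multiplyExact(bytesPerItem, itemsPerDay), retentionDays);
    }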

    /** Bytes per second needed at average QPS. */
    public static double bandwidth(double qps, long bytesPerReq) {
        return qps * bytesPerReq;
    }

    /** Human-friendly string: auto-scale through B/KB/MB/GB/TB/PB in binary (1024) steps. */
    public static String humanBytes(double bytes) {
        String[] units = {"B", "KB", "MB", "GB", "TB", "PB"};
        int i = 0;
        while (bytes >= 1024 && i < units.length - 1) { bytes /= 1024; i++; }
        return String.format("%.1f %s", bytes, units[i]);
    }

    public static void main(String[] args) {
        long dau = 100_000_000L, writesPerDay = 10;
        double q = qps(dau, writesPerDay);
        long total = storage(1024, dau * writesPerDay, 5 * 365);
        System.out.println("avg qps=" + q + " peak=" + peakQps(dau, writesPerDay));
        System.out.println("5y storage=" + humanBytes(total));
    }
}

Complexity

seconds/day: ~100,000 (exactly 86,400)
year in seconds: ~30M (about 31.5M exactly)
L1 cache reference: ~1 ns
main memory read: ~100 ns
SSD read: ~100 µs
cross-DC RTT: ~100 ms
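One way to use the latency numbers above is to ask how many cheaper operations fit inside one expensive one. The sketch below is only illustrative: LatencyBudget and opsWithin are hypothetical names, and the constants are the rounded values from the list, not measurements.

```java
/** Compares the rounded latency constants by counting how many operations fit in a budget. */
final class LatencyBudget {
    // Rounded latencies from the list above, in nanoseconds.
    static final long MAIN_MEMORY_NS = 100L;
    static final long SSD_READ_NS = 100_000L;         // 100 microseconds
    static final long CROSS_DC_RTT_NS = 100_000_000L; // 100 milliseconds

    /** Number of sequential operations of the given cost that fit in a time budget. */
    static long opsWithin(long budgetNs, long costPerOpNs) {
        return budgetNs / costPerOpNs;
    }

    public static void main(String[] args) {
        System.out.println("SSD reads per cross-DC round trip: "
                + opsWithin(CROSS_DC_RTT_NS, SSD_READ_NS));
        System.out.println("memory reads per SSD read: "
                + opsWithin(SSD_READ_NS, MAIN_MEMORY_NS));
    }
}
```

Both ratios come out to 1,000, which is the practical takeaway: each tier in this list is roughly three orders of magnitude slower than the one before it.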

Key design decisions & trade-offs

Rounding seconds/day — Chosen: Use 100k instead of 86,400. The rounded divisor understates rates by about 14% in exchange for mental math speed; that is acceptable because BOE is an order-of-magnitude tool, not a forecast.
Peak-to-average multiplier — Chosen: Assume 2-3x peak over average. Most consumer traffic follows a diurnal pattern that peaks at roughly this ratio; safer to over-provision than under-estimate.
Bandwidth math — Chosen: Compute both ingress and egress separately. Read-heavy systems have 10-100x more egress than ingress; a single number hides the dominant cost.
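Two of the decisions above are easy to quantify: the cost of rounding seconds per day, and the gap between ingress and egress. The following sketch does both under stated assumptions. TradeOffCheck and its methods are hypothetical names, the 50:1 read-to-write ratio is an invented example inside the 10-100x range mentioned above, and payloads are assumed to be 1 KB (1,000 bytes) in both directions.

```java
/** Quantifies the rounding trade-off and the ingress/egress split. */
final class TradeOffCheck {
    /** Average requests per second for a daily total and a chosen seconds-per-day constant. */
    static double qps(double requestsPerDay, double secondsPerDay) {
        return requestsPerDay / secondsPerDay;
    }

    /** Fraction by which the rounded constant undershoots the exact rate. */
    static double roundingShortfall(double requestsPerDay) {
        double exact = qps(requestsPerDay, 86_400);
        double rounded = qps(requestsPerDay, 100_000);
        return 1.0 - rounded / exact;
    }

    /** Bytes per second for a request rate and payload size. */
    static double bytesPerSecond(double qps, long bytesPerRequest) {
        return qps * bytesPerRequest;
    }

    public static void main(String[] args) {
        double writeQps = qps(1e9, 100_000); // 1 billion writes/day with the rounded constant
        double readsPerWrite = 50;           // assumption: hypothetical read:write ratio
        long payloadBytes = 1_000;           // assumption: 1 KB, decimal units

        double ingress = bytesPerSecond(writeQps, payloadBytes);
        double egress = bytesPerSecond(writeQps * readsPerWrite, payloadBytes);

        System.out.println("rounding shortfall = "
                + Math.round(roundingShortfall(1e9) * 100) + "%");
        System.out.println("ingress = " + ingress / 1e6 + " MB/s");
        System.out.println("egress  = " + egress / 1e6 + " MB/s");
    }
}
```

Under these assumptions ingress is 10 MB/s and egress is 500 MB/s, so quoting a single "bandwidth" figure would hide the number that actually drives cost.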

Common pitfalls

Forgetting to multiply storage by the replication factor (commonly 3x in quorum-replicated systems)
Confusing bits per second with bytes per second on a NIC
Using DAU for connection counts when the correct metric is concurrent users
Ignoring read amplification: one logical read can be 3-10 physical reads in an LSM tree
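Each pitfall above corresponds to one extra multiplication or division, sketched below. Treat this as an illustration under assumptions, not a reference implementation: PitfallChecks and its method names are hypothetical, the 10-minute session length and 5x read amplification are made-up inputs, and the concurrent-user estimate assumes sessions are spread evenly across the day (real traffic peaks higher).

```java
/** One helper per pitfall: replication, bits vs bytes, concurrency, read amplification. */
final class PitfallChecks {
    /** Raw bytes on disk once every logical byte is stored replicationFactor times. */
    static long replicatedBytes(long logicalBytes, int replicationFactor) {
        return Math.multiplyExact(logicalBytes, (long) replicationFactor);
    }

    /** NIC speeds are quoted in bits per second; divide by 8 to get bytes per second. */
    static double nicBytesPerSecond(double nicBitsPerSecond) {
        return nicBitsPerSecond / 8.0;
    }

    /** Average users online at once, assuming sessions are spread evenly over the day. */
    static double averageConcurrentUsers(long dau, double sessionSecondsPerUserPerDay) {
        return dau * sessionSecondsPerUserPerDay / 86_400.0;
    }

    /** Physical reads per second when each logical read fans out into several. */
    static double physicalReadsPerSecond(double logicalReadsPerSecond, double amplification) {
        return logicalReadsPerSecond * amplification;
    }

    public static void main(String[] args) {
        long logical = 912_500_000_000_000L; // roughly 900 TB of logical data
        System.out.println("replicated bytes (3x) = " + replicatedBytes(logical, 3));
        System.out.println("1 Gbps NIC in MB/s = " + nicBytesPerSecond(1e9) / 1e6);
        // assumption: 100M DAU, 600 seconds online per user per day
        System.out.println("avg concurrent users = "
                + Math.round(averageConcurrentUsers(100_000_000L, 600)));
        // assumption: 5x amplification, inside the 3-10x range quoted above
        System.out.println("physical reads/s = " + physicalReadsPerSecond(30_000, 5));
    }
}
```

The concurrency figure is the one that most often surprises people: 100 million DAU at ten minutes each is under a million simultaneous users on average, which is the number that sizes connection pools and load balancers.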

Interview follow-ups

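Model hot-key skew: per-shard averages hide it. For example, if a small hot set (say the top 1% of keys) draws 10% of a 10k QPS workload and lands on one shard, that shard sees about 1k QPS no matter how many shards you add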
Estimate cache hit rate impact on downstream QPS
Size working set vs total dataset and pick memory vs SSD tier accordingly
Account for cross-region replication bandwidth separately from client-facing bandwidth
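Several of these follow-ups reduce to one-line formulas, sketched below. The names (FollowUpSketches, downstreamQps, hotShardQps, crossRegionBytesPerSecond) are hypothetical, and the inputs (90% and 95% hit rates, 100 shards, a hot set drawing 10% of traffic, 3 regions) are invented for illustration. The replication helper assumes every write is shipped once to each other region with no compression or batching.

```java
/** Rough formulas for cache impact, hot-shard load, and cross-region replication traffic. */
final class FollowUpSketches {
    /** Requests per second that miss the cache and reach the backing store. */
    static double downstreamQps(double qps, double cacheHitRate) {
        return qps * (1.0 - cacheHitRate);
    }

    /** Load on the shard that owns the hot keys: its even share of cold traffic plus all hot traffic. */
    static double hotShardQps(double totalQps, int shards, double hotFraction) {
        double coldShare = totalQps * (1.0 - hotFraction) / shards;
        return coldShare + totalQps * hotFraction;
    }

    /** Replication bytes per second when each write is copied to every other region. */
    static double crossRegionBytesPerSecond(double writeBytesPerSecond, int regions) {
        return writeBytesPerSecond * (regions - 1);
    }

    public static void main(String[] args) {
        // assumption: 30k peak QPS in front of a cache
        System.out.println("downstream at 90% hits = " + downstreamQps(30_000, 0.90));
        System.out.println("downstream at 95% hits = " + downstreamQps(30_000, 0.95));
        // assumption: 10k QPS over 100 shards, hot set draws 10% of traffic
        System.out.println("hot shard qps = " + hotShardQps(10_000, 100, 0.10));
        // assumption: 10 MB/s of writes replicated across 3 regions
        System.out.println("replication MB/s = " + crossRegionBytesPerSecond(1e7, 3) / 1e6);
    }
}
```

Two things stand out in this sketch. Raising the hit rate from 90% to 95% halves the downstream load, and the hot shard carries roughly 1,090 QPS against a per-shard average of 100.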